William Trump and Paul Hately consider what is really achievable by using big data

Predictive analytics is nothing new, and has been playing a role in life insurance for over a decade. Life insurers have tried to emulate what banks and some general insurers were already doing with data. These companies use what they know about customers, such as purchase data, to derive a competitive advantage - for example by pre-approving customers for specific products, or by better targeting certain messages. One example of usage in life insurance is the analysis of banking and life insurance data from bancassurers to find new 'predictors' of health. These are then used to pre-select a group of customers for a targeted offer of life insurance with minimal underwriting.
Today, amid the hype around the 'big data' revolution, it's worth taking a step back, and revisiting the past 10 years to reflect on the lessons already learned about how to use big data to inform underwriting and other business initiatives.
Start with the end in mind - know what you want to predict
This may sound obvious, but it is surprising how many people want to be 'doing big data' without having a clear business purpose in mind. They should start by asking: "What are we seeking to achieve?" The goal can vary widely from one market to another: for one insurer, it may be that underwriting simplification is key, whereas another may find that it is retention that's the real pain point.
The reason it's so crucial to define the aim is that it will hugely influence both what data will be needed, and how the analytics will be run. Some examples include:
reducing the length of the underwriting process for healthy customers - in which case, predictive underwriting methods apply;
differentiating on price based on the risk profile of the individual - typically done based on past claims data;
aiming to achieve higher conversion rates on sales campaigns - where propensity-to-buy modelling or trigger-event marketing can be deployed;
seeking to improve retention, where we use past data to build a propensity-to-lapse model - either to target better customer prospects at the point of sale, or to target retention efforts within the in-force book.
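To make the last of these examples concrete, here is a minimal sketch of a propensity-to-lapse model. It assumes a plain logistic regression fitted by gradient descent; the two predictors (missed payments and years in force), the ten policy records and the function names are all invented for illustration and do not come from any real portfolio or from the authors' own models.

```python
import math


def sigmoid(z):
    """Map a linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression weights (intercept first) by batch gradient descent."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))
            err = p - y  # gradient of the log-loss with respect to the linear score
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def lapse_propensity(weights, x):
    """Estimated probability that a policy with features x lapses."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


# Hypothetical past data: [missed payments last year, years in force]
rows = [
    [0, 8], [0, 6], [1, 7], [0, 9], [1, 5],
    [2, 1], [3, 2], [2, 3], [3, 1], [1, 2],
]
# 1 = policy lapsed, 0 = policy stayed in force (illustrative labels only)
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

weights = fit_logistic(rows, labels)

# Score two illustrative policies from the in-force book
high_risk = lapse_propensity(weights, [3, 1])
low_risk = lapse_propensity(weights, [0, 9])
print(f"recent policy with missed payments: {high_risk:.2f}")
print(f"long-standing policy, no missed payments: {low_risk:.2f}")
```

In practice the predictors would come from whatever past data the insurer actually holds, and the resulting scores would be used to rank customers for retention effort rather than read as exact probabilities.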
Be realistic about limitations
Unfortunately, there are still a lot of wild promises about what is achievable through big data. The real world is far more constrained and some fairly strict criteria apply, particularly when looking for health predictors. The strongest models are those built from scratch, on a bespoke basis. By definition, this means sufficient high-quality past data is needed.
Bancassurers have invested heavily in getting their data to the point where it is an asset. As a consequence, they are now exceptionally well placed to take advantage of it. Typically, their wealth of past sales data - including underwriting decisions and claims - means that, when matched to banking data, unique insights can be learned by applying statistical techniques to the anonymised data-file. But even with banks, there can be vast data differences, whether it's quantity of data or quality - such as the number of variables available per customer.
If you're not a bank, it's still worth doing something
Not having perfect data is an easy excuse for not doing anything at all. This is a real mistake, for, as Eric Siegel puts it in his very helpful book, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, "a little bit of prediction goes a long way".
Five years ago, people would ask whether they should be doing this type of analytics, whereas now they are asking how they can do this. This is a good sign.
Even insurers with relatively weak data can - and should - be using it to make some improvement to their processes. The important thing is to ask what can be done - rather than what can't - with what is accessible.
A key question is how different types of data source can be used to predict health, purchase, or lapse. Table 1 shows our analysis.
Data doesn't get you all the way
There is sometimes a tendency for those working in predictive analytics to believe that, given the right data, perfect predictions can be made. Unfortunately, this is not the case. Models make mistakes and customers don't always behave the way they are expected to. There will always be the customer with a low propensity to lapse who cancels their policy.
Analytics can help improve the process, by removing redundant health questions for healthy customers, for example. However, strong sales methods and messages are still needed. Reliable results can only be expected when good analytics come together with strong sales or retention processes.
The growing body of evidence emerging from the field of behavioural economics is a helpful reminder that we are not as fully rational as we think we are - and as we claim to be. This should encourage taking a 'test and learn' approach to everything we do. Only then will the true drivers of customer behaviour be determined.
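One simple way to put 'test and learn' into practice is to run a new sales or retention message against a control group and check whether the difference in response could plausibly be down to chance. The sketch below assumes a standard two-proportion z-test; the campaign sizes and conversion counts are hypothetical figures chosen for illustration, not results from any real campaign.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (a) and test (b) groups.

    Returns the z statistic and the two-sided p-value under the
    normal approximation, using the pooled conversion rate.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal variable
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical campaign: 2,000 customers per group
# control message: 120 sales; new message: 165 sales
z, p_value = two_proportion_z_test(120, 2000, 165, 2000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the new message really did perform differently; a large one is a prompt to keep testing rather than to roll the change out.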
Lots of people talk about it - but few actually do it
It's easy to lose count of the number of industry events where the presenter has spoken about Google and Amazon and what they are doing with their data, but the life insurance industry is still very slow to react in actually making use of the data it has access to. Experience so far has proven that much can be achieved through matching life insurance data with other descriptive data in order to predict health, purchase or lapse. Some insurers have led the way on this, and hopefully many more will follow.