Is machine learning a threat to the underwriting practice? Reza Hekmat and Balint Bone consider this possibility and its challenges

In 2016, award-winning computer scientist Geoffrey Hinton stated: “We should stop training radiologists now. It’s just completely obvious that within five years, deep learning is going to do better than radiologists.”
Almost six years later, the UK has a record number of radiologist vacancies. Nevertheless, machine learning and its computer vision subclass have made huge strides since 2016. If Hinton were to repeat his quote today, it would not raise nearly as many eyebrows. Many in the insurance industry have begun to ask whether machines could now replace underwriters.
The current state of play
The protection industry has embraced digitalisation during the past few decades. No longer do underwriters underwrite every single policy – in fact, they don’t ‘underwrite’ any policies at all in the literal sense. The application process is now mostly digital, made up of a mix of rule-based and manual underwriting, with more than 70% of applications receiving an instant underwriting decision.
However, the industry still carries the stigma of being outdated and manual due to the remaining 30% of applications that require manual underwriting, GP reports or further investigations, which can sometimes take up to several weeks.
Advances in medical science and statistical tools have transformed underwriting over the years, making it far more sophisticated in categorising applications and identifying risky policies. While this has resulted in cheaper premiums for the healthiest customers, it has also lengthened the application form and increased the price for the least healthy customers.
With more technological advances and the emergence of ‘now culture’, the industry is facing added pressure to offer shorter application processes and faster underwriting decisions for all customers. At the same time, online comparison tools have made competitive pricing more important than ever before. These opposing goals are difficult, if not impossible, to reconcile.
The wider adoption of machine learning opens avenues that have not been explored before. The possibility of using big data and artificial intelligence to meet customers’ needs, without exposure to significantly more risk, could be on the horizon.
Recent developments
Machine learning has been around for decades, and its underlying technologies are now embedded in our homes, phones, cars, smartwatches and more. Several recent developments and trends mean that it could now potentially be used for underwriting protection policies.
Neural networks (a class of machine learning inspired by the structure and functions of the human brain) and other deep learning techniques have become far more efficient at processing the large and complex data required to accurately predict mortality and morbidity risks. For their 2019 paper ‘Using machine learning to model claims experience and reporting delays for pricing and reserving’, Louis Rossouw and Ronald Richman applied several machine learning models, including deep learning, to a large life insurance dataset containing claims and exposures to predict mortality rates. This produced promising results for the overall population, but risk profile prediction on an individual basis is a significantly bigger challenge. Similarly, significant improvements have been made to natural language processing, a class of machine learning that is concerned with understanding and processing human language. Natural language processing could prove crucial in automating the analysis of medical notes and GP reports, which have historically been written by hand.
Another development is the surging popularity of activity trackers, precipitated by people’s increasing focus on their individual health and fitness. What started as simple step-tracking devices have evolved into sophisticated devices that can monitor and record key health metrics such as heart rate, blood pressure and blood oxygen. In some cases, they can even generate an electrocardiogram. Combining these data points could be enough to give a fair assessment of the person’s current state of health.
Other data sources relevant to customers’ mortality and morbidity risks are also becoming more accessible. Some are directly linked to the customer’s state of health, such as NHS data, GP reports and health records. Through digitalisation, this data is becoming more accessible to both customers and insurers, presenting the opportunity for health services to predict and prevent illnesses, as well as for insurers to build predictive underwriting models.
Other data sources are less obviously linked to health but could complement those previously listed. These sources include banking data (accessible through open banking standards) and data available through other smart Internet of Things devices.
Increased data availability, and the improved efficacy of machine learning techniques in handling complex data and processing natural language, mean we are close to being able to drastically simplify the underwriting journey – with limited or no impact on the risk profile. Combining a simple questionnaire with external data sources should enable insurers to accurately assess risk and make instant decisions in almost all cases.
While the foundations are being put into place for a completely predictive underwriting process, many of the benefits could be realised early by developing predictive models for the most manual parts of the underwriting process. Running predictive models alongside the standard procedure would enable insurers and reinsurers to develop and train models, and gain confidence, without impacting normal business.
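As a rough illustration of running a model alongside the standard procedure, the sketch below compares a model's decisions with the underwriter's decisions on the same cases and summarises where they differ. The function name `shadow_report`, the decision labels and the sample cases are all hypothetical and invented for illustration; they do not describe any insurer's actual process.

```python
from collections import Counter


def shadow_report(cases):
    """Compare model decisions with underwriter decisions on the same cases.

    `cases` is a list of (underwriter_decision, model_decision) pairs.
    Returns the agreement rate and a count of each kind of disagreement.
    """
    if not cases:
        raise ValueError("no cases to compare")
    agreements = sum(1 for human, model in cases if human == model)
    disagreements = Counter(
        (human, model) for human, model in cases if human != model
    )
    return agreements / len(cases), disagreements


# Hypothetical decisions, invented for illustration only.
cases = [
    ("standard", "standard"),
    ("standard", "standard"),
    ("rated", "standard"),
    ("rated", "rated"),
    ("decline", "rated"),
    ("standard", "standard"),
    ("rated", "standard"),
    ("decline", "decline"),
]

agreement_rate, disagreements = shadow_report(cases)
print(f"Agreement rate: {agreement_rate:.1%}")
for (human, model), count in disagreements.most_common():
    print(f"Underwriter said {human!r}, model said {model!r}: {count} case(s)")
```

In this arrangement the underwriter's decision remains the one that is issued; the model's output is only logged, so the disagreements can be reviewed before any reliance is placed on the model.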
While a fully autonomous underwriting journey would remove the need for regular manual underwriting, it would not eliminate the need for underwriters. Like many occupations, the underwriter’s role will evolve and change – from underwriting individual cases to designing, developing and refining predictive underwriting models.
Benefits and challenges
Implementing predictive underwriting could have many benefits other than creating a shorter and quicker on-boarding journey, all of which may grow the protection market, which has stalled for a few years. These benefits include:
- Efficient processes – These are likely to lead to lower acquisition costs. The savings can be passed on to customers.
- The use of external verified sources such as health records – This would lead to fewer non-disclosures, the benefits of which are twofold: more accurate pricing and more efficient reserving. It would also improve the claims process and experience for the customer and the insurer – key to improving trust in the insurance industry.
- Additional data sources such as activity levels – Considering these could improve the accuracy of risk categorisation, giving protection to customers who are currently perceived as too risky to be insured.
However, there are still major challenges to be addressed before machines can dictate underwriting decisions.
The first of these is the fact that we can’t trust models just yet. This is not just because models are not yet 100% accurate, but also because, as an industry, we have a moral obligation to act fairly towards all potential customers. We can’t ‘outsource’ the underwriting decision to models and machines if the workings are not always obvious or even inspectable, meaning the decisions are not easily explainable. It would be difficult to rate or decline customers if we cannot completely understand the underlying reason for doing so. A model could decline customers due to features that are completely irrelevant to the customer’s health, as shown by Marco Tulio Ribeiro, Sameer Singh and Carlos Guestrin. This group of researchers developed a model that distinguished between wolves and huskies with a high level of accuracy, only to learn that the decision was based on a single factor – whether snow was observed in the background.
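The wolf and husky problem can be illustrated with a toy example. This is a minimal sketch using made-up data, not a reproduction of the researchers' experiment: the feature names `snow` and `large_build` and every row below are hypothetical. It shows how a feature that separates the training photos perfectly can fail completely once the background changes.

```python
def feature_accuracy(rows, feature):
    """Share of rows where a single binary feature alone matches the label."""
    hits = sum(1 for row in rows if row[feature] == row["is_wolf"])
    return hits / len(rows)


# Hypothetical training photos: every wolf happens to be pictured in snow.
train = [
    {"snow": 1, "large_build": 1, "is_wolf": 1},
    {"snow": 1, "large_build": 0, "is_wolf": 1},
    {"snow": 1, "large_build": 1, "is_wolf": 1},
    {"snow": 0, "large_build": 1, "is_wolf": 0},
    {"snow": 0, "large_build": 0, "is_wolf": 0},
    {"snow": 0, "large_build": 1, "is_wolf": 0},
]

# Hypothetical new photos: huskies in snow and wolves on grass.
new = [
    {"snow": 1, "large_build": 0, "is_wolf": 0},
    {"snow": 0, "large_build": 1, "is_wolf": 1},
    {"snow": 1, "large_build": 0, "is_wolf": 0},
    {"snow": 0, "large_build": 1, "is_wolf": 1},
]

train_accuracy = feature_accuracy(train, "snow")
new_accuracy = feature_accuracy(new, "snow")
print(f"'snow' alone on training photos: {train_accuracy:.0%}")
print(f"'snow' alone on new photos: {new_accuracy:.0%}")
```

A model rewarded only for accuracy on the training photos has every reason to rely on the background, which is why inspecting what drives a decision matters as much as measuring how often it is right.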
Machine learning models will also pick up inequalities within the data and exacerbate them. Each data source is bound to have a level of inherent bias, leading to embedded bias within the models and thus potential discrimination and reduced accuracy. Biases can also result in mismatching of risks and incorrect premiums, which can have catastrophic consequences.
The dangers of de-pooling risks
Insurance is about pooling risks, whether low, medium or high. Categorising customers into sub-groups will result in a de-pooling of risks and can have significant impacts on the industry, as seen to some extent in the introduction of preferred lives in the US.
Machine learning could classify applications into several groups that range from low risk to high risk, and pricing will consequently follow. This will inevitably price out higher-risk customers as cross subsidies between low and high risks are removed.
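The effect of removing cross subsidies can be seen in a simple worked example. The sketch below is illustrative only: the three risk groups, their sizes, claim rates and the sum assured are invented figures, and premiums are pure risk premiums with no expenses, profit or lapses.

```python
def risk_premium(group):
    """Expected annual claim cost per life for one risk group."""
    return group["claims_per_1000"] * group["sum_assured"] / 1000


def pooled_premium(groups):
    """Single premium charged to everyone: total expected claims / total lives."""
    total_lives = sum(g["lives"] for g in groups)
    total_claims = sum(g["lives"] * risk_premium(g) for g in groups)
    return total_claims / total_lives


# Hypothetical portfolio, invented for illustration only.
groups = [
    {"name": "low", "lives": 700, "claims_per_1000": 1, "sum_assured": 100_000},
    {"name": "medium", "lives": 250, "claims_per_1000": 3, "sum_assured": 100_000},
    {"name": "high", "lives": 50, "claims_per_1000": 10, "sum_assured": 100_000},
]

pooled = pooled_premium(groups)
de_pooled = {g["name"]: risk_premium(g) for g in groups}

print(f"Pooled premium for everyone: {pooled:.2f}")
for name, premium in de_pooled.items():
    print(f"De-pooled premium, {name} risk: {premium:.2f} ({premium / pooled:.1f}x pooled)")
```

Under these assumed figures the pooled premium is 195 per life, while the de-pooled premiums are 100, 300 and 1,000 for the low, medium and high groups. The low-risk group pays roughly half as much, but the high-risk group pays more than five times the pooled price.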
We are just beginning to realise the countless benefits that machine learning and data could offer the protection industry, and the underwriting process in particular. However, to avoid damaging the industry, these techniques need to be implemented with careful consideration, involving the entire value chain.
Reza Hekmat is a business development actuary at SCOR and a member of the IFoA AI and Automation Working Party
Balint Bone is data analytics manager at SCOR and a member of the IFoA AI and Automation Working Party