Richard Plat discusses how deep learning techniques could allow insurers to use image recognition when assessing claims
Machine learning models are on the rise within the field of insurance. Insurance companies are investigating the possibility of applying these models in the areas of pricing, fraud detection and reserving, among other things – but for these purposes, more traditional actuarial models (such as linear or logistic regression) can also be used.
In the near future, image recognition may have a place in the processes of insurers, for example in estimating car damage on the basis of photos. Traditional actuarial models are not sufficient for this, and specific machine learning techniques – also referred to as deep learning – are needed.
The structure of the data
To understand how machine learning models can recognise images, it is important to know the structure of the data (in other words, the images). An image is made up of pixels, each with a specific colour. This colour is made up of a certain amount of red, green and blue. The amount of each of these colours in a pixel is represented by a number between 0 and 255, as shown in Figure 1. The left side of the figure shows how a picture of a face is made up; the right side zooms in on the left eye.
An image as a whole can be expressed as a series of numbers, and this series of numbers therefore serves as input for the model.
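This representation can be made concrete with a short sketch (the pixel values below are made up purely for illustration):

```python
import numpy as np

# A tiny 2x2 RGB "image": each pixel holds red, green and blue
# intensities between 0 and 255 (hypothetical values for illustration).
image = np.array([
    [[255, 0, 0], [0, 255, 0]],   # a red pixel, a green pixel
    [[0, 0, 255], [30, 60, 90]],  # a blue pixel, a mixed colour
], dtype=np.uint8)

print(image.shape)      # (2, 2, 3): height, width, colour channels
flat = image.flatten()  # the series of numbers that serves as model input
print(flat)
```

A real photo works the same way, only with far more pixels: a 224x224 colour image becomes a series of 224 x 224 x 3 = 150,528 numbers.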
In this example, the purpose of the model is to recognise whether a car has damage or not. If an insurer collects more specific information, such as the amount of damage, the model can be further refined.
The available dataset for calibrating the model contains 1,840 photos of cars, half of which have damage. A total of 460 photos are available for validation of the model, evenly distributed between cars with and without damage. Examples of photos in the dataset are given in Figure 2.
Machine learning models applied
A multitude of machine learning models are available, including different regression models, decision tree-based methods and neural networks. For image recognition, the so-called ‘convolutional neural network’ (CNN) is the state of the art. A CNN is a type of neural network with specific characteristics that make it suitable for image recognition. Techniques such as CNNs are also often referred to as deep learning, because deep neural networks are required to appropriately recognise images. An important element in a CNN is the ‘convolutional filter’, the operation of which is shown in Figure 3.
The figure shows that the convolutional filter, which in this example consists of nine weights to be calibrated, translates a 3x3 block of pixels into a single number through element-wise multiplication and summation. By shifting this filter over the entire image, a new image is ultimately created, in which the relationships between neighbouring pixels in the original image are captured by the filter. In this way the model can recognise patterns in the image. Multiple filters can be applied to an image at the same time; these are collected in one layer of the model. The model consists of different layers. In addition to the filters, the model contains layers where either a non-linear operation is applied or the largest values are collected (‘pooling’ layer). During the training, the CNN learns the weights associated with the filters (the blue square in Figure 3) and the other layers. The strength of the model is that it automatically:
- Recognises detailed features such as borders, colour changes and shapes
- Combines these into larger shapes and then into features such as faces, parts of a car, and so on
- Finally, based on these characteristics, it makes an estimate of what is in the image (with associated probabilities).
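The filter operation described above can be sketched in a few lines. The image and filter weights below are illustrative assumptions, not values from a trained model:

```python
import numpy as np

# A 5x5 single-channel "image" with hypothetical pixel values.
image = np.arange(25, dtype=float).reshape(5, 5)

# A 3x3 filter (nine weights); in a trained CNN these would be calibrated.
filt = np.array([[1, 0, -1],
                 [1, 0, -1],
                 [1, 0, -1]], dtype=float)

# Shift the filter over the image: each 3x3 patch is multiplied
# element-wise by the filter and summed into a single number.
out_size = image.shape[0] - filt.shape[0] + 1  # 5 - 3 + 1 = 3
output = np.zeros((out_size, out_size))
for i in range(out_size):
    for j in range(out_size):
        output[i, j] = np.sum(image[i:i+3, j:j+3] * filt)

print(output.shape)  # (3, 3): a new, smaller image of filter responses
```

In a real CNN this sliding operation is performed by optimised library code, and many filters are applied in parallel within each layer.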
A CNN model has many parameters, which implies that many images are also needed to estimate the model reliably. Insurers generally do not have such volumes of images available for this purpose. A solution can be to take an already existing (and calibrated) CNN as a basis and only (re)calibrate the last layer based on the available dataset. This is called ‘transfer learning’. In the example, the VGG16 model is used, calibrated to the ImageNet dataset (which has approximately 15m images) (bit.ly/3cwioMc).
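In Keras, the transfer learning approach can be sketched as follows. The VGG16 loading functions are the real Keras API; the input size, the size of the new final layers and the training settings are illustrative assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Load VGG16 with its ImageNet-calibrated weights, but without the
# original classification head (include_top=False).
base = keras.applications.VGG16(weights="imagenet", include_top=False,
                                input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained filters

# Only the new final layers are (re)calibrated on the insurer's photos.
model = keras.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # probability of damage
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(...) would then be called on the available training photos.
```

Because only the small new head is trained, a dataset of a few thousand photos can be enough, while the frozen VGG16 layers supply the general feature detectors learned from ImageNet.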
Five different CNN variants have been applied to the dataset, with variations in, among other things, the number of layers and the number of filters per layer. In addition, a transfer learning approach based on VGG16 has also been applied. The models are calibrated to the training dataset, and the accuracy of the models is then measured based on the validation dataset. The calculations are performed using the Keras package in Python. Table 1 shows the percentage of photos for which the model correctly determined whether the car had damage or not.
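One such CNN variant could look like the sketch below. The exact layer counts, filter numbers and input size used in the case study are not specified in the article, so the values here are assumptions chosen to illustrate the kind of choices that vary between the five variants:

```python
from tensorflow import keras
from tensorflow.keras import layers

# An illustrative CNN for the damage / no-damage classification task.
model = keras.Sequential([
    layers.Input(shape=(128, 128, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),  # 32 convolutional filters
    layers.MaxPooling2D((2, 2)),                   # pooling: keep largest values
    layers.Conv2D(64, (3, 3), activation="relu"),  # deeper layer, more filters
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),         # probability of damage
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

Varying the number of Conv2D layers and the number of filters per layer, as the case study does, trades model capacity against the risk of overfitting on a small dataset.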
The table shows that the accuracy of the normal CNN variants does not exceed 80%. In addition, there is probably overfitting, because millions of parameters are used. Transfer learning based on the VGG16 model, by contrast, produces significantly better results. Apparently, the parameters calibrated on the general ImageNet dataset are applicable as a basis for specific datasets, such as the one featured here.
The results of this case study imply that image recognition can achieve high accuracy with a relatively small dataset. If insurers collect datasets with more images of claims and the associated costs of the claims, then these types of models could be included in the processes of insurers to be able to automatically estimate the extent of car damage. In the future, the models mentioned could also be used for other purposes where images play a role.
Dr Richard Plat is a partner at Risk at Work, which provides quantitative consulting services to insurers and banks