Machine learning: The deep end

Wednesday 5th August 2020
Authors
Francesca Perla, Ronald Richman, Salvatore Scognamiglio and Mario Wüthrich

Francesca Perla, Ronald Richman, Salvatore Scognamiglio and Mario Wüthrich investigate time series forecasting of mortality using neural networks and deep learning techniques


Recent advances in machine learning have been propelled by deep learning techniques, a modern approach to applying neural networks to large-scale prediction tasks. Many of these advances have been in computer vision and natural language processing – for example, the accuracy of models built to classify the 14 million images in the ImageNet database has steadily increased since 2011, according to the Papers with Code website. Characteristically, the models used within these fields are specialised to deal with the types of data that must be processed to produce predictions. For example, when processing text data, which conveys meaning via the placement of words in a specific order, models that incorporate sequential structure are typically chosen.

Interest in applying deep learning to actuarial topics has grown, and there is now a body of research illustrating these applications across the actuarial disciplines, including mortality forecasting. Deep learning is a promising technique for actuaries due to the strong links between these models and the familiar technique of generalised linear models (GLMs). Wüthrich (2019) discusses how neural networks can be seen as generalised GLMs that first process the data input to the network to create new variables, which are then used in a GLM to make predictions (this is called ‘representation learning’). This is illustrated in Figure 1. By deriving new features from input data, deep learning models can solve difficult problems of model specification, making these techniques promising for analysing complex actuarial problems such as multi-population mortality forecasting.
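To make this link concrete, the sketch below builds a toy network of this kind in Python with Keras: the hidden layers perform the 'representation learning', and the final layer is an ordinary log-link regression on the learned features. All layer sizes and names here are illustrative assumptions, not the specification of any model discussed in this article.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Raw covariates enter the network (the input width of 10 is illustrative).
inputs = layers.Input(shape=(10,))

# 'Representation learning': hidden layers transform the raw inputs
# into new variables (features).
features = layers.Dense(16, activation="tanh")(inputs)
features = layers.Dense(8, activation="tanh")(features)

# The output layer is simply a GLM on the learned features:
# a linear predictor passed through an exponential inverse link.
rate = layers.Dense(1, activation="exponential")(features)

model = tf.keras.Model(inputs=inputs, outputs=rate)
model.compile(optimizer="adam", loss="poisson")  # GLM-style Poisson loss
```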

The Lee-Carter model

Mortality rates, and the rates at which mortality rates are expected to change over time, are basic inputs into a variety of actuarial models. A starting point for setting mortality improvement assumptions is often population data, from which assumptions can be derived using mortality forecasting models. One of the most famous of these is the Lee-Carter (LC) model (Lee and Carter, 1992), which defines the force of mortality as:

$$\log \mu_{x,t} = a_x + b_x k_t$$

This equation states that the (log) force of mortality at age $x$ in year $t$ is the base mortality $a_x$ at that age plus the rate of change of mortality $b_x$ at that age, multiplied by a time index $k_t$ that applies to all ages under consideration. Like most mortality forecasting models, the LC model is fitted in a two-stage process: first the parameters of the model are calibrated, and then, for forecasting, the time index $k_t$ is extrapolated.
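As a concrete illustration of the two-stage process, the sketch below calibrates the LC parameters with the singular value decomposition, as in Lee and Carter's original approach, and then extrapolates the time index as a random walk with drift, the standard forecasting choice. This is a minimal NumPy illustration rather than production code, and the identifiability constraints used are assumptions.

```python
import numpy as np

def fit_lee_carter(log_m):
    """Stage one: calibrate a_x, b_x and k_t from a matrix of log mortality
    rates of shape (ages, years), using a rank-one SVD approximation."""
    a_x = log_m.mean(axis=1)                    # base mortality at each age
    centred = log_m - a_x[:, None]              # remove the age-specific level
    U, s, Vt = np.linalg.svd(centred, full_matrices=False)
    b_x = U[:, 0] / U[:, 0].sum()               # normalise so the b_x sum to 1
    k_t = s[0] * Vt[0] * U[:, 0].sum()          # time index, one value per year
    return a_x, b_x, k_t

def forecast_k(k_t, horizon):
    """Stage two: extrapolate k_t as a random walk with drift."""
    drift = (k_t[-1] - k_t[0]) / (len(k_t) - 1)
    return k_t[-1] + drift * np.arange(1, horizon + 1)

# Forecast: log m_{x,t+h} = a_x + b_x * k_{t+h}
# a_x, b_x, k_t = fit_lee_carter(log_m)
# log_m_fcst = a_x[:, None] + np.outer(b_x, forecast_k(k_t, horizon=10))
```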

[Figure 1: A neural network first processes the input data into new variables ('representation learning'), which then feed a GLM.]

The LC model is usually applied to forecast the mortality of a single population – but forecasts are often needed for multiple populations simultaneously. While the LC model could be applied to each population separately, the period over which the model is fitted needs to be chosen carefully so that the rates of change in mortality over time correctly reflect expectations about the future. A strong element of judgment is therefore needed, which makes the LC model less suitable for multi-population forecasting.

Mortality forecasting using deep learning

Recently, several papers have applied deep neural networks to forecast mortality rates. This article focuses on the model in our recent paper (Perla et al., 2020), which applies specialised neural network architectures to model two mortality databases: the Human Mortality Database (HMD), containing mortality information for 41 countries, and the associated United States Mortality Database (USMD), providing life tables for each state.

Our goal is to investigate whether, in common with the findings in the wider machine learning literature, neural networks specialised to process time series data can produce more accurate mortality forecasts than those produced by general neural network architectures. We also want to develop a model that is adaptable to changes in mortality rates by avoiding the need to follow a two-step calibration process. Thus, our model directly processes time series of mortality data with the goal of outputting new variables that can be used for forecasting. Finally, we wish to preserve the form of the LC model, due to the simplicity with which this model can be interpreted.

Convolutional neural networks

Here, we focus on the convolutional neural network (CNN) presented in our paper. A CNN works by directly processing matrices of data that are input into the network; these matrices could represent images or time series. We present a toy example of how this works in Figure 2. Data processing is accomplished by multiplying the data matrix with a 'filter', a smaller matrix of parameters that are calibrated when fitting the model. Each filter is applied to the entire input data matrix, resulting in a processed matrix called a 'feature map'. By calibrating the parameters of the filters in a suitable manner, CNNs can derive feature maps that represent important characteristics of the input data. See Figure 2's caption for more detail.
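The short NumPy sketch below mimics the kind of toy example shown in Figure 2: a small filter slides across a data matrix, and each entry of the resulting feature map is the sum of the element-wise product of the filter with one patch of the input. The specific numbers are invented for illustration.

```python
import numpy as np

# Toy example: slide a 2x2 filter across a 3x4 data matrix to produce
# a 2x3 feature map. In a fitted CNN the filter values are learned.
data = np.array([[1., 0., 2., 1.],
                 [0., 1., 1., 0.],
                 [2., 1., 0., 1.]])
filt = np.array([[1., -1.],
                 [-1., 1.]])

rows = data.shape[0] - filt.shape[0] + 1
cols = data.shape[1] - filt.shape[1] + 1
feature_map = np.zeros((rows, cols))
for i in range(rows):
    for j in range(cols):
        patch = data[i:i + filt.shape[0], j:j + filt.shape[1]]
        feature_map[i, j] = (patch * filt).sum()

print(feature_map)   # the processed matrix: one value per filter position
```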

[Figure 2: Toy example of a convolutional filter applied to an input data matrix to produce a feature map.]

Defining the model

The CNN we apply for mortality forecasting works in a similar manner: we populate a matrix with mortality rates at ages 0-99, observed over 10 years for each population and gender. This matrix is processed by multiplying the observed values of mortality rates with filters that span the entire age range of the matrix and extend over three years, as shown in the top part of Figure 3. The filters derive a feature map that feeds into the rest of the model.

We also provide the model with variables representing the country being analysed and the gender of the population. To encode these variables, we apply a technique that maps categorical variables to low-dimensional vectors, called embeddings. In other words, each level of the categorical variable is mapped to a vector of new parameters; here we use a five-dimensional embedding layer, shown in the middle part of Figure 3.

Finally, we use the feature map and the embeddings directly in a GLM to forecast mortality rates in the next year. No other model components process the features before they enter the GLM. This is represented in the last part of Figure 3, which shows the direct connection of the output of the network to the feature layer.
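As an illustration, such a model can be assembled in a deep learning framework along the following lines. The sketch follows the description above: filters spanning the full age range over a three-year window, five-dimensional embeddings for country and gender, and a direct GLM-style output layer. Details not stated in the text, such as the number of filters and the loss function, are illustrative choices rather than the exact specification in Perla et al. (2020).

```python
import tensorflow as tf
from tensorflow.keras import layers

N_COUNTRIES, N_GENDERS = 41, 2   # HMD countries; two genders

# Input: mortality rates over 10 years at ages 0-99, as one 10 x 100 matrix.
rates_in = layers.Input(shape=(10, 100, 1), name="mortality_matrix")

# Filters span the full age range (100 ages) and extend over three years.
# The number of filters (32) is an illustrative choice.
conv = layers.Conv2D(filters=32, kernel_size=(3, 100))(rates_in)
feature_map = layers.Flatten()(conv)

# Five-dimensional embeddings encode the country and gender labels.
country_in = layers.Input(shape=(1,), name="country")
gender_in = layers.Input(shape=(1,), name="gender")
country_emb = layers.Flatten()(layers.Embedding(N_COUNTRIES, 5)(country_in))
gender_emb = layers.Flatten()(layers.Embedding(N_GENDERS, 5)(gender_in))

# The feature map and embeddings feed directly into a GLM-style output
# layer (linear predictor, exponential inverse link), with no further
# processing: one forecast mortality rate per age for the next year.
z = layers.Concatenate()([feature_map, country_emb, gender_emb])
rates_out = layers.Dense(100, activation="exponential")(z)

model = tf.keras.Model([rates_in, country_in, gender_in], rates_out)
model.compile(optimizer="adam", loss="mse")  # loss choice is illustrative
```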

[Figure 3: The network architecture: a convolutional layer processing the mortality matrix (top), five-dimensional embedding layers for country and gender (middle), and the GLM output layer (bottom).]

Results

We calibrated this model to the mortality experience in the HMD in the years 1950-1999 and tested the out-of-sample forecasting performance of the model on the experience in the years 2000-2016. The benchmarks against which the model was tested were the original LC model and the deep learning model from Richman and Wüthrich (2019), which is constructed without a processing layer geared towards time series data. We found that the out-of-sample forecasts were more accurate than those of the LC model in 75 out of 76 cases, and significantly outperformed the deep learning model. Residuals from the models are shown in Figure 4, indicating that while both deep learning models have better forecasting performance than the LC model, the CNN model fits the data for males significantly better than any other model. In the paper, we also show that the CNN model works well on the data in the USMD without any modifications.
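As an illustration of how such an out-of-sample comparison can be scored, the sketch below counts, population by population, how often one model's forecasts beat another's on mean squared error over the test years. The data here are random stand-ins, used only to show the shape of the calculation.

```python
import numpy as np

rng = np.random.default_rng(0)

def test_mse(observed, forecast):
    """Mean squared error over the test years for one population."""
    return np.mean((observed - forecast) ** 2)

# Random placeholders for the real data: for each of 76 populations,
# an (ages x test-years) matrix of observed and forecast log mortality
# rates (100 ages, 17 test years for 2000-2016).
populations = range(76)
observed = {p: rng.normal(size=(100, 17)) for p in populations}
lc_fcst = {p: rng.normal(size=(100, 17)) for p in populations}
cnn_fcst = {p: rng.normal(size=(100, 17)) for p in populations}

# Count how often the CNN forecasts beat the LC forecasts out of sample.
cnn_wins = sum(
    test_mse(observed[p], cnn_fcst[p]) < test_mse(observed[p], lc_fcst[p])
    for p in populations
)
print(f"CNN more accurate in {cnn_wins} of {len(populations)} populations")
```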

[Figure 4: Residuals of the Lee-Carter, deep learning and CNN models.]

Interpretation within the Lee-Carter paradigm

Deep learning has been criticised as difficult to interpret. We can provide an intuitive explanation of how the convolutional model works in the framework of the LC paradigm for mortality forecasting. As mentioned above, the three sets of features derived with the neural network – relating to population, gender and those derived using the convolutional network – are used directly in a GLM to forecast mortality. We show this mathematically using simplified notation in the following equation:

$$\log \hat{\mu}_{x,t+1} = \hat{a}^{(p)}_x + \hat{a}^{(g)}_x + \hat{b}_x \hat{k}_t$$

This states that the neural network predicts mortality based on new variables that have been estimated from the data, denoted with 'hats'. The first two of these, $\hat{a}^{(p)}_x$ and $\hat{a}^{(g)}_x$, play the role of estimating the average mortality for the population $p$ and gender $g$ under consideration, respectively, and in combination are equivalent to the $a_x$ term in the Lee-Carter model. The third of these variables, $\hat{k}_t$, is a time index derived directly from the mortality data, which is equivalent to the $k_t$ term in the LC model. This time index is recalibrated each time new data are fed to the network, meaning we have eliminated the two-stage procedure of fitting the model and then producing forecasts through extrapolation.

The seemingly complex model presented can therefore be interpreted in terms that are familiar to actuaries working in mortality forecasting.

Francesca Perla is professor of financial mathematics at Parthenope University of Naples

Ronald Richman is an associate director (R&D and Special Projects) at QED Actuaries and Consultants

Salvatore Scognamiglio is a postdoctoral research fellow at Parthenope University of Naples

Mario Wüthrich is professor of actuarial science at ETH Zürich

Picture Credit | IKON
This article appeared in our August 2020 issue of The Actuary.