Skip to main content
The Actuary: The magazine of the Institute and Faculty of Actuaries - return to the homepage Logo of The Actuary website
  • Search
  • Visit The Actuary Magazine on Facebook
  • Visit The Actuary Magazine on LinkedIn
  • Visit @TheActuaryMag on Twitter
Visit the website of the Institute and Faculty of Actuaries Logo of the Institute and Faculty of Actuaries

Main navigation

  • News
  • Features
    • General Features
    • Interviews
    • Students
    • Opinion
  • Topics
  • Knowledge
    • Business Skills
    • Careers
    • Events
    • Predictions by The Actuary
    • Whitepapers
    • Moody's - Climate Risk Insurers series
    • Webinars
    • Podcasts
  • Jobs
  • IFoA
    • CEO Comment
    • IFoA News
    • People & Social News
    • President Comment
  • Archive
Quick links:
  • Home
  • The Actuary Issues
  • June 2020
General Features

Insurance pricing: the proxy problem

Open-access content Thursday 4th June 2020
Authors
Mathias Lindholm
Ronald Richman
Andreas Tsanakas
Mario Wüthrich

If we want to avoid discrimination in insurance pricing,  we need to be able to take protected characteristics  into account, say Mathias Lindholm, Ronald Richman, Andreas Tsanakas and Mario Wüthrich 

web-p24-26-main-pic---.jpg

As advanced modelling methodologies become widely available to actuaries, the way models are used within financial services is increasingly constrained by legal developments and regulatory scrutiny. Two examples in the UK are the Financial Conduct Authority’s review of general insurance pricing practices and the Information Commissioner’s Office consultation on an AI auditing framework.

A more longstanding and familiar regulation is the EU gender discrimination directive, which requires that pricing models do not discriminate by gender. The risks of inadvertent discrimination with respect to protected characteristics seem to be higher in complex models than in simple ones, as complex models may exploit intricate patterns in data to derive proxies for, say, gender. In addition to the legal and regulatory risks, ethical concerns could arise if models were found to be using unacceptable proxies.

Defining discrimination in such an intuitive way may appear straightforward, but without a rigorous definition of discrimination, it becomes difficult to guarantee that pricing models are free of it. How can we make sure that illegal or unwanted discriminatory factors are not influencing the results of a model?

Our recent research paper ‘Discrimination-Free Insurance Pricing’ (bit.ly/2KLG5CK) proposes an approach to ensuring that the results of actuarial models are not influenced by protected characteristics. This proposed discrimination-free pricing method is a simple add-on to existing pricing methodologies and does not require major changes to insurers’ predictive models. It can remove discriminatory effects from all categories of pricing techniques currently in use, from generalised linear models (GLMs) to gradient boosting machines and deep neural networks.

Method

We start by taking it as given that protected characteristics such as gender are not used within pricing models as rating factors – meaning direct discrimination is avoided. What do we mean, then, when we say that a price may still be discriminatory? We illustrate our ideas with a simple stylised example; a full mathematical definition of discrimination in pricing can be found in our paper.

We consider the case of a simple pricing model for a health insurance portfolio. The two relevant covariates are the policyholder’s gender and age class. The portfolio population is split 50/50 between women and men – shown in Figure 1, together with the split across ages. In this example, 90% of policyholders in the younger age classes are female, with the reverse happening for older age classes.

Figure 2

 

The expected claims costs by age class and gender are shown by the grey and dark yellow lines in Figure 2; we can view these as ‘best-estimate prices’. We can see that, as age increases, claims costs for females and males diverge – in particular, claims costs for females become progressively higher. The question is: how should insurance be priced?

A common method for avoiding discrimination is simply ignoring gender. Then, the insurance rate for a policyholder of any gender at age 50 is nothing but the average cost of the corresponding age class. We call a price calculated in this way an ‘unawareness price’, shown by the black line in Figure 2. It is striking that the unawareness price is very close to the best-estimate price for women at lower ages, and then drops to nearly the best-estimate price for men at higher ages. This is due to the much higher prevalence of women within the lower age classes (90%), as we saw in Figure 1. The unawareness price uses age as a proxy for gender – to be precise, the calculation of unawareness prices implicitly relies on the conditional probability of gender, given age. In summary, ignoring gender in price calculation did not remove its impact on prices. This is indirect discrimination.

Figure 3

 

What should the price be for, say, a policyholder aged 50? We know that gender must somehow be allowed for in the calculation, since ignoring it leads to indirect discrimination. Furthermore, prices should lie somewhere between the extremes given by the grey and dark yellow lines in Figure 2; in particular, the price at age 50 should be a weighted average of the corresponding best-estimate prices for men and women. For age not to be a proxy for gender, these weights should not depend on the proportion of women in each age group. This means that ‘discrimination-free prices’ must be represented by a straight line in Figure 2. Finally, note that the overall population is split equally between men and women. This implies that a horizontal line, as depicted in pale yellow, is a suitable choice for a discrimination-free price.

This stylised example has been constructed with some care in order to clearly illustrate what can go wrong when unawareness prices are used. It shows that, in order to account for discriminatory characteristics, one needs to actually use the very same characteristics as part of the pricing procedure – recall that the intuitively discrimination-free prices we derived were based on best-estimate prices.

In many realistic situations, the differences between unawareness prices and discrimination-free prices may be smaller. Still, to be certain that no indirect discrimination takes place, we need a practical alternative to unawareness prices.

Interpretation

Our discrimination-free pricing formula can be derived by arguing from two distinct directions; we only give a summary of the technical arguments here. First, recall that insurance prices are generally calculated as conditional expectations of claims costs (given the rating factors available). These expectations are sometimes re-weighted, for example assigning a higher probability to some scenarios than the data would imply, in order to derive a profit-loaded premium. Our approach utilises a similar trick. However, for us the aim of the re-weighting is different: the statistical decoupling of discriminatory from non-discriminatory factors, without changing the structure of the predictive model underlying best-estimate prices. Specifically, if u(x,d) is the best-estimate price, depending on both the rating factors x and the discriminatory characteristics d, discrimination-free prices arise from ‘averaging out’ the discriminatory characteristics d: Σ u(x,d)P(d)

A second justification relies on causal inference, a branch of statistics that is attracting increasing public interest, partly due to Dana Mackenzie and Judea Pearl’s 2018 publication The Book of Why. Causal inference uses graphs to represent not just correlations in the data, but also the actual direction of causal effects, which are subsequently estimated from observational data. This allows users to assess the impact of changes in the values of chosen variables while stripping out confounding effects. The pricing formula we propose can (in some circumstances) be interpreted within the framework of causal inference – as representing the direct causal effect of the rating factors on the insurance experience, without confounding by other discriminatory characteristics such as gender.

“In order to account for discriminatory characteristics, one needs to actually use the very same characteristics as part of the pricing procedure”

Applications

We now consider the application of discrimination-free pricing in a more complex model of health insurance claims. We simulated data for three types of healthcare costs, based on the two rating factors of gender and smoking status: costs of birth-related injuries, only applying to females aged 20-40; cancer-related costs, which are higher for smokers; and all other healthcare costs. We also assumed that women are more likely to smoke than men. For the remaining assumptions used in this example, we refer to our paper.

We show the true claims costs (grey and dark yellow lines) based on the model underlying the simulated data in Figure 3, as well as the unawareness and discrimination-free prices. The best-estimate claims costs are consistently higher for women than for men. The unawareness prices for smokers are closer to the best-estimate prices for women, since, in our example, being a smoker is predictive of being a woman. Likewise, the unawareness prices for non-smokers are closer to the prices for men. On the other hand, the discrimination-free prices do not reflect the gender-based information contained in the smoking status – they only capture the direct effect of smoking on the (higher) level of claims produced.

Figure 4

 

Having applied the method to the true claims costs, we now investigate how well the method works on noisy simulated data. For this purpose, the claims of 100,000 policyholders were simulated using the claims cost model discussed, on the assumption that claims costs are distributed according to a Poisson distribution. We then fit a deep neural network to the simulated data to act as our pricing model, considering both gender and smoking status (to derive best-estimate prices) and subsequently estimate discrimination-free prices. Furthermore, we recalibrate the network using only the smoking rating factor, to derive unawareness prices. The predictions from these models are shown in Figure 4.  

Figure 4

It can be seen that the deep neural networks successfully approximate the true claims costs, and that both the discrimination-free and unawareness prices are similar to the true values shown in Figure 3. This leads us to conclude that the method of producing discrimination-free prices works well in the given model.

Avoiding bias(when avoiding bias)

A basic requirement of a good pricing model is that the total costs predicted by the model should be equal to the expected total costs from the portfolio under consideration. Most actuarial models (such as GLMs) fulfil this requirement, but it can be shown that the discrimination-free prices introduced in this article do not. A correction to these prices for this bias is therefore required, the simplest option being pro-rata adjustment.

Conclusions

We have proposed an easily implementable method for removing the effects of discrimination from pricing models by removing the proxying of characteristics such as gender by other covariates. We have provided examples showing that ignoring discriminatory characteristics does not lead to discrimination-free prices, meaning that unawareness prices ignore the wrong thing. Instead, we should include discriminatory characteristics in a model and remove their effect afterwards. Our proposal works for any kind of predictive model – from GLMs to neural networks – and can thus be applied as an add-on to existing pricing models used by actuaries. Mathematical details can be found in our paper.

What our method requires is data on characteristics, whose use may be considered discriminatory. Many such characteristics are not recorded by companies, so development of this work must consider how to overcome this problem. Our claim is that information on discriminatory characteristics is necessary to remove discrimination from pricing. While the technical foundation for this idea is solid, communicating it may not be easy, particularly in view of concerns around privacy.

We have not tried to define which factors should be treated as discriminatory – a societal question beyond our analysis. We recommend that companies should assess whether any rating factors currently used in pricing models might be functioning as problematic proxies from a legal, regulatory or ethical perspective. An example is the use of postal code information within models, since postal codes can correlate highly with ethnicity – 
by applying our method, it might be possible to provide insurance at a more reasonable cost to groups that may have been disadvantaged in the past. This indicates that the broader implications of discrimination-free pricing within specific markets should be considered. 

Dr Mathias Lindholm is an associate professor in mathematical statistics at Stockholm University

Ronald Richman is an associate director (R&D and Special Projects) at QED Actuaries and Consultants

Professor Andreas Tsanakas is professor of risk management at Cass Business School, City, University of London

Professor Mario Wüthrich is professor for actuarial science in the Department of Mathematics at ETH Zürich

image.jpeg
This article appeared in our June 2020 issue of The Actuary .
Click here to view this issue

You may also be interested in...

web-p29-29-Web-©-Ikon-00024455-1.jpg

Stochastic claims reserving models: taking the one-year view

Stochastic claims reserving models are not widely understood – but Robert Scarth wants to change this. Here, he discusses models for one-year risk
Thursday 4th June 2020
Open-access content
web-p36--37-Code-breaking--Marian-Rejewski's-bench-Alamy--KW19FF.jpg

Polish cryptologists: unsung heroes

David Forfar discusses the role of Polish cryptologists – including a would-be actuary – in cracking Enigma during the Second World War
Thursday 4th June 2020
Open-access content
web--p16-18-Balance--Actuaries-protecting-pot-of-gold-illustration--Ikon---00004941.jpg

Striking a balance at the PPF

Stephen Wilcox explains how the PPF manages risks so it can protect millions of members of UK defined benefit pension schemes
Thursday 4th June 2020
Open-access content
p32-34_-Ikon-00017008.jpg

Clarity on climate

There is considerable uncertainty in the output of climate change risk and exposure data – how should decision-makers cut through all this? Russ Bowdrey explains.
Thursday 7th May 2020
Open-access content
iStock-479989332-[Converted].jpg

A fine scale

Elena Kulinskaya, Ilyas Bakbergenuly and Lisanne Gitsels revisit longevity modelling
Thursday 7th May 2020
Open-access content
coverlines

Modelling minds

In a follow-up feature on risk management and personality, Geoff Trickey looks at the significance of risk types in personality in the response to COVID-19
Thursday 7th May 2020
Open-access content

Latest from Modelling/software

EG\

Uneven outcomes: findings on cancer mortality

Ayşe Arık, Andrew Cairns, Erengul Dodd, Adam Shao and George Streftaris share their findings on the impact of socio-economic differences and diagnostic delays on cancer mortality
Wednesday 1st June 2022
Open-access content
dtj

Talking census: making use of data

With the ONS starting to release the data from the 2021 census, Jeremy Keating considers how those working in insurance can make use of it
Wednesday 1st June 2022
Open-access content
hrts

Storm watch: Can IPCC models be used in cat modelling?

Can IPCC projections be used to adjust catastrophe models for climate change? Nigel Winspear and David Maneval investigate, using US hurricanes as an example
Wednesday 1st June 2022
Open-access content

Latest from General Insurance

td

Brain power

The latest microchips mimic cerebral function. Smaller, faster and more efficient than their predecessors, they have the potential to save lives and help insurers, argues Amarnath Suggu
Wednesday 1st March 2023
Open-access content
bl

'Takaful' models of Islamic insurance

Ethical, varied and a growing market – ‘takaful’ Islamic insurance is worth knowing about, wherever you’re from and whatever your beliefs, says Ali Asghar Bhuriwala
Wednesday 1st February 2023
Open-access content
il

When 'human' isn't female

It was only last year that the first anatomically correct female crash test dummy was created. With so much data still based on the male perspective, are we truly meeting all consumer needs? Adél Drew discusses her thoughts, based on the book Invisible Women by Caroline Criado Perez
Wednesday 1st February 2023
Open-access content

Latest from General Features

yguk

Is anybody out there?

There’s no point speaking if no one hears you. Effective communication starts with silence – this is the understated art of listening, says Tan Suee Chieh
Thursday 2nd March 2023
Open-access content
ers

By halves

Reducing the pensions gap between men and women is a work in progress – and there’s still a long way to go, with women retiring on 50% less than men, says Alexandra Miles
Thursday 2nd March 2023
Open-access content
web_Question-mark-lightbulbs_credit_iStock-1348235111.png

Figuring it out

Psychologist Wendy Johnson recalls how qualifying as an actuary and running her own consultancy in the US allowed her to overcome shyness and gave her essential skills for life
Wednesday 1st March 2023
Open-access content

Latest from Ronald Richman

Graphic

Objective automation in reserving

Is there a role for automation in selecting a reserving technique? Caesar Balona and Ronald Richman share their thoughts
Wednesday 7th April 2021
Open-access content

Latest from June 2020

web-UNP-Redactive-39939-John-Taylor-Glasgow.007.jpg

Wellbeing: behind the curtain

During what is proving to be a difficult and stressful time for many John Taylor reflects on his own mental health experiences, and the value of reaching out
Thursday 4th June 2020
Open-access content
web-p21-Brian-Hey-Prize--Trophies-and-medal-prices--Ikon---00010059.jpg

Brian Hey: eyes on the prize

Joseph Lo looks back on some past winners of the Brian Hey Prize
Thursday 4th June 2020
Open-access content
web-p20-22-©IKON--00024562-1n-copy.jpg

Supervision reforms in the frame

Pau Ling Yap and Paolo Cadoni on the recently agreed reforms that aim to improve supervision of internationally active insurance groups
Thursday 4th June 2020
Open-access content
Share
  • Twitter
  • Facebook
  • Linked in
  • Mail
  • Print

Latest Jobs

Actuarial Manager - Non-Life Reserving

London (Central)
£ excellent package
Reference
148614

Senior Reinsurance Pricing Actuary

London, England
£130000 - £155000 per annum
Reference
148611

Capital Analytics Manager

London, England
£80000 - £120000 per annum
Reference
148610
See all jobs »
 
 
 
 

Sign up to our newsletter

News, jobs and updates

Sign up

Subscribe to The Actuary

Receive the print edition straight to your door

Subscribe
Spread-iPad-slantB-june.png

Topics

  • Data Science
  • Investment
  • Risk & ERM
  • Pensions
  • Environment
  • Soft skills
  • General Insurance
  • Regulation Standards
  • Health care
  • Technology
  • Reinsurance
  • Global
  • Life insurance
​
FOLLOW US
The Actuary on LinkedIn
@TheActuaryMag on Twitter
Facebook: The Actuary Magazine
CONTACT US
The Actuary
Tel: (+44) 020 7880 6200
​

IFoA

About IFoA
Become an actuary
IFoA Events
About membership

Information

Privacy Policy
Terms & Conditions
Cookie Policy
Think Green

Get in touch

Contact us
Advertise with us
Subscribe to The Actuary Magazine
Contribute

The Actuary Jobs

Actuarial job search
Pensions jobs
General insurance jobs
Solvency II jobs

© 2023 The Actuary. The Actuary is published on behalf of the Institute and Faculty of Actuaries by Redactive Publishing Limited. All rights reserved. Reproduction of any part is not allowed without written permission.

Redactive Media Group Ltd, 71-75 Shelton Street, London WC2H 9JQ