Andrei Badescu, Tsz Chai Fung, X Sheldon Lin and Spark Tseung present a flexible nonlinear regression model and software for insurance risk classification, ratemaking and reserving

Generalised linear models (GLMs) or, more generally, generalised additive models (GAMs), have been widely used as regression models in insurance – particularly for ratemaking and reserving in general insurance. Their popularity lies in their capability to work with granular insurance data, their simple and easy-to-interpret linear structures, and the wide availability of software that can implement these models.
However, GLMs/GAMs can perform quite poorly in many scenarios. This issue is often encountered in large insurance portfolios, which are rarely homogeneous, involving high-risk and low-risk policyholders who cannot be adequately modelled by a single parametric distribution. In addition, the underlying relationship between covariates (such as policy attributes) and responses (such as claims from the policy) may be so complex that linear structures will not provide a good approximation. For policies with multiple coverages (for example physical damage and bodily injury in auto insurance), GLMs/GAMs can only describe the marginal distributions, and additional modelling assumptions are needed to capture the dependence structure among various coverages.
To address some of these shortcomings, we propose a flexible nonlinear regression model – the logit-weighted reduced mixture-of-experts (LRMoE) model. The LRMoE classifies policyholders into various risk groups, then models each group using a parametric distribution. This gives the LRMoE the flexibility to capture a wide range of data phenomena. The model outperforms classical GLMs while retaining a relatively simple structure and providing intuitive model interpretation. It is also readily available for use as open-source software packages, written in both Julia (a high-performance programming language) and R.
The LRMoE can be used to model both claim frequency and severity for either single or multiple insurance coverages, taking into account the dependence structures that may exist among them. Using a simple claim frequency example, we will illustrate the LRMoE’s performance in comparison with the classical GLM.
Where does the Poisson GLM fail?
We begin by revisiting the commonly used Poisson GLM on an Australian automobile claim dataset (available as ‘ausprivauto0405’ in the R package ‘CASdatasets’). We fit the claim frequency to the Poisson GLM, using a set of covariates that include driver gender and age, and vehicle type, weight and price. The empirical and fitted distributions of claim frequency are shown in Figure 1.
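For reference, a Poisson GLM of this kind with the usual log link can be written as below. The notation (Y_i for the claim count of policyholder i, x_i for the covariate vector, β for the regression coefficients) is introduced here for illustration only, and an exposure offset, if one is used, would add a further term to the linear predictor.

```latex
Y_i \mid \mathbf{x}_i \sim \mathrm{Poisson}(\mu_i),
\qquad
\log \mu_i = \mathbf{x}_i^{\top} \boldsymbol{\beta}
\quad\Longleftrightarrow\quad
\mu_i = \exp\!\left(\mathbf{x}_i^{\top} \boldsymbol{\beta}\right).
```

Every policyholder is described by a single Poisson distribution whose mean depends on the covariates only through an exponential function.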

While the Poisson GLM gives a reasonably good fit on the probabilities of zero and one claims, the tail is severely under-fitted. This is not surprising, given that most policyholders have at most one claim, and these observations play a dominant role when fitting a GLM. However, actuaries should be more concerned about getting a better fit for the tail of the distribution, where a misfit may produce severe losses. In addition, an ideal model would help practitioners identify exactly which policyholders are riskier than others, leading to more appropriate ratemaking and reserving estimates.
The Poisson GLM fails to address these needs, due to the lack of goodness-of-fit in the tail of the distribution. Besides, for automobile insurance, policyholders with a known claim history are usually more likely to file claims again in the future. When working with a simple GLM (the Poisson GLM in our case), there isn’t a straightforward and systematic method for incorporating past claim history into ratemaking for the future (although it is still possible – for example by manually imposing a risk premium).
Despite its simplicity and ease of implementation, the classical GLM suffers from many other flaws that potentially produce poor future predictions. Next, we present an illustration of our proposed LRMoE class, which proves to be a better candidate model for use in real applications.
A flexible and intuitive approach
The LRMoE model resembles a divide-and-conquer approach. Given the covariates, each policyholder is first classified by a logistic regression (or softmax function) into one of n possible latent risk groups (n = 3 in our case). Each latent group is then associated with a distribution at the discretion of the actuary (in this case, for comparison reasons, we choose Poisson), the so-called ‘expert function’, for modelling claim frequency.
Consequently, the claim frequency of each policyholder is described by a mixture of three Poisson distributions instead of one, with the parameters shown in Figure 2. While the same expert functions are shared by all policyholders, the latent group probabilities (p1, p2 and p3) vary according to the covariates of the policyholders, resulting in potentially different frequency distributions across policyholders.
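In symbols, the structure just described can be sketched as follows. The notation is an assumption made for illustration: λ_j denotes the mean of the j-th Poisson expert, α_j the logistic regression coefficients of latent group j, and x the covariate vector of a policyholder.

```latex
P(Y = y \mid \mathbf{x})
  = \sum_{j=1}^{3} p_j(\mathbf{x}) \, \frac{e^{-\lambda_j} \lambda_j^{\,y}}{y!},
\qquad
p_j(\mathbf{x})
  = \frac{\exp\!\left(\boldsymbol{\alpha}_j^{\top} \mathbf{x}\right)}
         {\sum_{k=1}^{3} \exp\!\left(\boldsymbol{\alpha}_k^{\top} \mathbf{x}\right)} .
```

The expert means λ_j do not depend on x, which corresponds to the experts being shared by all policyholders; only the weights p_j(x) change from one policyholder to another.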
The mixture structure greatly improves the modelling flexibility, as seen in Figure 1. The LRMoE model not only fits the main body (zero or one claim) well, but also provides a better fit for the tail (two or more claims) when compared with Poisson. Moreover, the LRMoE model has an intuitive interpretation. The fitted parameters in Figure 2 indicate the presence of three distinct risk groups, with average claim frequency being 0.34, 0.13 and 0.08, meaning the riskiest policyholders (dark yellow) could be four times riskier than the safest (white).
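To see why a mixture helps in the tail, the following minimal Python sketch compares a three-component Poisson mixture with a single Poisson distribution that has the same mean. The expert means are the rounded values quoted above; the latent group probabilities are hypothetical, chosen only for illustration, and the code does not use the LRMoE packages.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of observing k claims when the mean is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Expert means as quoted in the text for Figure 2 (rounded).
EXPERT_MEANS = [0.34, 0.13, 0.08]

# Hypothetical latent group probabilities (p1, p2, p3) for one policyholder.
# In a fitted model these would come from the logistic gating function.
GROUP_PROBS = [0.25, 0.40, 0.35]

def mixture_pmf(k, probs, means):
    """Probability of k claims under the Poisson mixture."""
    return sum(p * poisson_pmf(k, lam) for p, lam in zip(probs, means))

def mixture_mean(probs, means):
    """Average claim frequency: the probability-weighted expert means."""
    return sum(p * lam for p, lam in zip(probs, means))

def tail_prob(pmf, k_min):
    """P(Y >= k_min), computed as 1 - P(Y < k_min)."""
    return 1.0 - sum(pmf(k) for k in range(k_min))

mean = mixture_mean(GROUP_PROBS, EXPERT_MEANS)
mix_tail = tail_prob(lambda k: mixture_pmf(k, GROUP_PROBS, EXPERT_MEANS), 2)
poi_tail = tail_prob(lambda k: poisson_pmf(k, mean), 2)

print(f"common mean:                 {mean:.4f}")
print(f"P(2+ claims), mixture:       {mix_tail:.5f}")
print(f"P(2+ claims), single Poisson: {poi_tail:.5f}")
```

With the same average claim frequency, the mixture puts more probability on two or more claims than the single Poisson does, because part of the portfolio weight sits on the high-frequency expert.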

Figure 3: Change of latent group probabilities of selected policyholders.
For ratemaking purposes, information about past claims can be easily used by the LRMoE model, providing an accurate and up-to-date description of the policyholder’s risk profile. To illustrate this, let us consider two policyholders, A and B. Based on their covariates only, for policyholder A the LRMoE model fits an average claim frequency of 0.1738, while for policyholder B the fitted average claim frequency is 0.1642. We may conclude that both A and B have a similar degree of riskiness (mid-risk). Now let’s bring claim history into play. Assume that over the course of time, we observe no claims from policyholder A and four claims from policyholder B. Incorporating this information into LRMoE, the model predicts the new average claim frequency for A to be 0.1590 – slightly lower than before, given a safe driving history. On the other hand, if policyholder B’s policy is renewed, LRMoE suggests an average claim frequency of 0.3383 – a significant increase, due to a dangerous driving history. This puts policyholder B in the high-risk class.
Here is what happens behind the scenes. Essentially, a policyholder’s claim history would influence the latent group probabilities (p1, p2 and p3), changing the resulting frequency distribution and the model prediction. Figure 3 shows how the risk profiles of A and B have been affected by their claim history, which is consistent with the numbers above.
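One way to picture this mechanism is a Bayes' rule update of the latent group probabilities, sketched below in Python. This is an illustration under stated assumptions rather than the packages' implementation: the prior probabilities are hypothetical, the expert means are the rounded values quoted earlier, and the claim history is treated as a single observation period, so the output is not expected to reproduce the figures for policyholders A and B.

```python
import math

# Expert means as quoted in the text for Figure 2 (rounded).
EXPERT_MEANS = [0.34, 0.13, 0.08]

def poisson_pmf(k, lam):
    """Poisson probability of observing k claims when the mean is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def update_group_probs(prior_probs, claim_counts, means=EXPERT_MEANS):
    """Posterior group probabilities given past claim counts.

    Bayes' rule: posterior_j is proportional to
    prior_j * product over periods of Poisson(y_t; lambda_j).
    """
    weights = []
    for p, lam in zip(prior_probs, means):
        likelihood = math.prod(poisson_pmf(y, lam) for y in claim_counts)
        weights.append(p * likelihood)
    total = sum(weights)
    return [w / total for w in weights]

def expected_frequency(probs, means=EXPERT_MEANS):
    """Average claim frequency implied by the group probabilities."""
    return sum(p * lam for p, lam in zip(probs, means))

# Hypothetical prior probabilities (p1, p2, p3) from the covariates alone.
prior = [0.25, 0.40, 0.35]

post_no_claims = update_group_probs(prior, [0])    # clean history
post_four_claims = update_group_probs(prior, [4])  # four claims observed

for label, probs in [("prior", prior),
                     ("after 0 claims", post_no_claims),
                     ("after 4 claims", post_four_claims)]:
    rounded = [round(p, 3) for p in probs]
    print(f"{label:15s} probs={rounded} mean={expected_frequency(probs):.4f}")
```

A clean history shifts weight slightly towards the low-frequency groups, while four claims move almost all of the weight onto the high-frequency group, which is the same qualitative pattern as described for A and B.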

Finally, we look at how the LRMoE models the nonlinear relationship between covariates and responses. Figure 4 illustrates the influence of vehicle price on average claim frequency and latent group probabilities (p1, p2 and p3). For the claim frequency, the Poisson GLM is restricted to an exponential relationship defined by its model structure, while the curve given by LRMoE could be of any shape (although this is not obvious here). For latent group probabilities, each of the three curves is part of a logistic function (or sigmoid function), which is commonly used in machine learning for modelling nonlinear relationships.
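The contrast between the two mean curves can be sketched numerically. In the Python example below, all coefficients are hypothetical and the covariate is a standardised stand-in for vehicle price; only the expert means are taken from the text. It is meant to show the shape of each curve, not to reproduce Figure 4.

```python
import math

# Expert means as quoted in the text for Figure 2 (rounded).
EXPERT_MEANS = [0.34, 0.13, 0.08]

# Hypothetical gating coefficients (intercept, slope) for each latent group.
# The last group is the reference class, with coefficients fixed at zero.
GATING = [(-1.0, 0.8), (0.5, -0.6), (0.0, 0.0)]

# Hypothetical Poisson GLM coefficients on the same standardised covariate.
GLM_INTERCEPT, GLM_SLOPE = -1.8, 0.15

def group_probs(x):
    """Softmax of the linear scores: the latent group probabilities."""
    scores = [a + b * x for a, b in GATING]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def lrmoe_mean(x):
    """Mixture mean: probability-weighted average of the expert means."""
    return sum(p * lam for p, lam in zip(group_probs(x), EXPERT_MEANS))

def glm_mean(x):
    """Poisson GLM mean with a log link: always exponential in x."""
    return math.exp(GLM_INTERCEPT + GLM_SLOPE * x)

grid = [i / 2 for i in range(-8, 9)]  # covariate values from -4 to 4
for x in grid:
    probs = ", ".join(f"{p:.3f}" for p in group_probs(x))
    print(f"x={x:+.1f}  GLM={glm_mean(x):.4f}  "
          f"LRMoE={lrmoe_mean(x):.4f}  p=({probs})")
```

In this sketch the GLM mean keeps growing exponentially with the covariate, whereas the mixture mean bends and flattens as weight moves between groups, and it always stays between the smallest and largest expert means.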
What else can we do with LRMoE?
The analysis of the Australian automobile dataset is a toy example, showing that the new LRMoE model yields a better fit than the GLM while maintaining a relatively simple and interpretable model structure. More generally, using this flexible class of LRMoE models, we can analyse the complicated data structures that usually appear in actuarial applications. The theoretical properties of LRMoE, together with its computational tractability and statistical interpretability, make it well placed to deliver accurate predictions and to offer a significant improvement on classical GLMs.
We hope LRMoE will help insurance practitioners solve complex real-life problems. With this in mind, we have designed the LRMoE package, written in R, and the LRMoE.jl package, written in Julia. In addition to the functionalities presented here, the packages offer several features:
- A wide collection of distributions commonly used for insurance frequency and severity modelling. It is also possible for users to customise their own expert functions suited for a specific problem.
- Multivariate responses with potentially complex dependence structures can also be modelled. Such dependence structures cannot be easily accounted for in the classical GLM framework.
- Incomplete data is common in insurance contexts – for example data censoring due to policy limits and data truncation due to deductibles. Our package can conduct parameter estimation with incomplete data.
- A collection of functions is also provided for insurance ratemaking, risk management and model visualisation.
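To make the incomplete-data point concrete, the short Python sketch below shows the likelihood contribution of a single claim severity observed under a deductible (left truncation) and a policy limit (right censoring). It is a generic illustration, not the packages' code: the exponential severity distribution, the function names and the numbers are all assumptions chosen for simplicity.

```python
import math

def expo_cdf(y, mean):
    """Exponential distribution function with the given mean."""
    return 1.0 - math.exp(-y / mean) if y > 0 else 0.0

def expo_pdf(y, mean):
    """Exponential density with the given mean."""
    return math.exp(-y / mean) / mean if y >= 0 else 0.0

def loglik_contribution(loss, deductible, limit, mean):
    """Log-likelihood of one ground-up loss under a deductible and a limit.

    Losses below the deductible are never reported (left truncation), so
    the likelihood is conditioned on the loss exceeding the deductible.
    Losses at or above the limit are recorded as the limit (right
    censoring), so only the probability of exceeding the limit is known.
    """
    survival_at_deductible = 1.0 - expo_cdf(deductible, mean)
    if loss >= limit:
        numerator = 1.0 - expo_cdf(limit, mean)  # censored observation
    else:
        numerator = expo_pdf(loss, mean)         # fully observed loss
    return math.log(numerator) - math.log(survival_at_deductible)

# Hypothetical policy terms and severity mean.
observed = loglik_contribution(loss=1500.0, deductible=500.0,
                               limit=5000.0, mean=1000.0)
censored = loglik_contribution(loss=5000.0, deductible=500.0,
                               limit=5000.0, mean=1000.0)
print(f"fully observed loss: {observed:.4f}")
print(f"censored at limit:   {censored:.4f}")
```

Summing such contributions over all claims gives a log-likelihood that uses the censored and truncated records correctly instead of discarding them or treating them as exact.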
The source code and documentation of the Julia package are available at github.com/sparktseung/LRMoE.jl and sparktseung.github.io/LRMoE.jl/dev, and implementation details may be found in the following paper: Tseung SC, Badescu A, Fung TC and Lin XS. LRMoE.jl: a software package for flexible actuarial loss modelling using mixture of experts regression model. Annals of Actuarial Science 2021; 15(2), 419-440.
For actuaries who are interested in the R package, see github.com/sparktseung/LRMoE and the paper: Tseung SC, Badescu A, Fung TC and Lin XS. LRMoE: an R package for flexible actuarial loss modelling using mixture of experts regression model, 2021 (available at bit.ly/LRMoE).
We highly recommend the Julia package, as Julia is a more efficient programming language, running roughly four times faster than R.
We hope our LRMoE model and software packages can help actuaries address some of the challenges in insurance modelling. We look forward to collaborations with insurance practitioners, and welcome feedback and suggestions to improve our work.
Dr Andrei Badescu, Professor in actuarial science, University of Toronto
Dr Tsz Chai Fung, Assistant professor in actuarial science, Georgia State University
Dr X Sheldon Lin, Professor in actuarial science, University of Toronto
Spark Tseung, PhD student in actuarial science, University of Toronto