The magazine of the Institute & Faculty of Actuaries

# Annuitant mortality and GLMs

Over the past few years the field of annuitant mortality has become one of the most discussed actuarial subjects, to the extent that even financial journalists routinely touch on the matter. A small change in expected mortality rates can have a substantial effect on an insurer’s embedded value and solvency, or the solvency position of a pension fund, and insurers actively writing annuities will of course be aware that such changes can have disproportionate effects on the competitiveness and profitability of their business.
A further complication arises when considering mortality improvements, again an area where apparently small changes can have large effects on results. Most of the debate about annuitant mortality revolves around how these improvements should be allowed for. Deciding on the right allowance involves quantifying the expected future effect on mortality of year of birth, but how are companies to analyse, simultaneously, year of birth and age of policyholder?
In this article we outline how generalised linear models (GLMs) can assist with the analysis of annuitant mortality, and we show some of the interesting results that can be observed. GLMs are well-established tools in analysing claims experience in general insurance personal lines business. Such analysis is usually the first step in pricing such business. As we discuss below, many of the reasons why GLMs are used to model, for instance, motor third-party claim frequency are also valid reasons for their use in modelling the mortality of annuitants.

## GLMs: what, why, when?
The mathematical structure of GLMs is beyond the scope of this article. For our present purposes, it is enough to be aware that a GLM can model any observed event (for instance, motor third-party claim frequency, or annuitant deaths) as a function of various factors (for instance, age, sex, and occupation) in such a way that:
– a multiplicative relationship is modelled: for example, we can model
Probability of event (eg death in year) = Base level for observed population × Factor 1 (based on age) × Factor 2 (based on sex) × Factor 3 (based on occupation) × …;
– the model takes account of the underlying probabilistic process involved;
– the model takes account of correlations in the data.
This last property, that correlations are automatically taken account of, is one of the major advantages of GLMs over normal ‘one-way’ analyses in which the experience of a portfolio is considered from the point of view of just one factor (in isolation) at a time, any correlations between factors being ignored. With portfolios of annuity data, it is common for correlations to exist between, for instance, annuity amount, occupation and region: any analysis that fails to allow for such correlations is likely to produce misleading results.
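The effect of such correlations can be made concrete with a small simulation. The sketch below uses purely illustrative figures (not the article's data) and fits a log-link Poisson GLM by iteratively reweighted least squares to a two-factor grid in which manual workers are concentrated in the younger age band. The GLM recovers the true multiplicative occupation factor; a one-way analysis of occupation is distorted by the age mix.

```python
import numpy as np

# Illustrative two-factor grid: age band (young/old) x occupation
# (non-manual/manual), with manual workers concentrated in the younger
# band so that age and occupation are correlated in the exposure.
# Cell order: (young, non-manual), (young, manual), (old, non-manual), (old, manual)
exposure = np.array([20_000.0, 80_000.0, 80_000.0, 20_000.0])  # man-years
true_rate = 0.01 * np.array([1.0, 1.0, 2.0, 2.0]) * np.array([1.0, 1.3, 1.0, 1.3])
rng = np.random.default_rng(1)
deaths = rng.poisson(exposure * true_rate)

# Design matrix: intercept, 'old' dummy, 'manual' dummy
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)

# Log-link Poisson GLM fitted by iteratively reweighted least squares;
# exposure enters multiplicatively via mu = exposure * exp(X @ beta)
beta = np.zeros(3)
for _ in range(25):
    mu = exposure * np.exp(X @ beta)       # fitted deaths per cell
    z = X @ beta + (deaths - mu) / mu      # working response
    beta = np.linalg.solve((X.T * mu) @ X, (X.T * mu) @ z)

age_factor, occ_factor = np.exp(beta[1]), np.exp(beta[2])

# One-way view of occupation: crude rate ratio ignoring the age mix
one_way = (deaths[[1, 3]].sum() / exposure[[1, 3]].sum()) / (
    deaths[[0, 2]].sum() / exposure[[0, 2]].sum()
)
print(f"GLM occupation factor: {occ_factor:.2f} (true value 1.30)")
print(f"One-way occupation factor: {one_way:.2f} (distorted by the age mix)")
```

Because the manual lives are concentrated at younger, lighter-mortality ages, the one-way ratio comes out below 1, suggesting (wrongly) that manual occupations carry lighter mortality, while the GLM attributes the age and occupation effects correctly.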
Other notable advantages of GLMs are their transparency (it is simple to see what is going on in the models, and simple to calculate statistical ‘diagnostics’ which quantify the model’s validity), their robustness (small changes in the data lead to correspondingly small changes in the results), and their acceptance in the insurance sector (most major composites are likely to have GLM resources available on the non-life side).
When, then, can GLMs be used? This brings us to what can be a disadvantage of GLMs: the requirement for a substantial body of data. A good benchmark here is that the portfolio should involve at least 1,000 or so of the events to be analysed (in this context, annuitant deaths) over the period in question. This requirement explains why GLMs have made relatively little headway in mortality studies for ‘normal’ ages, where annual probabilities of death of the order of 1 per mille lead to a requirement for very large portfolios. However, in the context of annuitant mortality, where these probabilities are of the order of 1%, a company could have an acceptable database for analysis with a portfolio of as few as 15,000–20,000 policies looked at over five or so years.
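A quick order-of-magnitude check shows why a portfolio of that size meets the benchmark (the figures below are illustrative mid-range values):

```python
# Rough check of the "1,000 deaths" benchmark for an annuity portfolio
policies = 17_500   # mid-point of the 15,000-20,000 range
years = 5           # period of investigation
q = 0.01            # order-of-magnitude annual probability of death for annuitants

expected_deaths = policies * years * q
print(expected_deaths)  # 875.0, close to the 1,000-death benchmark
```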

## Case studies
It may be interesting to look at some of the results from a recent study we conducted on annuitants belonging to a large UK pension fund. The data allowed us to analyse around 300,000 ‘man-years’ of experience, with information available on the following factors (note that we are discussing here only male lives):
– age
– year of birth
– marital status
– pension size
– postcode
– retirement type
In addition to these factors, it is normal to break all of the annuity records down into the constituent calendar years of experience, thereby allowing the creation of a ‘calendar year’ factor. The model can use this factor to take account of any overall trends in the experience. This allows GLM analyses to be based on a longer period than the 3–4-year period generally considered a safe maximum for simpler investigations, and so allows companies to learn more from their data.
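As a sketch of this breakdown (the function name and the use of fractional-year exposures are illustrative, not from any particular system), each record's in-force period can be split into calendar-year cells:

```python
from datetime import date

def calendar_year_split(start: date, end: date):
    """Split one annuity record's in-force period into (year, exposure)
    pairs, with exposure measured in fractions of a calendar year.
    A sketch: a real investigation would also carry age and the other
    rating factors into each cell."""
    rows = []
    for year in range(start.year, end.year + 1):
        a = max(start, date(year, 1, 1))            # first in-force day this year
        b = min(end, date(year, 12, 31))            # last in-force day this year
        days_in_year = (date(year + 1, 1, 1) - date(year, 1, 1)).days
        exposure = ((b - a).days + 1) / days_in_year
        rows.append((year, round(exposure, 3)))
    return rows

# A policy in force from mid-2001 to early 2004 contributes four
# calendar-year cells, letting 'calendar year' enter the GLM as a factor
print(calendar_year_split(date(2001, 7, 1), date(2004, 2, 15)))
# [(2001, 0.504), (2002, 1.0), (2003, 1.0), (2004, 0.126)]
```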

## The iterative process
First, how were the results derived? Analyses of this type are heavily iterative. Initially, all possibly relevant factors are looked at, with the factors broken down into the most detailed groupings practicable (for instance, age might be broken down into individual years). The results of each run will lead to such conclusions as:
– Factor ‘X’ is not statistically significant in explaining the event being analysed, and should be dropped from the analysis.
– The groupings used for factor ‘Y’ are inappropriate and should be changed (or, when using splines, the degree of complexity of the spline should be changed for that factor).
After several runs, making such changes to the model specification at each stage in a gradual way, users will reach a model that seems to be the best ‘explanation’ of the observed experience. This final model can be surprising: factors which might have been assumed to be significant may in fact not be significant, and vice versa.
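A minimal sketch of one such decision, using a deviance (likelihood-ratio) test on simulated data: a candidate factor with no real effect should usually be dropped, because removing it increases the model deviance by less than the 5% chi-squared critical value of 3.84 on one degree of freedom. The fitting routine and all figures below are illustrative.

```python
import numpy as np

def fit_poisson(X, deaths, exposure, iters=25):
    """Fit a log-link Poisson GLM by IRLS; return coefficients and deviance."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = exposure * np.exp(X @ beta)
        z = X @ beta + (deaths - mu) / mu
        beta = np.linalg.solve((X.T * mu) @ X, (X.T * mu) @ z)
    mu = exposure * np.exp(X @ beta)
    with np.errstate(divide="ignore", invalid="ignore"):
        term = np.where(deaths > 0, deaths * np.log(deaths / mu), 0.0)
    deviance = 2.0 * np.sum(term - (deaths - mu))
    return beta, deviance

rng = np.random.default_rng(7)
n = 8
exposure = np.full(n, 5_000.0)                   # man-years per cell
age_old = np.array([0, 0, 0, 0, 1, 1, 1, 1])
factor_x = np.array([0, 1, 0, 1, 0, 1, 0, 1])    # candidate factor, no real effect
rate = 0.01 * np.where(age_old == 1, 2.0, 1.0)   # mortality depends on age only
deaths = rng.poisson(exposure * rate)

X_full = np.column_stack([np.ones(n), age_old, factor_x])
X_reduced = np.column_stack([np.ones(n), age_old])
_, dev_full = fit_poisson(X_full, deaths, exposure)
_, dev_reduced = fit_poisson(X_reduced, deaths, exposure)

# Dropping factor X frees one parameter; compare the deviance increase
# with the 5% chi-squared critical value on 1 degree of freedom (3.84)
drop = dev_reduced - dev_full
print(f"deviance increase from dropping factor X: {drop:.2f}")
print("significant at 5%" if drop > 3.84 else "not significant at 5%: drop factor X")
```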

## Specimen results
Space forbids a complete description of the results obtained, or the extra investigations that would normally be run (for instance, regarding possible interactions between factors), but it may be interesting to look at some of the results (albeit with a word of warning: the results of GLMs are only truly meaningful when the results for all factors are considered in conjunction).
As expected, the amount of the annuity was found to be strongly significant, as shown in figure 1, although the results (the green line) are markedly different from what would have been observed using a simple univariate analysis (the orange line). (Note that these results are shown in a ‘raw’ state, without further grouping of the data.)
Postcode (when grouped simply into geographical region) was found to show no significant effect. Our analysis showed that the ‘information’ in this factor, which is normally considered a proxy for socio-economic class and lifestyle, was being largely picked up by the annuity amount factor. However, other categorisations of postcode, for instance using commercially available lifestyle groupings, can be more predictive.
Retirements for reasons of redundancy were found to be associated with mortality approximately 10% greater than that of normal or voluntary early retirements. And, no surprise here, year of birth showed a very strong cohort effect, as shown in figure 2.
Such information is obviously useful in setting appropriate mortality assumptions, especially as regards the cohort effect. Unfortunately, the past is not a perfect guide to the future: a decision regarding (for instance) the expected mortality of an 80-year-old born in 1940 is not necessarily made reliable by analysing data in which all of the 80-year-olds were born before 1925. A large degree of subjectivity and judgement still remains in such cases.

## Calibrating against standard mortality
As well as using GLMs to indicate the most appropriate mortality rates with no prior constraints, we can also use the GLMs to model the observed mortality effect in excess of (or below) that predicted by any given standard mortality table.
In the case of this pension fund, if we base the model around the PMA92 table with medium cohort projections, the results for year of birth show a broadly flat pattern (beneath a lot of ‘random noise’), indicating that the shape of that mortality basis was a reasonable reflection of the cohort improvements underlying the scheme’s experience during the period considered (see figure 3).
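One way to sketch this technique in code (all figures below are illustrative; in practice the expected deaths would come from the standard table's rates applied to the exposure): the table's expected deaths enter a log-link Poisson model as an offset, so the fitted intercept is the overall actual-to-expected ratio, and any extra factors, such as year-of-birth dummies, measure residual shape relative to the table.

```python
import numpy as np

# Mortality relative to a standard table: feed the table's expected deaths
# into a log-link Poisson model as an offset, so fitted factors measure
# experience as a multiple of the table (1.00 = table exactly right).
actual = np.array([48.0, 95.0, 160.0, 210.0])     # observed deaths by age band
expected = np.array([50.0, 100.0, 150.0, 200.0])  # table-based expected deaths

# Intercept-only model: IRLS reduces to the overall actual/expected ratio
X = np.ones((4, 1))
beta = np.zeros(1)
for _ in range(25):
    mu = expected * np.exp(X @ beta)
    z = X @ beta + (actual - mu) / mu
    beta = np.linalg.solve((X.T * mu) @ X, (X.T * mu) @ z)

ae_glm = float(np.exp(beta[0]))
ae_direct = actual.sum() / expected.sum()
print(f"overall A/E: {ae_glm:.3f} (direct ratio {ae_direct:.3f})")
# Adding year-of-birth dummies to X would reveal any residual cohort shape
# left over after the table's own cohort projections.
```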

## Other studies
The above case study gave a feel for the information that can be obtained from analysing the experience of a pension fund. Our analyses of various annuity portfolios have, to date, shown some interesting features in common with that case study but of course every portfolio retains its own peculiarities.
One annuity portfolio, for instance, showed that annuitants with escalating policies exhibited mortality some 30% higher than those with fixed annuities; another showed a marked anti-select effect carrying on up to five years from inception; another showed year of birth to be a more significant factor than age. Every ‘surprise’ needs to be considered carefully, generally with follow-up investigations, especially if the factor is being considered as a possible rating factor for new business rather than in the more stable context of assumptions for financial reporting.

## Other applications
The data requirement noted above has tended to give people the idea that term assurance portfolios are out of bounds to GLMs. However, given the extended period of investigation made feasible by the use of calendar year as an explanatory factor, and given the opportunity to categorise some factors more broadly than would be ideal (for instance, using five-year groupings of age), companies may be surprised to find how much information can be derived from their data.
GLMs can also be used to analyse fields quite different from mortality: any probabilistic event can be studied, given the relevant data. One particularly interesting use of GLMs is to analyse the lapse and surrender rates of different product classes. This is critical information in pricing general insurance personal lines business, and there is no reason why robust persistency investigations should be considered the domain of the non-life sector.
