**Yiannis Parizas discusses an alternative approach to modelling excess levels in regressions**

Excess level, or deductible, is the portion of the loss retained by the insured on each and every claim. It is used by insurance companies to reduce the cost of insurance and make prices more competitive. When modelling general insurance claims, the traditional regression approach is a frequency-severity model on ceded claims, for one or more claim types together. In such a model, excess levels are commonly incorporated as a factor to both frequency and severity. As such, the estimated frequency or severity will reduce with the increase in the excess level, as per the models' relativities.

This article describes an alternative approach to modelling excess levels. The proposed approach involves modelling the ground-up frequency or severity of claims and calculating the expected ceded claim. The results are compared to those obtained by following the current market practice.

**The effect of an excess level on claim severity**

Let's assume severity originates from a bell-shaped, positively skewed distribution. *Figure 1* represents sampled claims from a lognormal distribution. On applying an excess level of £200, the first two bars of the histogram are removed from our data. This sort of data is typical for an insurance company, as it receives no information about losses below the excess level. Ceded claims therefore form a left-truncated dataset.

When a lognormal is fitted to the ceded claims, the fitted distribution will follow the yellow line. This is exactly what happens when we set the excess level as a factor in our regression models. The fitted ceded claims distribution underestimates the smaller claims and overestimates the larger ones, and so overestimates expected ceded claim amounts. The higher the excess level, the greater the overestimation of severity - so the current market approach overprices more at higher excess levels.

The ideal distribution would be the original, but with zero values below the excess level. To achieve this, we need to fit the ground-up claim severity distribution and derive an analytical solution for the expected value of ceded claims.
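For a lognormal severity, the expected ceded claim has a standard closed form. The sketch below is illustrative rather than the author's own implementation. It assumes the insurer pays the loss minus the excess whenever the loss exceeds the excess, and the function names and example parameters are hypothetical.

```python
from math import exp, log
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def prob_claim_exceeds(mu, sigma, excess):
    """P(X > excess) for a ground-up loss X ~ lognormal(mu, sigma)."""
    return _PHI((mu - log(excess)) / sigma)


def expected_ceded_claim(mu, sigma, excess):
    """E[max(X - excess, 0)] per ground-up claim (assumes insurer pays X - excess)."""
    # Expected loss restricted to the region above the excess
    loss_above = exp(mu + 0.5 * sigma**2) * _PHI((mu + sigma**2 - log(excess)) / sigma)
    return loss_above - excess * prob_claim_exceeds(mu, sigma, excess)


def expected_ceded_per_reported_claim(mu, sigma, excess):
    """E[X - excess | X > excess]: average payment on claims the insurer sees."""
    return expected_ceded_claim(mu, sigma, excess) / prob_claim_exceeds(mu, sigma, excess)


# Illustrative parameters only
for excess in (200.0, 500.0, 1000.0):
    print(excess, round(expected_ceded_claim(7.0, 1.65, excess), 2))
```

Because the excess enters only as an argument, the same fitted parameters can price any excess level without refitting.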

With a left-truncated dataset, the algorithm fits the distribution only to the data above the excess level. However, leaving the distribution 'free' below the excess level allows for a better fit above it. One-way right truncated or censored models can be used in case pricing, where the analyst wants to slice the distribution, eg lognormal-Pareto, to better fit the attritional claims part.
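A minimal sketch of such a fit follows, assuming a single excess level, no rating factors, and claims recorded at their ground-up size (payment plus excess). It maximises the left-truncated lognormal likelihood with a simple compass search, chosen only to keep the example dependency-free; a production model would use a proper optimiser or a truncated regression routine. All names are hypothetical.

```python
import random
from math import inf, log
from statistics import NormalDist, fmean, pstdev


def truncated_loglik(mu, sigma, n, sum_y, sum_y2, log_excess):
    """Log-likelihood (up to a constant) of log-claims seen only above log_excess."""
    if sigma <= 0:
        return -inf
    survival = 1.0 - NormalDist(mu, sigma).cdf(log_excess)
    if survival <= 0.0:
        return -inf
    squared_dev = sum_y2 - 2.0 * mu * sum_y + n * mu * mu
    # Normal density terms, renormalised by the probability of being observed
    return -n * log(sigma) - squared_dev / (2.0 * sigma**2) - n * log(survival)


def fit_truncated_lognormal(claims, excess, tol=1e-5):
    """Estimate ground-up (mu, sigma) from claims observed above one excess level."""
    y = [log(c) for c in claims]
    n, sum_y, sum_y2 = len(y), sum(y), sum(v * v for v in y)
    log_excess = log(excess)
    mu, sigma = fmean(y), pstdev(y)  # start from the naive (untruncated) fit
    best = truncated_loglik(mu, sigma, n, sum_y, sum_y2, log_excess)
    step = 0.5
    while step > tol:
        improved = False
        for d_mu, d_sigma in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand = truncated_loglik(mu + d_mu, sigma + d_sigma, n, sum_y, sum_y2, log_excess)
            if cand > best:
                mu, sigma, best = mu + d_mu, sigma + d_sigma, cand
                improved = True
        if not improved:
            step /= 2.0  # no direction helps: refine the search
    return mu, sigma


# Illustrative check on simulated data (parameters are arbitrary)
rng = random.Random(1)
ground_up = [rng.lognormvariate(6.0, 1.0) for _ in range(20000)]
observed = [x for x in ground_up if x > 200.0]
print(fit_truncated_lognormal(observed, 200.0))
```

The naive fit to the observed claims is used only as a starting point; the survival term in the likelihood is what pulls the estimate back towards the ground-up distribution.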

**The effect of an excess level on claim frequency**

The next step is to adjust the frequency data and then fit the frequency distribution. *Figure 2* demonstrates how the claims frequency is affected by an excess level. We sampled ground-up counts from a Poisson model with four claims on average. For the ceded frequency we sampled from the ground-up frequency and a binomial model, assuming the probability of a claim being ceded is 90%. The ground-up claims frequency is higher and more positively skewed.
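The sampling described above can be sketched as follows. The Poisson mean of four and the 90% ceding probability come from the text; the sample size, seed and function names are illustrative assumptions.

```python
import random
from math import exp
from statistics import fmean


def sample_poisson(rng, mean):
    """Knuth's multiplication method; adequate for small means such as 4."""
    threshold, count, product = exp(-mean), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ceded_count(rng, ground_up_count, prob_ceded):
    """Binomial thinning: each ground-up claim is ceded with probability prob_ceded."""
    return sum(rng.random() < prob_ceded for _ in range(ground_up_count))


rng = random.Random(2024)
ground_up_counts = [sample_poisson(rng, 4.0) for _ in range(20000)]
ceded_counts = [sample_ceded_count(rng, n, 0.9) for n in ground_up_counts]
print(fmean(ground_up_counts), fmean(ceded_counts))
```

A binomially thinned Poisson count is itself Poisson, here with mean 4 x 0.9 = 3.6, which is why the ceded histogram sits to the left of the ground-up one.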

To put frequency on the same basis as our ground-up severity model, we need to model the ground-up claim counts, including the claims we do not know about. The truncated regression modelling method doesn't allow for this because claims are excluded from the counts at random, depending on their severity. To account for this correctly, we use the ground-up severity model to estimate the proportion of claims expected to fall below each policy's excess level.

There are two tested ways to account for the missing claims - increasing the claims count, or reducing the exposure by the proportion of missing claims. My preference is to reduce exposure and this is the method used in the next section.
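As a hedged illustration of the exposure approach, the sketch below scales each policy's exposure by the probability, under a lognormal ground-up severity, that a claim exceeds that policy's excess. It assumes one set of severity parameters for all policies and a simple tuple layout for the policy data; both are simplifications for the example, and the names are hypothetical.

```python
from math import log
from statistics import NormalDist


def prob_above_excess(mu, sigma, excess):
    """P(X > excess) for a ground-up loss X ~ lognormal(mu, sigma)."""
    return 1.0 - NormalDist(mu, sigma).cdf(log(excess))


def adjusted_exposure(exposure_years, excess, mu, sigma):
    """Exposure reduced by the expected proportion of claims below the excess."""
    return exposure_years * prob_above_excess(mu, sigma, excess)


def ground_up_frequency(policies, mu, sigma):
    """policies: iterable of (exposure_years, excess, observed_claim_count)."""
    total_exposure = sum(adjusted_exposure(e, d, mu, sigma) for e, d, _ in policies)
    total_claims = sum(count for _, _, count in policies)
    return total_claims / total_exposure


# Illustrative data: two one-year policies with different excess levels
example = [(1.0, 200.0, 1), (1.0, 1000.0, 0)]
print(ground_up_frequency(example, 7.0, 1.65))
```

In a frequency regression, the logarithm of the adjusted exposure would typically enter as the offset; where the severity model has rating factors, mu would vary by policy.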

**Testing the new methodology**

To assess the new methodology, we generated data for 10,000 yearly policies. For each policy, we generated three factors - A, B and C (yes or no) - and an excess level between £200 and £1,000, in steps of £100. We then assumed a 5% claim rate, increasing by 25% for each factor. For severity we generated claims from a lognormal distribution with an average of £4,280, increasing by 28% with each factor, and a fixed volatility of £16,130. The modelling approach for both the original and the new method follows that described above. The process outcomes were averaged over 200 iterations to stabilise results.
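One iteration of such a data-generation step might look like the sketch below. Several details are not stated in the text and are assumptions here: each factor is 'yes' with probability 50%, excess levels are equally likely, claim counts are Poisson, factor effects are multiplicative, and 'fixed volatility' is read as a fixed log-scale sigma derived from the base mean and standard deviation. Claims are stored at their ground-up size.

```python
import random
from math import exp, log, sqrt


def lognormal_params(mean, sd):
    """Convert a lognormal mean and standard deviation to (mu, sigma)."""
    sigma2 = log(1.0 + (sd / mean) ** 2)
    return log(mean) - 0.5 * sigma2, sqrt(sigma2)


def sample_poisson(rng, mean):
    """Knuth's multiplication method; adequate for small claim rates."""
    threshold, count, product = exp(-mean), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_portfolio(rng, n_policies=10_000):
    base_mu, sigma = lognormal_params(4280.0, 16130.0)
    excess_levels = list(range(200, 1001, 100))
    portfolio = []
    for _ in range(n_policies):
        factors = [rng.random() < 0.5 for _ in range(3)]  # A, B, C (assumed 50/50)
        k = sum(factors)
        excess = rng.choice(excess_levels)
        claim_rate = 0.05 * 1.25**k    # +25% frequency per factor
        mu = base_mu + k * log(1.28)   # +28% mean severity per factor
        n_claims = sample_poisson(rng, claim_rate)
        sizes = [rng.lognormvariate(mu, sigma) for _ in range(n_claims)]
        # The insurer only sees claims above the policy's excess
        reported = [x for x in sizes if x > excess]
        portfolio.append({"factors": factors, "excess": excess, "reported_claims": reported})
    return portfolio


portfolio = simulate_portfolio(random.Random(7))
print(sum(len(p["reported_claims"]) for p in portfolio))
```

Repeating this with different seeds and averaging the fitted results would mirror the 200 iterations described above.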

*Figure 3* compares the expected cost from the new and the old methodology to the actual numbers. The original method overstates the policy cost, as we predicted above. We can also note that the overestimation increases at higher levels of excess.

While the new methodology is a better fit in this scenario, the reader should take into account that the data is artificial and assumes perfect distributions. In reality, claims data is more random, and other factors that were ignored, such as claims inflation, incurred but not reported claims, and outliers, would distort the picture.

A further consideration for personal lines is that excess levels have a genuine behavioural effect: the policyholder may not claim when the loss is only slightly above the excess level. In this case, we can fit a truncated severity distribution and also have excess level as a factor (*Figure 4*).

**Benefits of the new method**

One of the main reasons to use the new method is to reduce model error, which allows better pricing, attracts better risks and increases profitability. The analytical solution for the excess level allows the client to choose any excess level. For example, in personal lines, this could let the client choose any excess level between two limits, with the price being calculated immediately.

In commercial lines or reinsurance, this will provide faster negotiation when using the deductible or limits to optimise the account. The flexibility of modelling any excess level also allows continuous inflation adjustment of the historical excess level, along with claims inflation. Finally, the model needs no parameter for the excess level, as it is now incorporated in the severity model. The reduction in parameters will therefore make the model simpler and more accurate.

*Yiannis Parizas* is a pricing contractor at NetSim Analytics