The Actuary The magazine of the Institute & Faculty of Actuaries

Smoothing things over

Steve Mills finds out what Solvency II models can gain from looking at sports player ratings 



I was looking at the FIFA rankings back in August 2018, as you do. It was a list of all the world’s international football teams in order of strength. Compared to the previous ranking, two months earlier, there had been some big movements. Croatia had jumped from 20th to 4th; Germany had fallen from 1st to 15th. Why were there such big movements?

Well, France was at the top of the list at the time because its overall rating of 1,726 was higher than that of all the other countries. Every time France plays in an international, it is given a rating for that match. This rating depends on the result, the rating of the opposition and (possibly) the importance of the match. The determination of this individual match rating isn’t important here; what is important is that France’s overall rating of 1,726 was the average of the ratings that France scored for each match during the previous four years.

That’s why weird things happen in the FIFA rankings. Germany’s rating fell sharply in just two months not just because they had a poor 2018 World Cup, but also because they had had a great 2014 World Cup, putting seven past Brazil in the semi-final and ending up as the winner. All of those results in the 2014 World Cup dropped out of the calculation in 2018, as they were now more than four years old. As far as the FIFA rankings are concerned, everything during the past four years is 100% relevant, whereas everything before it is 100% irrelevant.


The ICC cricket player rankings are much better behaved. They too are produced by ordering players by overall rating, with each overall rating built from batting or bowling ratings calculated on an innings-by-innings basis. The overall ratings vary more smoothly over time, so players tend to drift slowly up or down the rankings rather than making big jumps. How does the rating system achieve this?

Whereas FIFA gives equal weightings to all match ratings within a four-year window and no weightings outside it, the ICC ratings are weighted averages. A player’s rating after an innings is set equal to x times his rating for that innings plus (1-x) times his rating before that innings for some fixed x between 0 and 1. The result is that his rating after the innings is the weighted average of all his innings ratings during his entire career, with the size of the weightings decaying exponentially as the data gets older.
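The equivalence between the innings-by-innings update and an exponentially weighted average can be checked with a minimal sketch. The function names, the starting rating `initial` and the sample numbers in the usage note are all made up for illustration; they are not taken from the ICC's actual system.

```python
def update_rating(previous, innings_rating, x):
    """One step of the update: x times the new innings rating
    plus (1 - x) times the rating held before the innings."""
    return x * innings_rating + (1 - x) * previous


def rating_after(innings_ratings, x, initial=0.0):
    """Apply the update to each innings in turn, oldest first."""
    rating = initial
    for r in innings_ratings:
        rating = update_rating(rating, r, x)
    return rating


def explicit_weighted_sum(innings_ratings, x, initial=0.0):
    """The same rating written as a weighted sum over the whole career.

    An innings played `age` innings ago (age 0 = most recent) carries
    weight x * (1 - x) ** age; the starting rating carries whatever
    weight is left over, (1 - x) ** n.
    """
    n = len(innings_ratings)
    total = (1 - x) ** n * initial
    for age, r in enumerate(reversed(innings_ratings)):
        total += x * (1 - x) ** age * r
    return total
```

Calling `rating_after([40, 55, 10, 80], x=0.1, initial=30)` and `explicit_weighted_sum` with the same arguments should give the same number up to floating-point rounding, which is the sense in which the simple update rule is a career-long weighted average.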

This not only smooths out overall ICC ratings (and therefore positions in the ranking order), but also makes them more up to date, placing more weight on the most recent data. If FIFA used this methodology, Germany would have slowly drifted down the ratings during the past five years, with the memories of 2014 gradually fading rather than being wiped out overnight like Snowball’s heroics in Animal Farm.

It’s an interesting approach. And that brings me on to Solvency II internal models.

Internal models

In calibrating internal models, companies will tend to fit probability distributions to historic data. There will be a lot of thought put into how far back the calibration data should go. The more data you use, the more accurate your calibration will be, but that’s only if the characteristics of the risk have not changed over time. There will be strong arguments for not using the oldest data because it’s starting to look out-of-date and irrelevant. It’s a tough call.

Companies have made those calls, though, and calibrated their models using data from a carefully chosen time window. But what happens when they recalibrate? If they calibrated using x years of data and need to recalibrate one year later, will they just add on an extra year’s data? Maybe, but I expect they’ll eventually find that the oldest data is increasingly irrelevant. At some point they’ll start chopping off some of the earliest data, and that’s when they’re in danger of seeing their solvency capital requirement (SCR) move around like Germany’s FIFA rating, as significant historic events drop out of the calibration data.

So, are there alternative calibration methodologies that can take a lead from the ICC, with calibration parameters estimated using exponentially decaying weighted averages of the underlying data? Would firms (and regulators) rather see SCRs that behave like FIFA ratings, or SCRs that behave like ICC ratings?



To see how this might work, I put together a simple experiment. I used some Dow Jones Industrial Average index data going right back to 1900, and fitted a normal distribution to log returns for each of the last 100 years, estimating the mean and standard deviation three different ways:

  • Using all data from 1900 to the calibration date
  • Using a 20-year rolling calibration window (akin to FIFA ratings)
  • Allowing the estimates to evolve smoothly over time by using exponentially decaying weightings, with a weighting of 0.05 × 0.95^n applied to the data from n years ago (akin to the way ICC ratings work)
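The three approaches differ only in the weight each year's return receives, so they can be sketched with one weighted estimator and three weighting schemes. This is an illustration under assumptions, not the calculation behind the article's results: it assumes one log return per year, listed oldest first, rescales the weights to sum to one, and uses a weighted standard deviation with no small-sample correction. All function names are hypothetical.

```python
import math


def weighted_mean_std(values, weights):
    """Weighted mean and standard deviation of a list of returns.

    Weights are rescaled to sum to one; no bias correction is applied
    to the standard deviation (an assumption of this sketch).
    """
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return mean, math.sqrt(var)


def all_data_weights(n_years):
    """Every year since the start of the data counts equally."""
    return [1.0] * n_years


def rolling_window_weights(n_years, window=20):
    """Full weight for the most recent `window` years, none before."""
    return [1.0 if n_years - 1 - i < window else 0.0 for i in range(n_years)]


def exponential_weights(n_years, x=0.05):
    """Weight x * (1 - x) ** n for data from n years ago (n = 0 is latest)."""
    return [x * (1 - x) ** (n_years - 1 - i) for i in range(n_years)]
```

Given a list `log_returns` of annual log returns up to the calibration date, `weighted_mean_std(log_returns, exponential_weights(len(log_returns)))` would give the smoothed estimates, and swapping in either of the other weight functions would give the all-data or rolling-window estimates. Repeating this for each calibration date traces out how the estimates move over time.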


The results are presented in a graph.  

Some features to notice are:

  • The all-time calibration in green produces the highest capital requirements today, as a result of continuing to use data from the first 40 years of the 20th century – data that is given little or no weighting in the other two methodologies
  • The size of the 1 in 200 stress plummets in the early 1950s, when the impact of the 1929 Wall Street Crash drops out of the rolling calibration window (blue line) – would today’s regulators have been comfortable with this if they were transported back in time? Instead, the brown line shows how the 1 in 200 stress would reduce smoothly using the ICC methodology. 
  • The oil crisis in the 1970s and financial crisis in 2008 would have caused bigger increases in capital requirements for firms using rolling data windows than for those that adopted a smoothed approach. Would firms be comfortable with such potential volatility within their capital requirements?
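The article does not spell out how a 1 in 200 stress is read off the fitted distribution. One plausible reading, sketched here as an assumption, is to take the 0.5th percentile of the fitted normal distribution of annual log returns and convert it back into a proportional fall in the index. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def one_in_200_stress(mu, sigma):
    """Proportional change in the index at the 0.5th percentile of a
    normal distribution of annual log returns with mean mu and
    standard deviation sigma. A negative result is a fall."""
    z = NormalDist().inv_cdf(0.005)  # standard normal 0.5th percentile, about -2.576
    log_return = mu + z * sigma
    return math.exp(log_return) - 1.0
```

Under this reading, a higher estimated standard deviation feeds directly into a more severe stress, which is why a crash entering or leaving the calibration data can move the capital requirement so sharply.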


Does an approach similar to that used by the ICC sound like one that would appeal to both firms and regulators?

Steve Mills is the owner and director of SSC Actuaries