Paul Moorshead and Aron Bor demonstrate that applying better actuarial models to injury risks in top-level professional football could offer a form of predictive magic sponge.

With Premier League teams in England generating combined revenues of more than £2 billion (more than any other league in Europe currently), football is a big business. But with players' wages accounting for around 70% of that figure, losing players to injury - particularly star players - is effectively money down the drain. However, it is not just the teams that are affected. As salaries have spiralled, players' downtime is becoming ever more costly for insurers that underwrite sports personal accident cover.
Research released by Towers Watson in June 2013, based on seven years of appearance and injury data (physioroom.com and premiersoccerstats.com), estimated that the cost of lost salaries from severe injuries in the 2012/13 season was approximately £95 million and that it looks set to exceed £100 million in the coming season. Painful reading for the clubs involved and insurers, not to mention the stricken players. Could the teams or insurers have done anything, or do anything now, to avoid the situation?
While analyses that quantify losses after the fact potentially offer useful benchmarks for insurers, they offer few crumbs of comfort to the affected teams and little insight into what may happen in the future. A more valuable measure is to identify players or groups of players who are more likely to be susceptible to severe injuries - defined in the study as being unavailable for selection for at least 30 days. Such information should be useful not only to insurers, who typically only provide cover for injuries to players beyond 30 days, but potentially to the teams themselves.
Injury trends
To understand how this might work, we need to look in more detail at the generalised linear model produced for the analysis. This shows some distinct trends in injuries. For example, teams delving into the summer transfer market may need to be wary of adding aging squad players, particularly those with a recent history of longer-term injuries. The model shows that players in the 29 to 31 age bracket who sit on the bench for 60% or more of matches carry a very high injury risk. Another trend is a strong increase in the likelihood of a player sustaining a bad injury in the season after participating in a major international tournament.
To arrive at these conclusions, Towers Watson studied injury records and game data from nine seasons of Premier League competition - covering some 380 games per season, each featuring squads of 16 to 18 per team, and taking account of player transfers and loan periods. This involved considerable manual cross-checking of other data sources.
The aim was to build a model to predict the number and length of severe injuries suffered by a player in a season based on the information available at the start of a season.
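As a rough illustration of the frequency and severity structure that such a model implies, the sketch below multiplies a base injury frequency by rating-factor relativities (the form a log-link generalised linear model produces) and then by an expected injury length. Every number and factor name here is a made-up placeholder chosen for illustration, not a fitted value from the study.

```python
from math import prod

# Hypothetical base values for illustration only; not figures from the study.
BASE_FREQUENCY = 0.40       # expected severe injuries per player-season
BASE_SEVERITY_DAYS = 60.0   # expected days out per severe injury


def expected_days_out(freq_relativities, sev_relativities):
    """Expected days lost to severe injury in a season under a multiplicative model."""
    frequency = BASE_FREQUENCY * prod(freq_relativities.values())
    severity = BASE_SEVERITY_DAYS * prod(sev_relativities.values())
    return frequency * severity


# Hypothetical relativities for one player (1.0 would mean "no effect").
player_freq = {"recent_severe_injuries": 1.50, "age_band_29_31": 1.20}
player_sev = {"star_player": 0.90}

print(round(expected_days_out(player_freq, player_sev), 1))
```

Keeping frequency and severity as separate components lets a factor push the two in opposite directions, which a single combined cost model would hide.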

Recognising that teams' fortunes can sometimes hinge on the availability of highly paid star players, the modellers introduced a rating factor to reflect this. Using a 'wisdom of crowds' approach to eliminate any model bias, this identified the top 30% of Premier League players over the study period to be given a star player rating (*1) in the model. In reality, a sports personal accident underwriter may find it easier to substitute the top five earning players in the squad as 'stars', rather than canvassing his football-loving colleagues as we did.
The next question was how to balance the number of seasons of historical playing records used in setting the rating factors (*2) against the number of seasons of exposure and injuries. Our approach was to fit nine separate models ranging from one to nine seasons of exposure and injuries (and therefore eight to zero seasons of historical playing records for each player). Each of the nine separate models was independently simplified and optimised to be the best fit to the available data for that model. This involved testing trends for consistency over multiple sections of the account, removing factors that failed to improve the model fit sufficiently to compensate for the additional complexity they brought, and testing that the factors remaining in the model allowed for the impact of the excluded factors. Due to the size of the dataset involved, much of this assessment was subjective, as the results of traditional model tests were unreliable. Each model was fitted to only part of the available data, which enabled the final fit to be tested against the hold-out sample.
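The trade-off between the two windows can be sketched as a simple enumeration. This is a minimal sketch that assumes the history and exposure windows together always span the nine seasons of data, and that the history window covers the earliest seasons; the function name and layout are illustrative, not the modellers' own.

```python
TOTAL_SEASONS = 9  # seasons of Premier League data in the study


def candidate_splits(total=TOTAL_SEASONS):
    """Yield (history_seasons, exposure_seasons) for each candidate model.

    Assumes the two windows are adjacent and together use every season.
    """
    for exposure in range(1, total + 1):
        yield total - exposure, exposure


for history, exposure in candidate_splits():
    print(f"history: first {history} season(s); exposure and injuries: last {exposure} season(s)")
```

More exposure seasons give the model more injuries to learn from, while more history seasons give each player a richer record to rate on, so the two compete for the same fixed pool of data.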
The problem then remained of selecting which of the nine candidate models was most appropriate. The solution chosen was to assess each candidate model against each of the nine hold-out samples. This quickly led to the exclusion of many candidate models, in particular those that maximised either the seasons of exposure and injuries or the seasons of historical playing records. Figure 2 shows the predictiveness of the nine models against a hold-out sample, with the seven-season model (with two seasons of historical playing records) coming out on top. However, if we had more than nine seasons of data to play with, then it might be that more than two seasons of historical playing records are optimal. It is clear from the chart below that the model with no historical playing record data is least predictive.
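The selection step can be sketched as scoring every candidate model on every hold-out sample and keeping the one with the best average score. The article does not say how predictiveness was measured, so the mean Poisson deviance used here is an assumption, and the model names and data are toy values invented for the example.

```python
from math import log


def poisson_deviance(actual, predicted):
    """Mean Poisson deviance between observed counts and predicted means (all > 0)."""
    total = 0.0
    for y, mu in zip(actual, predicted):
        term = y * log(y / mu) if y > 0 else 0.0
        total += 2.0 * (term - (y - mu))
    return total / len(actual)


def best_model(predictions, holdouts):
    """Return the model with the lowest average deviance across all hold-out samples.

    predictions[model][sample] and holdouts[sample] are equal-length lists.
    """
    def average_score(model):
        scores = [poisson_deviance(holdouts[s], predictions[model][s]) for s in holdouts]
        return sum(scores) / len(scores)

    return min(predictions, key=average_score)


# Toy injury counts per player and toy predictions; not data from the study.
holdouts = {"sample_a": [0, 1, 2, 0], "sample_b": [1, 0, 0, 3]}
predictions = {
    "flat": {"sample_a": [0.75] * 4, "sample_b": [1.0] * 4},
    "with_history": {
        "sample_a": [0.2, 0.9, 1.6, 0.3],
        "sample_b": [0.8, 0.3, 0.2, 2.7],
    },
}

print(best_model(predictions, holdouts))
```

Scoring each model on samples it was not fitted to guards against rewarding a model simply for memorising its own training data.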

Key findings
On further examination, what the model shows is that success comes with a price, particularly for those teams who earn the right to play in European club competitions. A player who plays for a team that finished in the top seven places in the previous season is more likely to suffer an injury than a player from a lower team. This is exacerbated for teams finishing as champions and runners-up. What is more, players in the more successful clubs take longer to heal, with those from the top five being out of action for longer than the league average.
Furthermore, among players who take part in major international tournaments, such as the recent Confederations Cup, there is a 30% increase in frequency of severe injuries in the following season. But such players tend to recover more quickly than average, hence mitigating the impact somewhat.
Managers at the top clubs who rant about the misfortune of injuries at least have the consolation of challenging for silverware. But a number of the injury trends identified are indiscriminate and are equally or more likely to afflict less successful teams.
Once a player has had a number of severe injuries within the past two seasons, they are much more susceptible to another in the subsequent season. The study shows that this is the principal factor in determining the likelihood of future injuries across the league.
Interestingly, star players suffer slightly fewer injuries than average and recover more quickly than average, possibly as clubs' resources are focussed on getting them back on the pitch.
The importance of resting players is also evident. Unsurprisingly, playing in more games and longer in those games leads to more injuries - particularly where players are consistently on the pitch for the last 20 minutes. This can add hugely to a player's propensity to suffer injury. This statistic underlines how teams in the middle or lower reaches of the league, which have to operate with smaller squads, can be particularly adversely affected. Also, many cannot afford to take the risk of not having their best players on the pitch.
That is not to say that big squads are necessarily the answer - at least from an injury risk perspective. Bench-warming squad players suffer many more injuries than regular starters, particularly if they are relatively experienced. Not playing at all in 60% or more of games when in the squad leads to a big increase in injuries.
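A rating factor of this kind could be derived from appearance data along the following lines. This is a minimal sketch: the 60% threshold comes from the finding above, while the function names and the treatment of players with no squad appearances are assumptions made for the example.

```python
BENCH_THRESHOLD = 0.60  # unused in 60% or more of games when named in the squad


def unused_share(games_in_squad, games_played):
    """Share of squad appearances in which the player did not get on the pitch."""
    if games_in_squad == 0:
        return 0.0  # assumption: no squad appearances means no bench exposure
    return (games_in_squad - games_played) / games_in_squad


def is_bench_warmer(games_in_squad, games_played):
    """Flag players who meet the bench-warming threshold."""
    return unused_share(games_in_squad, games_played) >= BENCH_THRESHOLD


print(is_bench_warmer(30, 10))  # unused in 20 of 30 squad appearances
print(is_bench_warmer(30, 25))  # unused in 5 of 30 squad appearances
```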
Our analysis led to the finding that certain clubs have materially better or worse experience than others even after allowing for all the factors considered. Given the amount of data considered, we would not have expected random variation in the data to have caused this and so it leads to the conclusion that certain clubs are better at treating and preventing injuries. This could be due to different training routines, training surfaces, style of play or optimal squad rotations. We will seek to investigate this further in any future analysis.
Scientific approach
Not so many years ago it would have been heresy to suggest that science and statistics could bring anything to the beautiful game. But now they greatly inform approaches to training and nutrition, and statistics are increasingly used to measure player performance. Such developments have become accepted facets of today's football scene, where top players earn more in a week than the average person earns in five to 10 years. Yet injury can, in the blink of an eye, still rob clubs of their prized assets for weeks and months on end, leaving them and their insurers to pick up the tab.
There is nothing that actuaries, or anyone else for that matter, can do about the inevitability of injuries. But, just as has occurred in other areas of commercial insurance, data and models can help us understand what contributes to them and the underlying risks.
Paul Moorshead has more than 18 years of experience as an actuarial consultant and is a Fellow of both the Chartered Insurance Institute and the Institute and Faculty of Actuaries. He has expertise in successfully enhancing the underwriting of commercial lines of business, such as sports personal accident, through applying sophisticated pricing techniques. Paul has also been quoted in, and has prepared articles for, multiple national newspapers, magazines and insurance journals on the topic of motor insurance.
Aron Bor is a junior associate at Towers Watson and is currently studying towards becoming an associate of the Institute and Faculty of Actuaries after graduating with a BSc in Mathematics from Imperial College London last year. He is a supporter of Manchester United Football Club, and one of the few who actually grew up in Manchester.