The Actuary The magazine of the Institute & Faculty of Actuaries

Modelling: Relieving the pressure

I was meeting our marketing manager the other day to discuss public relations for our latest innovation. This is an extract of how the conversation went.

“Relieving the pressure from tight-fitting models — excellent racy title,” he said. “Tell me, where are these tight-fitting models and how can I help relieve their pressure?”

“Good question,” I said. “We seem to come across them more and more these days during our pricing work.”

“Really,” he said, smelling a scandal. “This could be one for the Daily Mail.”

“Not quite,” I countered. “We do help to build them up, but it’s all above board, honestly. They are quite beautiful, I should show you.”

“You help to build them! How about the Daily Star?” he said. “This doesn’t have anything to do with one of those gender things you guys have been stressing about, does it?”

“Gender is one of the factors and there will be a ban on that soon,” I agreed. “But it’s more the shapes of the models I’m interested in.”

“Curves?” he asked.

“Indeed, it could be smooth splines, pretty parabolas or plain old step functions,” I replied.

“Well, I don’t know many models with those sorts of characteristics,” he retorted.

“Think technical — maybe cars,” I added — knowing his penchant for sleek metallic lines.

“Now we’re talking,” he said. “Maybe The Sunday Times’ motoring section?”

“Well, they could be car or household, or any kind of behaviour model actually. What is important is that we can use them to spot the patterns, while ignoring the noise.”

“But who wants to read about that?” he quizzed.

“Our customers for one,” I said. “The statisticians and actuarial types who play with models, too. They’ll love it.”

“Maybe we can tell The Actuary then,” he said. “What’s your angle?”

Figure 1

“Simple,” I said, knowing that it was anything but. “If you use some data to make a generalised linear model (GLM), then as you build it up, it captures more and more patterns.

“But if you go too far, the model ends up replicating the random noise in the data, too.

“The crux of the problem is that you don’t want that noise, because you’re not trying to produce a model that fits slavishly to the data. Instead, you want to predict the next set of results.”

Figure 2

“Ah, so you can use it to win the lottery?” he asked.

“Sadly not,” I said. “Because that model is all noise and no pattern. What we do, though, is test the model against a hold-out sample that we secretly kept back, but which we already know the results for.”

“Isn’t that cheating?”

“Well, no, just a good test to prove the value of what you’ve created.”
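For readers who like to see the idea in numbers, here is a minimal sketch of testing against a hold-out sample. It is not the model discussed in the conversation: it assumes made-up data and uses ordinary least-squares polynomial fits (the simplest case of a GLM, with normal errors and an identity link) of increasing degree. The names `fit_poly` and `mse` are hypothetical helpers invented for this illustration.

```python
import numpy as np

def fit_poly(x, y, degree):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    X = np.polynomial.polynomial.polyvander(x, degree)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def mse(x, y, coef):
    """Mean squared error of the fitted polynomial on the points (x, y)."""
    pred = np.polynomial.polynomial.polyval(x, coef)
    return float(np.mean((y - pred) ** 2))

# Illustrative data only: a quadratic pattern plus random noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=60)
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(scale=0.5, size=x.size)

# Keep back a third of the data as the hold-out sample.
x_train, y_train = x[:40], y[:40]
x_hold, y_hold = x[40:], y[40:]

results = {}
for degree in (1, 2, 5, 9):
    coef = fit_poly(x_train, y_train, degree)
    # (error on the data used to build the model, error on the hold-out)
    results[degree] = (mse(x_train, y_train, coef), mse(x_hold, y_hold, coef))
    print(degree, results[degree])
```

The error on the training data can only fall as the degree rises, because each model contains the previous one. The hold-out error is the honest measure: it typically stops improving, and often worsens, once the extra terms start fitting noise.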

Figure 3

“They’re going to want to know what the value is,” he said, warming to the theme.

“Okay,” I said. “But first let me tell you why this is a big deal.

“You see, we have discovered a way of using the data twice. Once to build the model, and then also as part of the hold-out sample, to scale back the parameters in such a way that the noise is reduced.”

“But how can it be in the model and be in the hold-out?” he countered, perhaps sensing a ‘BBC Panorama’-style sting.

“Statisticians call that cross-validation,” I said. “You hold out a data point and fit the model, then you repeat for each one in turn. Except that we have found a practical way to approximate this without refitting each time.”
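The conversation does not say how the approximation works, so the following is only a sketch of a well-known special case, not the method described here. For ordinary least squares, the leave-one-out residual for each point can be recovered exactly from a single fit by dividing the ordinary residual by one minus that point's leverage. For a general GLM a shortcut of this kind is usually only approximate. The function names and the simulated data are assumptions made for illustration.

```python
import numpy as np

def loo_residuals_refit(X, y):
    """Leave-one-out residuals the slow way: refit once per held-out point."""
    n = len(y)
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        out[i] = y[i] - X[i] @ beta
    return out

def loo_residuals_shortcut(X, y):
    """Leave-one-out residuals from a single fit, using the leverages."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Leverage h_ii is the diagonal of the hat matrix X (X'X)^-1 X'.
    # With X = QR (reduced QR), it equals the squared row norms of Q.
    Q, _ = np.linalg.qr(X)
    leverage = np.sum(Q**2, axis=1)
    return resid / (1.0 - leverage)

# Illustrative data only: an intercept plus three rating factors.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 3))])
y = X @ np.array([1.0, 0.5, -0.3, 0.0]) + rng.normal(scale=0.4, size=50)

slow = loo_residuals_refit(X, y)
fast = loo_residuals_shortcut(X, y)
print(np.allclose(slow, fast))
```

The shortcut needs one fit instead of one fit per data point, which is what makes cross-validation practical on large pricing datasets.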

“Is that the cool bit?” he asked.

“Kind of,” I said. “Taking something from theory to real-world application is pretty cool, but the ‘eureka’ moment was using that to solve the second bit to get the scale-back ratios.”

“Was that a GLM too?”

“No, actually it turned out to be non-linear — a GNLM of sorts, because all the parameters are correlated and you need to compensate for those non-linear effects,” I replied.

Figure 4

“Right, but the value?” he probed.

“We are at the early stages of that, having only done a few tests so far. But we got about half of one per cent on frequency, another half per cent on severity and a further 3% on the bodily injury bit, making about 2% overall.”

“2%? Is that good?” he asked.

“Well, it’s not bad when you consider the overall size of the market and what it could mean for profitability,” I proffered.

“Great. The Actuary it is then. But, tell me something, if my model is a bit noisy, but has attractive parameters — ‘blonde’ and ‘six foot’, what would you scale them back to?”

Statisticians and actuarial types of unbiased gender can find out more from the March sessional paper presented to the Institute and Faculty of Actuaries, or by coming along to a workshop on the topic of tight-fitting models at this year’s GIRO conference.

Tony Lovick

Tony Lovick is a pricing actuary at Towers Watson