The magazine of the Institute & Faculty of Actuaries

# Darwin to the rescue!

Modelling a life insurance contract using the method of Monte Carlo simulation is now familiar territory to many actuaries. The advantage of this approach is that it allows us to value complex investments or liabilities that cannot usefully be assessed using deterministic models. The disadvantage is that it involves running complex models thousands of times (5,000 scenarios is typical), which can tie up banks of computers for days at a time. As well as being expensive, there are times (M&A exercises spring to mind) when we can't wait that long for the results. It is therefore advantageous to reduce the number of scenarios to be simulated if this can be done without compromising the outputs. A number of 'variance reduction' techniques exist to do just this.
One such technique is to carefully choose a relatively small number of scenarios (eg 1,000) from a much larger set of randomly generated scenarios, ensuring that the subset has the same statistical properties as the larger set, particularly the arbitrage-free property. But how to find such a subset? We can quickly work out whether a particular subset has the desired characteristics (this takes around a tenth of a second). The problem is that we may have to try many thousands of subsets before we find one that meets our criteria.

## An optimisation problem
For this problem, there are nCr possible (or 'trial') solutions, where n is the size of the larger set and r is the size of the subset. In the example below, where n=2,000 and r=1,000, there are approximately 10^600 possible trial solutions (using Stirling's approximation). Clearly, even at 0.1 seconds per evaluation, it is not practical to systematically search through all of these. In addition, for any single trial solution (ie a subset of 1,000 scenarios), its immediate neighbourhood consists of the 1,000,000 subsets reachable by changing a single scenario. Hence, optimisation methods which rely on evaluating the immediate neighbourhood, such as gradient search methods or simulated annealing, are likely to be slow. We have therefore developed an approach using genetic algorithms (GAs), which are generally better at searching large and complex search spaces.
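The size of this search space is easy to confirm: Python's arbitrary-precision integers let us compute 2,000-choose-1,000 exactly and count its digits, without needing Stirling's approximation at all. (This check is ours, not from the article.)

```python
from math import comb

n, r = 2000, 1000

# comb(n, r) is the exact binomial coefficient nCr; Python big ints
# handle the ~600-digit result without overflow.
exact = comb(n, r)

# Order of magnitude: number of digits minus one.
magnitude = len(str(exact)) - 1
print(magnitude)  # 600, ie roughly 10^600 trial solutions
```

At 0.1 seconds per evaluation, even a vanishingly small fraction of these subsets could never be enumerated, which is the article's point.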

## Genetic algorithms
The theoretical foundations for GAs were laid by John H Holland in the early 1970s. These optimisation routines were inspired by the process of Darwinian evolution. They use the same mechanisms that nature uses to find ever fitter (better-adapted) creatures:
– reproduction, using the rule of ‘survival of the fittest’;
– recombination of material using a crossover operator; and
– random mutation.
Our 'environment' is the search space for the optimisation problem: this is made up of the nCr possible combinations of r scenarios selected from the set of n. Each creature is represented by a 'chromosome' or string of 'genes', which itself represents a possible solution to the problem, ie a single subset of r scenarios. To each chromosome we ascribe a value of 'fitness': in our case, a measure of how well the statistical properties of the r-size subset conform to those of the larger set.
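A minimal sketch of this encoding, assuming a hypothetical scenario set (the lognormal terminal values and the simple mean/standard-deviation fitness measure below are our illustrative stand-ins, not the article's actual statistical and arbitrage-free tests):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 2000, 1000

# Hypothetical scenario set: one simulated terminal value per scenario.
scenarios = rng.lognormal(mean=0.04, sigma=0.2, size=n)

# A chromosome is an array of r distinct indices into the n scenarios,
# ie one candidate subset.
chromosome = rng.choice(n, size=r, replace=False)

def fitness(chrom: np.ndarray) -> float:
    """Higher is better: penalise mismatch between the subset's mean and
    standard deviation and those of the full set (a toy stand-in for the
    fuller statistical tests the article describes)."""
    sub = scenarios[chrom]
    return -(abs(sub.mean() - scenarios.mean())
             + abs(sub.std() - scenarios.std()))
```

A chromosome drawn at random scores below zero; the full set scores exactly zero, the best achievable under this toy measure.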
The GA starts with a randomly generated 'population' of creatures or chromosomes. It then creates successive generations by selecting 'parents', where the selection is biased so that fitter parents are more likely to be chosen, and recombining the genetic material of those parents using a crossover operator to create 'offspring'. The idea is that groups of genes that are associated with greater fitness will be preserved through the generations. Furthermore, they will be recombined with other groups of genes that are associated with greater fitness, and so (we hope) create new creatures with even greater fitness.
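The full loop can be sketched as follows. This is our illustrative assumption of how such a GA might look, not the article's implementation: we use small toy sizes so it runs quickly, a tournament rule for fitness-biased parent selection, a crossover that keeps the shared scenarios of both parents (so subset size r is preserved), and a mutation that swaps a few chosen scenarios for unchosen ones.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 200, 100            # toy sizes; the article uses n=2,000, r=1,000
pop_size, n_gen = 30, 40

# Hypothetical scenario set and toy fitness measure (illustrative only).
scenarios = rng.lognormal(0.04, 0.2, size=n)

def fitness(chrom):
    sub = scenarios[chrom]
    return -(abs(sub.mean() - scenarios.mean())
             + abs(sub.std() - scenarios.std()))

def crossover(p1, p2):
    """Combine two parents while keeping subset size r: keep the scenarios
    both parents chose, then fill the rest from their differences."""
    shared = np.intersect1d(p1, p2)
    if len(shared) == r:                      # identical parents
        return shared.copy()
    pool = np.setdiff1d(np.union1d(p1, p2), shared)
    fill = rng.choice(pool, size=r - len(shared), replace=False)
    return np.concatenate([shared, fill])

def mutate(chrom, p=0.02):
    """Randomly swap chosen scenarios for unchosen ones."""
    out = chrom.copy()
    outside = np.setdiff1d(np.arange(n), chrom)
    for i in range(len(out)):
        if rng.random() < p:
            j = rng.integers(len(outside))
            out[i], outside[j] = outside[j], out[i]
    return out

# Random initial population of r-sized subsets.
population = [rng.choice(n, size=r, replace=False) for _ in range(pop_size)]

for _ in range(n_gen):
    scores = np.array([fitness(c) for c in population])

    def pick():
        # Tournament selection: of two random candidates, the fitter
        # one parents, biasing selection towards greater fitness.
        a, b = rng.integers(pop_size, size=2)
        return population[a] if scores[a] > scores[b] else population[b]

    population = [mutate(crossover(pick(), pick())) for _ in range(pop_size)]

best = max(population, key=fitness)
```

The crossover here is one of several possible designs for fixed-size subset chromosomes; the key constraint is that offspring remain valid subsets of exactly r distinct scenarios, which this construction guarantees.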