
The Actuary The magazine of the Institute & Faculty of Actuaries

Ask the experts

Raveem Ismail and Scott Reid propose that structured expert judgment can be used to significantly reduce uncertainty in risk appraisal when considering areas such as political violence


Structured expert judgment is an auditable and objective combination of multiple judgments, each weighted by its skill in gauging uncertainty
In considering how, as organisations and individuals, we make decisions, we first refer to an insightful quote: “There are no hard facts, just endless opinions. Every day, the news media deliver forecasts without reporting, or even asking, how good the forecasters who made the forecasts really are. Every day, corporations and governments pay for forecasts that may be prescient, worthless, or something in between. And every day, all of us – leaders of nations, corporate executives, investors, and voters – make critical decisions on the basis of forecasts whose quality is unknown.” Superforecasting: The Art and Science of Prediction, Tetlock & Gardner, 2015

It would be preferable for all decision-aiding models to be based on objective criteria such as exhaustive data and sound physical principles.

This ideal situation rarely occurs, and (re)insurance decisionmakers frequently act in data-poor environments, relying heavily on expert judgment.
This occurs particularly in low-frequency, high-severity loss areas, such as life, and health and care, as well as in unusual, rare and catastrophic risk appraisal. Given that Solvency II requires assessment of 1-in-200-year events, the regulatory capital regime across the EU is also based on the application of expert judgment.

Decisionmakers can and should demand the most unbiased expert judgment procedures, with objective criteria to appraise expert performance. But how?
Referencing one actual study, we discuss one approach, used in other fields but not yet in (re)insurance. This is structured expert judgment (SEJ), which is an auditable and objective combination of multiple judgments, each weighted by its skill in gauging uncertainty. This produces a better overall judgment within a plausible range of outcomes.

Figure 1

Expert opinion
Consulting 10 experts will yield 10 different answers. Each answer is an (unknowable) function of an expert’s previous experience, grasp of data, judgmental capability, biases or mood on the day.
Without a method of selecting between so many different judgments, the customer (insurance company) often simply sticks with what it knows best: a longstanding provider or market reputation. Neither of these is an indicator of capability: the client cannot know the quality, since no performance-based appraisal of forecasting ability has occurred.

While a single expert's judgment might be an outlier, simple averaging of all judgments yields only limited gains.
Because each expert is weighted equally, without regard for capability, the final answer may actually be less accurate than some individual answers, owing to outliers.

SEJ differs from, and extends, previous opinion pooling methods. Each expert is first rated on prior performance by being asked a set of seed questions, to which the answers are already known to the elicitation facilitator but not necessarily to the expert. Each expert's performance on these seed questions determines their weighting. They are then asked the target questions: the actual judgments being sought, to which the answers are not known.

Weightings drawn from the seed questions are then used to combine the experts’ judgments on the target questions, producing one outcome that truly combines different expert judgments in a way that is performance-based, and is thus potentially better than each individual answer. The design of seed questions is critical: seed questions must be chosen for their tight alignment with the target questions, testing the same ability required for target questions and thus maximising the utility of the performance weighting.
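The weighting step described above can be sketched in a few lines. This is a minimal illustration, not any published implementation: each expert's seed-question score is normalised into a weight, and experts scoring below a cutoff are excluded, as in Cooke's classical model. The function name, scores and cutoff value are invented for illustration.

```python
import numpy as np

def performance_weights(scores, cutoff=0.0):
    """Normalise per-expert seed-question scores into weights.
    Experts scoring below the cutoff receive zero weight."""
    s = np.asarray(scores, dtype=float)
    s = np.where(s >= cutoff, s, 0.0)  # exclude poorly performing experts
    total = s.sum()
    if total == 0:
        raise ValueError("no expert passed the cutoff")
    return s / total

# Three hypothetical experts: one strong, one weak, one middling
weights = performance_weights([0.8, 0.05, 0.15], cutoff=0.1)
```

Here the weak expert is zeroed out and the remaining scores are rescaled to sum to one, so the strongest performer dominates the combination.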

Figure 2
Figure 3

Frequency of political violence
Cooke’s classical model for SEJ involves asking each expert for two metrics: a confidence interval (5th to 95th percentile) between which they think the true value lies; and a central median value.
These are then used to calculate how well the expert gauges uncertainty spreads (information), and how reliably they capture true values within their ranges (statistical accuracy or calibration).

Under the European Cooperation in Science and Technology framework, a network was formed and a first elicitation was performed in January 2016, with 18 seed and eight target questions. This was for an inherently unknowable future metric: the 2016 frequency of strikes, riots and civil commotion (SR&CC) in blocs of countries (Central Asia, Maghreb), with participants drawn from across the (re)insurance profession. An example of their judgments on a single seed question (related to prior SR&CC events in South-East Asia) is shown in Figure 1 (above). Experts produced a variety of median values and ranges, some having tightly bound ranges that captured the true value (dotted line).

Figure 2 (above) shows information and calibration scores across the full seed question set. Two experts (experts one and four) emerge with notably strong performance-based weights. If all experts were weighted equally (last column), this discovered capability would be diluted away (‘equal-weighted’ row, table foot). However, if the experts’ judgments are combined using the weights from the calibration exercise (penultimate column), then a combination emerges that capitalises on these high-performance experts to produce better results than all of them (‘performance-weighted’ row at table foot).
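The combination step itself can be sketched as a weighted linear pool of the experts' distributions. This is an illustrative sketch under assumptions not stated in the article: each expert's 5/50/95% judgments are treated as a piecewise-linear cumulative distribution on an assumed range, the weighted CDFs are summed, and any quantile of the pooled distribution is read off. All names and numbers are invented.

```python
import numpy as np

def pooled_quantile(expert_quantiles, weights, lo, hi, prob, grid=10_000):
    """Linear opinion pool of experts' 5/50/95% judgments.
    expert_quantiles: (n_experts, 3); weights must sum to 1."""
    x = np.linspace(lo, hi, grid)
    # Cumulative probability at each expert's quantile edges
    cum = np.concatenate(([0.0], np.cumsum([0.05, 0.45, 0.45, 0.05])))
    cdf = np.zeros_like(x)
    for q, w in zip(expert_quantiles, weights):
        edges = np.concatenate(([lo], q, [hi]))
        cdf += w * np.interp(x, edges, cum)  # weighted piecewise-linear CDF
    return x[np.searchsorted(cdf, prob)]

# Hypothetical case: a tight, well-calibrated expert and a wide one
experts = np.array([[40.0, 50.0, 60.0],
                    [5.0, 50.0, 95.0]])
q_perf = pooled_quantile(experts, [0.9, 0.1], 0.0, 100.0, 0.05)
q_equal = pooled_quantile(experts, [0.5, 0.5], 0.0, 100.0, 0.05)
```

With performance weights favouring the tight expert, the pooled 5% quantile sits much closer to the median than under equal weighting, mirroring the tighter, more informative combination the article describes.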

When this performance-weighted combination is used for a target question, the result can be seen in Figure 3 (above). For this forward-looking question, there is no known answer, yet we see that the performance-weighted process has allowed the influence of experts one and four to provide a much tighter and more informative judgment than would most individual experts, or the equal-weighted combination (which is inflated by outliers).

For the performance-weighted combination, outliers are ameliorated and identified experts given more weight. Such a final frequency, with associated range, could now feed a pricing or catastrophe model with greater assurance than customary approaches. Structured expert judgment is still judgment. But it is not guesswork. It is a transparent method of pooling multiple opinions, weighted according to performance criteria aligned to the actual judgments being sought. Where data or models are lacking, it forms an objective and auditable method of producing decision-making judgments and inputs to models.

We have described a first SEJ elicitation in our area of interest, where this method has been shown to identify high-performing experts and to outperform uncalibrated pooling methods.

It should be noted that SEJ is not a silver bullet. Where there are science-based models or suitable data, these should trump expert judgment (or be used in tandem). But in their absence, in classes of business such as political violence, and in situations where tail risk is being gauged, SEJ can provide a significant enhancement to decision-making and risk appraisal.

Dr Raveem Ismail is a specialty treaty underwriter at Ariel Re, Bermuda

Scott Reid is head of pricing and reinsurance at AIG Life, UK