Correlation matrices arise in many applications to model the dependence between variables. Dan Georgescu and Nick Higham ask what happens when we have a partially specified matrix and we wish to fill in the missing elements.
In linear algebra terms, a correlation matrix is a symmetric positive semidefinite (PSD) matrix with unit diagonal. In other words, it is a symmetric matrix with ones on the diagonal whose eigenvalues are all non-negative. Eigenvalues may seem to be an unnecessary complication, but they determine whether or not a given symmetric matrix with ones on the diagonal is a correlation matrix. For example,

$$A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix}$$

is not a correlation matrix: it has eigenvalues -0.4142, 1, 2.4142. Since a correlation matrix is a scaled covariance matrix, a negative eigenvalue would imply that some combination of the variables has a negative variance, which cannot happen. A symmetric matrix having non-negative eigenvalues is the matrix equivalent of a real number being non-negative. We also need our correlation matrices to have this property because capital models reasonably expect inputs of positive variances and simulate possible future states of the world by first calculating the square root of the correlation matrix.
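The eigenvalue test is easy to run numerically. The following is a minimal Python sketch using NumPy, not part of the original analysis. The entries of A are an assumption: they are reconstructed from the eigenvalues quoted in the text and from the zero corner entries mentioned later. The helper name `is_correlation_matrix` is hypothetical, and the use of a Cholesky factor as the "square root" is one common choice that we assume here for illustration.

```python
import numpy as np

# Assumed example: unit diagonal, ones next to the diagonal, zeros in the corners.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])

def is_correlation_matrix(M, tol=1e-10):
    """Symmetric, unit diagonal, and no eigenvalue below -tol."""
    M = np.asarray(M, dtype=float)
    symmetric = np.allclose(M, M.T)
    unit_diag = np.allclose(np.diag(M), 1.0)
    return bool(symmetric and unit_diag and np.linalg.eigvalsh(M).min() >= -tol)

eigenvalues = np.linalg.eigvalsh(A)      # returned in ascending order
print(np.round(eigenvalues, 4))          # the smallest eigenvalue is negative
print(is_correlation_matrix(A))

# A Cholesky factor (one form of matrix square root) does not exist for A,
# so a simulation step that relies on it would fail.
try:
    np.linalg.cholesky(A)
    has_cholesky = True
except np.linalg.LinAlgError:
    has_cholesky = False
print(has_cholesky)
```

The tolerance `tol` is there only to absorb rounding error in eigenvalues that are zero in exact arithmetic.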
Why we need correlation matrix completion
In practice, there is often incomplete or missing information for the variables and this may lead to missing values in the correlation matrix itself, hence the problem of how to complete the matrix. We show that some of these practical problems can be solved explicitly via simple formulae, and explain how to use mathematical tools to solve the more general problem where explicit solutions may not exist. (‘Simple’ is, of course, a relative term.)
As an example, consider the 3-by-3 matrix A above and suppose the zeroes in the top-right and bottom-left corners are omitted and these elements have to be chosen. Can we find a value for these entries that produces a valid correlation matrix? The only such value is 1, which can be seen from the requirement that a correlation matrix must have a non-negative determinant (the determinant being the product of the eigenvalues).
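The determinant argument can be written out in one line. Assuming A has ones on the diagonal and next to it (a reconstruction consistent with the eigenvalues quoted earlier), and writing x for the unknown corner entries, expanding along the first row gives

\[
\det\begin{bmatrix} 1 & 1 & x \\ 1 & 1 & 1 \\ x & 1 & 1 \end{bmatrix}
= 1\,(1 - 1) \;-\; 1\,(1 - x) \;+\; x\,(1 - x)
= -(1 - x)^2 \;\le\; 0 .
\]

The determinant is therefore non-negative only when x = 1, in which case every entry of the matrix is 1 and the eigenvalues are 3, 0 and 0.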
Correlation matrices are used in the aggregation of risk exposures required by regulations. The values of the correlations really matter because, for many insurers, aggregate diversification effects will be the most significant determinant of required capital, with a 40%-60% reduction in the capital required between risk types not being atypical for a large, well-diversified insurer or reinsurer.
Some of the correlations are known because they have been estimated from data, prescribed by regulations or assigned by expert judgment, but the other entries are not known. This could be because there is little reliable data, no specific regulations and few available experts with knowledge of both of the risks being correlated. The aim is to complete the missing entries in order to produce a valid correlation matrix to calculate a capital requirement as realistically as is possible given the absence of hard evidence.
In general there are many possible completions (our simple 3-by-3 example is unusual in having only one solution), and the choice of which to use is important because correlations determine the diversification between capital for the various risk variables.
For an example, consider the following simplified problem, which is included to illustrate the issue but is not supposed to represent a realistic calibration. A firm is exposed to three risks, a, b and c, and is organised in two business units, BU1 and BU2. Risk a represents corporate bond spread in BU1 and BU2. Risk b reflects, say, equity risk. Risk c in BU1 is an exposure to an asset class that is not traded and therefore has no market data history from which to calibrate a correlation. All the correlation coefficients are known in BU1 and BU2 (the upper left and bottom right of the matrix) but not between risk c in BU1 and the risks in BU2. This could be because the firm operates in two markets (say the UK and Italy, corresponding to BU1 and BU2) and UK experts were able to advise on the correlations between risk c and risks a and b by making an expert judgment. However, no expert judgment could be made on the correlation between risk c in BU1 and the risks in BU2. Typically, missing correlations arise between risks in different business units because there are few experts who understand both businesses and can make these judgments.
For the partially specified matrix given in Figure 1, a valid correlation matrix completion must lie in the dark yellow region in Figure 2. The centre of this region is the maximum determinant completion, where x is 0.72 and y is 0.64, to two decimal places. In that sense, the maximum determinant completion is unbiased. To give an easy-to-interpret number, if the capital held for the risks in BU1 was £10m, £20m and £70m respectively, and the capital held for the risks in BU2 was £50m and £30m, then using these values for the missing correlations and applying the Solvency II standard formula for capital calculation would lead to a required capital of just under £159m.
Other completion frameworks are possible. For example, we could find the completions that maximise and minimise the capital requirement, in order to understand the materiality of the uncertainty around a central completion. These will depend on the framework used to calculate capital, but example values are plotted in Figure 2 to show the diversity of potential answers. Here, the least and most onerous completions would lead to a required capital of £150m and £167m respectively, a range of roughly 10% of the £159m calculated above. Alternatively, we could choose the completion for which the matrix is most stable, where stability might be defined by reference to the positive semidefiniteness requirement, so that we look for the matrix whose smallest eigenvalue is maximised.
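For a problem with only two unknowns, these alternative completions can be explored by brute force. The sketch below scans a grid of candidate values for x and y, keeps those that give a PSD matrix, and reports the range of capital and the completion with the largest smallest eigenvalue. The correlations in `C0` are hypothetical and are not the values in Figure 1, so the output will not reproduce the £150m and £167m quoted above; only the capital vector `V` is taken from the text. Capital is computed as the square root of V'ΣV, the aggregation formula discussed later in the article.

```python
import numpy as np

# Hypothetical correlations, NOT the values in Figure 1.
# Order: BU1 a, BU1 b, BU1 c, BU2 a, BU2 b. Zeros at (2,3), (2,4) are placeholders.
C0 = np.array([[1.0, 0.5, 0.6, 0.7, 0.2],
               [0.5, 1.0, 0.3, 0.3, 0.5],
               [0.6, 0.3, 1.0, 0.0, 0.0],
               [0.7, 0.3, 0.0, 1.0, 0.4],
               [0.2, 0.5, 0.0, 0.4, 1.0]])
V = np.array([10.0, 20.0, 70.0, 50.0, 30.0])   # capital per risk in £m, from the text

grid = np.linspace(-1.0, 1.0, 201)
X, Y = np.meshgrid(grid, grid)                 # X[i, j] = grid[j], Y[i, j] = grid[i]

# One candidate matrix per (x, y) pair, stacked along the leading axes.
mats = np.broadcast_to(C0, X.shape + C0.shape).copy()
mats[..., 2, 3] = mats[..., 3, 2] = X
mats[..., 2, 4] = mats[..., 4, 2] = Y

min_eig = np.linalg.eigvalsh(mats)[..., 0]     # smallest eigenvalue of each candidate
feasible = min_eig >= 0.0                      # keep PSD candidates only

quad = (mats @ V) @ V                          # V' * Sigma * V for every candidate
cap = np.sqrt(quad[feasible])
print(f"capital range over the feasible grid: {cap.min():.1f} to {cap.max():.1f}")

k = int(np.argmax(min_eig))                    # most 'stable' candidate on the grid
print(f"max-min-eigenvalue completion: x={X.flat[k]:.2f}, y={Y.flat[k]:.2f}")
```

A grid scan is only practical for a handful of unknowns; with more missing entries the optimisation tools described later become necessary.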
In practical applications, large correlation matrices are often considered in small 3×3 subsets with only one unknown correlation. In these cases practitioners use rules of thumb to complete the subsets, such as the ‘product rule’ that a reasonable estimate of the missing correlation is the product of the two known correlations. However, these rules tend to lead to non-PSD matrices which then have to be ‘repaired’ by computing the nearest correlation matrix. The added problem in this case is that it is not clear which 3×3 subset is most relevant for the missing correlation x, say:
giving a choice of two product rule solutions: x = 0.6 × 0.5 = 0.3 (which is outside the feasible bound in the picture above) or x = 0.85 × 0.85 = 0.72. The second value looks similar to the maximum determinant (MaxDet) completion, but this is a coincidence, partly due to the rounding. An alternative approach has been to average these two values, but we can see that x = (0.3 + 0.72)/2 = 0.51 understates the central value for the feasible region in dark yellow in Figure 2.
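It can help to see what each 3×3 subset implies on its own. For a 3×3 correlation matrix with known correlations r1 and r2 and one unknown x, the non-negative determinant condition reduces to (x − r1·r2)² ≤ (1 − r1²)(1 − r2²), so the product rule value is the midpoint of the range allowed by that subset. The sketch below applies this to the two pairs of known correlations quoted above; the function names are hypothetical.

```python
import math

def product_rule(r1, r2):
    """Rule-of-thumb estimate of the missing correlation."""
    return r1 * r2

def feasible_interval_3x3(r1, r2):
    """Range of x for which [[1, r1, x], [r1, 1, r2], [x, r2, 1]] is PSD."""
    half_width = math.sqrt((1.0 - r1**2) * (1.0 - r2**2))
    return r1 * r2 - half_width, r1 * r2 + half_width

for r1, r2 in [(0.6, 0.5), (0.85, 0.85)]:
    low, high = feasible_interval_3x3(r1, r2)
    print(f"r1={r1}, r2={r2}: product rule {product_rule(r1, r2):.4f}, "
          f"3x3 range [{low:.4f}, {high:.4f}]")
```

Each interval is only a necessary condition for the full matrix to be PSD. Even so, the second subset already requires x to be at least about 0.445, which is consistent with the product rule value of 0.3 from the first subset being infeasible.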
Explicit solutions in certain cases
Our peer-reviewed research shows how to complete the correlation matrix in certain cases such as the one above, for the MaxDet framework. The framework has several useful theoretical properties.
- Existence and uniqueness: if PSD completions exist then there is exactly one MaxDet completion.
- Maximum entropy model: MaxDet is the maximum entropy completion for the multivariate normal model, where maximum entropy is a principle of favouring the simplest explanations. In the absence of other explanations, we should choose this principle for the null hypothesis in Bayesian analysis.
- Maximum likelihood estimation: MaxDet is the maximum likelihood estimate of the correlation matrix of the unknown underlying multivariate normal model.
- Analytic centre: MaxDet is the analytic centre of the feasible region described by the positive semidefiniteness constraints, where this is defined as the point that maximises the product of distances to the defining hyperplanes.
In the simple case above, we can express the pattern of the correlation matrix in blocks,
where block E is unspecified. The MaxDet completion is shown to be,
which is an explicit solution expressed as a matrix operation and simple to translate into an Excel calculation. Our work then extends this idea to other similar cases where the unknown entries can be grouped into particular block patterns. For other patterns, such simple solutions may not exist and the problem becomes a difficult nonlinear optimisation problem.
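To make the matrix operation concrete, here is a minimal NumPy sketch. It uses the standard result for this pattern: if the variables split into three groups and only the block linking the first and third groups is missing, the MaxDet value of that block is C12 · C22⁻¹ · C23, where C22 is the block for the middle group. The block names, the function `maxdet_fill` and the numerical correlations are our own illustrative assumptions and may differ from the notation and the Figure 1 values used in the article.

```python
import numpy as np

def maxdet_fill(C, rows, sep, cols):
    """Fill the unknown block C[rows, cols] with C12 * inv(C22) * C23.

    rows: indices of the variables with missing correlations
    sep:  indices of the variables correlated with both other groups
    cols: indices of the variables on the other side of the gap
    """
    C = np.array(C, dtype=float)
    C12 = C[np.ix_(rows, sep)]
    C22 = C[np.ix_(sep, sep)]
    C23 = C[np.ix_(sep, cols)]
    E = C12 @ np.linalg.solve(C22, C23)      # avoids forming the inverse explicitly
    C[np.ix_(rows, cols)] = E
    C[np.ix_(cols, rows)] = E.T
    return C

# Hypothetical values. Order: BU1 a, BU1 b, BU1 c, BU2 a, BU2 b; nan = unknown.
nan = np.nan
C = [[1.0, 0.5, 0.6, 0.7, 0.2],
     [0.5, 1.0, 0.3, 0.3, 0.5],
     [0.6, 0.3, 1.0, nan, nan],
     [0.7, 0.3, nan, 1.0, 0.4],
     [0.2, 0.5, nan, 0.4, 1.0]]

completed = maxdet_fill(C, rows=[2], sep=[0, 1], cols=[3, 4])
print(np.round(completed[2, 3:], 4))                   # the two filled correlations
print(np.round(np.linalg.inv(completed)[2, 3:], 10))   # zeros in the filled positions
```

The second printout illustrates a known property of MaxDet completions: the inverse of the completed matrix is zero in the positions that were unspecified, which corresponds to zero partial correlation between those pairs of variables.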
The MaxDet completion has attractive mathematical properties, but it coincides with the conditionally independent solution, as discussed in the research paper. In other words, MaxDet does not add any more dependence to the correlation matrix, which may be unrealistic for some risk pairs.
More general cases
To solve the problem in more general cases than the simple pattern of specified and unspecified entries above, where explicit solutions might not exist, one approach is to use an interior point algorithm that has been adapted to search through the feasible region determined by the space of PSD matrices. Many such algorithms exist (see for example the semidefinite programming solvers at bit.ly/311DmgS), and we used SDPT3. One of the drawbacks for those without a background in mathematical optimisation is that specifying the inputs to such solvers is laborious and error-prone. Fortunately, tools exist that parse the problem as specified above and produce inputs that are suitable for most free and commercial solvers. The parser we used is called YALMIP; it is freely available and has a MATLAB implementation, and we are aware of alternatives such as CVX. After installing a solver and YALMIP, the problem can be specified to YALMIP as seen in Figure 3.
In this example, we have specified two objective functions. The first objective function reproduces our MaxDet solution using SDPT3. As the code takes up to a second to run depending on the speed of the particular computer used and the precision is only six decimal places by default, there is little advantage to using approximate algorithms where explicit solutions exist.
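For readers without MATLAB, the MaxDet objective can also be illustrated with a few lines of NumPy. The sketch below is not the YALMIP and SDPT3 set-up of Figure 3: it swaps in a plain damped Newton iteration on the log-determinant, which is adequate for a small example but is no substitute for a proper semidefinite programming solver. The function names and the correlations are hypothetical, and the sketch assumes that setting the unknown entries to zero already gives a positive definite matrix.

```python
import numpy as np

def is_pd(M):
    """True if M is numerically positive definite (Cholesky succeeds)."""
    try:
        np.linalg.cholesky(M)
        return True
    except np.linalg.LinAlgError:
        return False

def with_entries(C, unknown, z):
    """Copy of C with the symmetric unknown entries set to the values z."""
    out = C.copy()
    for (i, j), value in zip(unknown, z):
        out[i, j] = out[j, i] = value
    return out

def maxdet_newton(C, unknown, tol=1e-12, max_iter=50):
    """Maximise log det over the unknown entries, starting from zeros."""
    C = np.array(C, dtype=float)
    z = np.zeros(len(unknown))
    if not is_pd(with_entries(C, unknown, z)):
        raise ValueError("zero start is not positive definite")
    for _ in range(max_iter):
        current = with_entries(C, unknown, z)
        W = np.linalg.inv(current)
        # Gradient and Hessian of log det with respect to the unknown entries.
        grad = np.array([2.0 * W[i, j] for i, j in unknown])
        hess = np.array([[-2.0 * (W[i, p] * W[j, q] + W[i, q] * W[j, p])
                          for p, q in unknown] for i, j in unknown])
        step = np.linalg.solve(hess, -grad)
        t = 1.0                  # halve the step until it stays PD and does not worsen
        while True:
            trial = with_entries(C, unknown, z + t * step)
            if is_pd(trial) and (np.linalg.slogdet(trial)[1]
                                 >= np.linalg.slogdet(current)[1]):
                break
            t *= 0.5
        z = z + t * step
        if grad @ step < tol:    # squared Newton decrement: nothing left to gain
            break
    return with_entries(C, unknown, z)

# Hypothetical values. Order: BU1 a, BU1 b, BU1 c, BU2 a, BU2 b; nan = unknown.
nan = np.nan
C = [[1.0, 0.5, 0.6, 0.7, 0.2],
     [0.5, 1.0, 0.3, 0.3, 0.5],
     [0.6, 0.3, 1.0, nan, nan],
     [0.7, 0.3, nan, 1.0, 0.4],
     [0.2, 0.5, nan, 0.4, 1.0]]

solution = maxdet_newton(C, unknown=[(2, 3), (2, 4)])
print(np.round(solution[2, 3:], 6))
```

For this pattern the iteration should agree with the explicit formula to within rounding, which supports the point that an iterative method adds little when a closed-form completion exists.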
The second objective function solves the problem

$$\begin{aligned} &\text{minimise} && -V^{T} \Sigma V \\ &\text{subject to} && \Sigma \succeq 0 \end{aligned}$$

as shown in Figure 4. This second objective function is relevant to our insurance example because, under certain simplifying assumptions, capital is calculated using the formula

$$\text{capital} = \sqrt{V^{T} \Sigma V},$$

where V is a vector of capital held for various risks and Σ is a correlation matrix. We have used the standard convention that the objective is specified as a minimisation (minimising a negative is the same as maximising), and left out the square root in the objective function (as it is not required for the optimal solution). The second line means that Σ has to be PSD.
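As a small illustration of the aggregation formula, the sketch below evaluates the square root of V'ΣV for the capital figures from the earlier example under two extreme correlation assumptions. The function name `aggregate_capital` is hypothetical, and neither correlation matrix is meant to be realistic.

```python
import numpy as np

def aggregate_capital(V, corr):
    """Square root of V' * corr * V for capital vector V and correlation matrix corr."""
    V = np.asarray(V, dtype=float)
    corr = np.asarray(corr, dtype=float)
    return float(np.sqrt(V @ corr @ V))

V = [10.0, 20.0, 70.0, 50.0, 30.0]             # capital per risk in £m, from the text

fully_correlated = aggregate_capital(V, np.ones((5, 5)))   # no diversification benefit
independent = aggregate_capital(V, np.eye(5))              # all correlations zero
print(f"fully correlated: {fully_correlated:.1f}")
print(f"independent:      {independent:.1f}")
```

With every correlation equal to 1 the formula returns the simple sum of the individual capital amounts, and lower correlations reduce the total. Because V'ΣV is linear in the entries of Σ, maximising or minimising it subject to Σ being PSD is a semidefinite programme, which is why a solver such as SDPT3 applies.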
This article is based on an earlier version, which appeared in Bank Underground at bit.ly/3jT5J8o
Dan Georgescu is deputy chief actuary at Just
Nick Higham is a Royal Society Research Professor of Applied Mathematics at the University of Manchester