Generate multivariate normal r.v.'s with rank-deficient covariance via Pivoted Cholesky Factorization

Question

I'm just beating my head against the wall trying to get a Cholesky decomposition to work in order to simulate correlated price movements.

I use the following code:

cormat <- as.matrix(read.csv("http://pastebin.com/raw/qGbkfiyA"))
cormat <- cormat[,2:ncol(cormat)]
rownames(cormat) <- colnames(cormat)
cormat <- apply(cormat,c(1,2),FUN = function(x) as.numeric(x))

chol(cormat)
#Error in chol.default(cormat) : 
#    the leading minor of order 8 is not positive definite

cholmat <- chol(cormat, pivot=TRUE)
#Warning message:
#    In chol.default(cormat, pivot = TRUE) :
#    the matrix is either rank-deficient or indefinite

rands <- array(rnorm(ncol(cholmat)), dim = c(10000,ncol(cholmat)))
V <- t(t(cholmat) %*% t(rands))

#Check for similarity
cor(V) - cormat  ## Not all zeros!

#Check the standard deviations
apply(V,2,sd) ## Not all ones!

I'm not really sure how to properly use the pivot = TRUE statement to generate my correlated movements. The results look totally bogus.

Even if I have a simple matrix and I try out "pivot" then I get bogus results...

cormat <- matrix(c(1,.95,.90,.95,1,.93,.90,.93,1), ncol=3)

cholmat <- chol(cormat)
# No Error

cholmat2 <- chol(cormat, pivot=TRUE)
# No warning... pivot changes column order

rands <- array(rnorm(ncol(cholmat)), dim = c(10000,ncol(cholmat)))
V <- t(t(cholmat2) %*% t(rands))

#Check for similarity
cor(V) - cormat  ## Not all zeros!

#Check the standard deviations
apply(V,2,sd) ## Not all ones!

My first guess is that `cormat` is indeed p.d. but has some eigenvalues which are nearly zero and that causes numerical trouble. You can prove or disprove that by calculating the eigenvalues. By the way, what's the origin of `cormat`? If you have some control over it, can you ensure that `cormat` is "more" positive definite? (e.g. add a constant factor to the diagonal, or construct it in a way which guarantees p.d.-ness.) — Robert Dodier, Feb 11 '16 at 20:36
Some of the eigenvalues are indeed negative. I am suspect this is because the prices from which the correlations were taken were done so at different periods. For examples S1 <-> S2 is taken from a different period as S2 <-> S3. I am able to modify cormat as I see fit. I tried to zero out the negative eigenvalues, but was unsuccessful using this as a guide. http://www.risklatte.com/Articles/QuantitativeFinance/QF146.php — JoeBass, Feb 11 '16 at 21:37
My advice at this point is to construct a smaller matrix and see that your method works. I gather that the elements in the matrix are temporal correlations, with S(i, j) = (correlation j - i time steps apart). If so maybe all elements on a given sub- or super-diagonal should be equal, right? and one would expect the values to get smaller as they are farther away from the diagonal, right? If you make the off-diagonals small enough, the matrix is guaranteed to be p.d. (so-called "diagonal dominance"). Anyway if you get it working with a small, constructed matrix you can go back to the original. — Robert Dodier, Feb 11 '16 at 21:53

Zheyuan Li · Accepted Answer · 2016-09-03T08:52:51.187

There are two errors with your code:

You did not use pivoting index to revert the pivoting done to the Cholesky factor. Note, pivoted Cholesky factorization for a semi-positive definite matrix A is doing:
```
P'AP = R'R
```
where P is a column pivoting matrix, and R is an upper triangular matrix. To recover A from R, we need apply the inverse of P (i.e., P'):
```
A = PR'RP' = (RP')'(RP')
```
Multivariate normal with covariance matrix A, is generated by:
```
XRP'
```
where X is multivariate normal with zero mean and identity covariance.
Your generation of X
```
X <- array(rnorm(ncol(R)), dim = c(10000,ncol(R)))
```
is wrong. First, it should not be ncol(R) but nrow(R), i.e., the rank of X, denoted by r. Second, you are recycling rnorm(ncol(R)) along columns, and the resulting matrix is not random at all. Therefore, cor(X) is never close to an identity matrix. The correct code is:
```
X <- matrix(rnorm(10000 * r), 10000, r)
```

As a model implementation of the above theory, consider your toy example:

A <- matrix(c(1,.95,.90,.95,1,.93,.90,.93,1), ncol=3)

We compute the upper triangular factor (suppressing possible rank-deficient warnings) and extract inverse pivoting index and rank:

R <- suppressWarnings(chol(A, pivot = TRUE))
piv <- order(attr(R, "pivot"))  ## reverse pivoting index
r <- attr(R, "rank")  ## numerical rank

Then we generate X. For better result we centre X so that column means are 0.

X <- matrix(rnorm(10000 * r), 10000, r)
## for best effect, we centre `X`
X <- sweep(X, 2L, colMeans(X), "-")

Then we generate target multivariate normal:

## compute `V = RP'`
V <- R[1:r, piv]

## compute `Y = X %*% V`
Y <- X %*% V

We can verify that Y has target covariance A:

cor(Y)
#          [,1]      [,2]      [,3]
#[1,] 1.0000000 0.9509181 0.9009645
#[2,] 0.9509181 1.0000000 0.9299037
#[3,] 0.9009645 0.9299037 1.0000000

A
#     [,1] [,2] [,3]
#[1,] 1.00 0.95 0.90
#[2,] 0.95 1.00 0.93
#[3,] 0.90 0.93 1.00

Obvious... to who? – Brandon Bertelsen Sep 03 '16 at 08:46 — Brandon Bertelsen, Sep 03 '16 at 08:46
It was more of a giggle on my part. :) – Brandon Bertelsen Sep 03 '16 at 08:48 — Brandon Bertelsen, Sep 03 '16 at 08:48

Generate multivariate normal r.v.'s with rank-deficient covariance via Pivoted Cholesky Factorization

1 Answers1

Linked