How to build & store this large lower triangular matrix for matrix-vector multiplication?

Question

I need to create a lower triangular matrix with a special structure then do a matrix-vector multiplication.

The matrix is parameterized by a value k. It main diagonal is a vector of k ^ 0, i.e. 1; the first sub-diagonal is a vector of k ^ 1, and the i-th sub-diagonal holds k ^ i.

Here is a 5 x 5 example with k = 0.9:

structure(c(1, 0.9, 0.81, 0.729, 0.6561, 0, 1, 0.9, 0.81, 0.729, 
0, 0, 1, 0.9, 0.81, 0, 0, 0, 1, 0.9, 0, 0, 0, 0, 1), .Dim = c(5L, 5L))
#       [,1]  [,2] [,3] [,4] [,5]
#[1,] 1.0000 0.000 0.00  0.0    0
#[2,] 0.9000 1.000 0.00  0.0    0
#[3,] 0.8100 0.900 1.00  0.0    0
#[4,] 0.7290 0.810 0.90  1.0    0
#[5,] 0.6561 0.729 0.81  0.9    1

I need to construct such a matrix as large as 100,000 x 100,000 and use it for computation. I need the most efficient storage method for this. Any ideas?

More precisely speaking, I need to multiply a vector `v` by this matrix from the left side (i.e. `M %*% v`) — jakes, Aug 21 '18 at 15:24

Zheyuan Li · Accepted Answer · 2018-08-22T01:34:39.183

You don't always need to explicitly form a matrix to do a matrix-vector or matrix-matrix multiplication. For example, no one really forms a diagonal matrix and use it for such computations.

There is no substantial difference between your matrix and a diagonal matrix.

So you reduce the operation to a series of vector addition. Here is a trivial R-level implementation.

MatVecMul <- function (y, k) {
  n <- length(y)
  z <- numeric(n)
  for (i in 1:n) z[i:n] <- z[i:n] + k ^ (i - 1) * y[1:(n - i + 1)]
  z
  }

A comparison with direct matrix construction and computation.

d <- structure(c(1, 0.9, 0.81, 0.729, 0.6561, 0, 1, 0.9, 0.81, 0.729, 
0, 0, 1, 0.9, 0.81, 0, 0, 0, 1, 0.9, 0, 0, 0, 0, 1), .Dim = c(5L, 5L))
set.seed(0); y <- runif(5)
c(d %*% y)
#[1] 0.8966972 1.0725361 1.3374064 1.7765191 2.5070750

MatVecMul(y, 0.9)
#[1] 0.8966972 1.0725361 1.3374064 1.7765191 2.5070750

Can replace the R-level "for" loop easily with Rcpp.

library(Rcpp)
cppFunction("NumericVector MatVecMul_cpp (NumericVector y, double k) {
  int n = y.size();
  NumericVector z(n);
  int i; double *p1, *p2, *end = &z[n];
  double tmp = 1.0;
  for (i = 0; i < n; i++) {
    for (p1 = &z[i], p2 = &y[0]; p1 < end; p1++, p2++) *p1 += tmp * (*p2);
    tmp *= k;
    }
  return z;
  }")

MatVecMul_cpp(y, 0.9)
#[1] 0.8966972 1.0725361 1.3374064 1.7765191 2.5070750

Let's have a benchmark.

v <- runif(1e4)
system.time(MatVecMul(y, 0.9))
#   user  system elapsed 
#  3.196   0.000   3.198 
system.time(MatVecMul_cpp(y, 0.9))
#   user  system elapsed 
#  0.840   0.000   0.841

One caution though: be aware of machine precision. As soon as k ^ (i - 1) becomes too small, you may lose all significant digits during addition. See R: approximating `e = exp(1)` using `(1 + 1 / n) ^ n` gives absurd result when `n` is large. In this example with k = 0.9, there is k ^ 400 = 5e-19. So even though the full matrix is 10000 x 10000, it is numerically banded than lower triangular. This means that we can actually terminate the loop earlier. But I will not implement this.

Thanks a lot! that is really helpful. If you have a minute or two - maybe you could take a look to my related question here: https://stackoverflow.com/questions/51967783/recursive-dependence-of-data-for-loop-using-rcpp . I have to make a little tweak to your `Rcpp` function now. — jakes, Aug 22 '18 at 13:20

score 0 · Answer 2 · answered Aug 21 '18 at 15:05

0

Try this:

k <- 0.9
n <- 5

d <- diag(n)
replace(k ^ (row(d) - col(d)), upper.tri(d), 0)

giving:

       [,1]  [,2] [,3] [,4] [,5]
[1,] 1.0000 0.000 0.00  0.0    0
[2,] 0.9000 1.000 0.00  0.0    0
[3,] 0.8100 0.900 1.00  0.0    0
[4,] 0.7290 0.810 0.90  1.0    0
[5,] 0.6561 0.729 0.81  0.9    1

answered Aug 21 '18 at 15:05

G. Grothendieck

254,981
17
203
341

Thanks. But I hope there is a better way as calling `diag(100000)` results in error `Error: cannot allocate vector of size 74.5 Gb` – jakes Aug 21 '18 at 15:07
1

The result is more than half that size even if you use sparse matrices so if you can't store that it is unlikely you can store the matrix you want in memory. – G. Grothendieck Aug 21 '18 at 15:09

How to build & store this large lower triangular matrix for matrix-vector multiplication?

2 Answers2

Linked