0

I need to create a lower triangular matrix with a special structure then do a matrix-vector multiplication.

The matrix is parameterized by a value k. It main diagonal is a vector of k ^ 0, i.e. 1; the first sub-diagonal is a vector of k ^ 1, and the i-th sub-diagonal holds k ^ i.

Here is a 5 x 5 example with k = 0.9:

structure(c(1, 0.9, 0.81, 0.729, 0.6561, 0, 1, 0.9, 0.81, 0.729, 
0, 0, 1, 0.9, 0.81, 0, 0, 0, 1, 0.9, 0, 0, 0, 0, 1), .Dim = c(5L, 5L))
#       [,1]  [,2] [,3] [,4] [,5]
#[1,] 1.0000 0.000 0.00  0.0    0
#[2,] 0.9000 1.000 0.00  0.0    0
#[3,] 0.8100 0.900 1.00  0.0    0
#[4,] 0.7290 0.810 0.90  1.0    0
#[5,] 0.6561 0.729 0.81  0.9    1

I need to construct such a matrix as large as 100,000 x 100,000 and use it for computation. I need the most efficient storage method for this. Any ideas?

Zheyuan Li
  • 71,365
  • 17
  • 180
  • 248
jakes
  • 1,964
  • 3
  • 18
  • 50
  • 1
    More precisely speaking, I need to multiply a vector `v` by this matrix from the left side (i.e. `M %*% v`) – jakes Aug 21 '18 at 15:24

2 Answers2

3

You don't always need to explicitly form a matrix to do a matrix-vector or matrix-matrix multiplication. For example, no one really forms a diagonal matrix and use it for such computations.

There is no substantial difference between your matrix and a diagonal matrix.

So you reduce the operation to a series of vector addition. Here is a trivial R-level implementation.

MatVecMul <- function (y, k) {
  n <- length(y)
  z <- numeric(n)
  for (i in 1:n) z[i:n] <- z[i:n] + k ^ (i - 1) * y[1:(n - i + 1)]
  z
  }

A comparison with direct matrix construction and computation.

d <- structure(c(1, 0.9, 0.81, 0.729, 0.6561, 0, 1, 0.9, 0.81, 0.729, 
0, 0, 1, 0.9, 0.81, 0, 0, 0, 1, 0.9, 0, 0, 0, 0, 1), .Dim = c(5L, 5L))
set.seed(0); y <- runif(5)
c(d %*% y)
#[1] 0.8966972 1.0725361 1.3374064 1.7765191 2.5070750

MatVecMul(y, 0.9)
#[1] 0.8966972 1.0725361 1.3374064 1.7765191 2.5070750

Can replace the R-level "for" loop easily with Rcpp.

library(Rcpp)
cppFunction("NumericVector MatVecMul_cpp (NumericVector y, double k) {
  int n = y.size();
  NumericVector z(n);
  int i; double *p1, *p2, *end = &z[n];
  double tmp = 1.0;
  for (i = 0; i < n; i++) {
    for (p1 = &z[i], p2 = &y[0]; p1 < end; p1++, p2++) *p1 += tmp * (*p2);
    tmp *= k;
    }
  return z;
  }")

MatVecMul_cpp(y, 0.9)
#[1] 0.8966972 1.0725361 1.3374064 1.7765191 2.5070750

Let's have a benchmark.

v <- runif(1e4)
system.time(MatVecMul(y, 0.9))
#   user  system elapsed 
#  3.196   0.000   3.198 
system.time(MatVecMul_cpp(y, 0.9))
#   user  system elapsed 
#  0.840   0.000   0.841 

One caution though: be aware of machine precision. As soon as k ^ (i - 1) becomes too small, you may lose all significant digits during addition. See R: approximating `e = exp(1)` using `(1 + 1 / n) ^ n` gives absurd result when `n` is large. In this example with k = 0.9, there is k ^ 400 = 5e-19. So even though the full matrix is 10000 x 10000, it is numerically banded than lower triangular. This means that we can actually terminate the loop earlier. But I will not implement this.

Zheyuan Li
  • 71,365
  • 17
  • 180
  • 248
  • Thanks a lot! that is really helpful. If you have a minute or two - maybe you could take a look to my related question here: https://stackoverflow.com/questions/51967783/recursive-dependence-of-data-for-loop-using-rcpp . I have to make a little tweak to your `Rcpp` function now. – jakes Aug 22 '18 at 13:20
0

Try this:

k <- 0.9
n <- 5

d <- diag(n)
replace(k ^ (row(d) - col(d)), upper.tri(d), 0)

giving:

       [,1]  [,2] [,3] [,4] [,5]
[1,] 1.0000 0.000 0.00  0.0    0
[2,] 0.9000 1.000 0.00  0.0    0
[3,] 0.8100 0.900 1.00  0.0    0
[4,] 0.7290 0.810 0.90  1.0    0
[5,] 0.6561 0.729 0.81  0.9    1
G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • Thanks. But I hope there is a better way as calling `diag(100000)` results in error `Error: cannot allocate vector of size 74.5 Gb` – jakes Aug 21 '18 at 15:07
  • 1
    The result is more than half that size even if you use sparse matrices so if you can't store that it is unlikely you can store the matrix you want in memory. – G. Grothendieck Aug 21 '18 at 15:09