Fastest way to multiply matrix columns with vector elements in R

Question

I have a matrix m and a vector v. I would like to multiply first column of matrix m by the first element of vector v, and multiply the second column of matrix m by the second element of vector v, and so on. I can do it with the following code, but I am looking for a way which does not require the two transpose calls. How can I do this faster in R?

m <- matrix(rnorm(120000), ncol=6)
v <- c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5)

system.time(t(t(m) * v))

#   user  system elapsed 
#   0.02    0.00    0.02

Related: http://stackoverflow.com/q/3643555/946850 – krlmlr Dec 19 '16 at 09:02 — krlmlr, Dec 19 '16 at 09:02

mnel · Accepted Answer · 2013-06-13T06:28:00.063

47

Use some linear algebra and perform matrix multiplication, which is quite fast in R.

eg

m %*% diag(v)

some benchmarking

m = matrix(rnorm(1200000), ncol=6)

 v=c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5)
library(microbenchmark)
microbenchmark(m %*% diag(v), t(t(m) * v))
##    Unit: milliseconds
##            expr      min       lq   median       uq      max neval
##   m %*% diag(v) 16.57174 16.78104 16.86427 23.13121 109.9006   100
##     t(t(m) * v) 26.21470 26.59049 32.40829 35.38097 122.9351   100

edited Jun 13 '13 at 06:28

answered Jun 13 '13 at 06:20

mnel

113,303
27
265
254

4

I've found that the result heavily depends on the length of `v`. For shorter `v`, the `diag()` option is faster, but eventually the double transpose wins. – krlmlr Dec 19 '16 at 09:08
@krlmlr - How are you defining "short" (<= 10 columns?); would be helpful to know, for progeny's sake. – NickBraunagel Dec 09 '19 at 16:44

score 27 · Answer 2 · edited Sep 11 '17 at 11:51

If you have a larger number of columns your t(t(m) * v) solution outperforms the matrix multiplication solution by a wide margin. However, there is a faster solution, but it comes with a high cost in in memory usage. You create a matrix as large as m using rep() and multiply elementwise. Here's the comparison, modifying mnel's example:

m = matrix(rnorm(1200000), ncol=600)
v = rep(c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5), length = ncol(m))
library(microbenchmark)

microbenchmark(t(t(m) * v), 
  m %*% diag(v),
  m * rep(v, rep.int(nrow(m),length(v))), 
  m * rep(v, rep(nrow(m),length(v))), 
  m * rep(v, each = nrow(m)))

# Unit: milliseconds
#                                   expr        min         lq       mean     median         uq       max neval
#                            t(t(m) * v)  17.682257  18.807218  20.574513  19.239350  19.818331  62.63947   100
#                          m %*% diag(v) 415.573110 417.835574 421.226179 419.061019 420.601778 465.43276   100
#  m * rep(v, rep.int(nrow(m), ncol(m)))   2.597411   2.794915   5.947318   3.276216   3.873842  48.95579   100
#      m * rep(v, rep(nrow(m), ncol(m)))   2.601701   2.785839   3.707153   2.918994   3.855361  47.48697   100
#             m * rep(v, each = nrow(m))  21.766636  21.901935  23.791504  22.351227  23.049006  66.68491   100

As you can see, using "each" in rep() sacrifices speed for clarity. The difference between rep.int and rep seems to be neglible, both implementations swap places on repeated runs of microbenchmark(). Keep in mind, that ncol(m) == length(v).

Note that the double transpose also duplicates the matrix at least once, not sure if the memory usage is much better than just expanding the matrix, The expansion itself can be done slightly more readable using `matrix(v, nrow = nrow(m), ncol = ncol(m), byrow = TRUE)` . — krlmlr, Dec 19 '16 at 09:14
About the `rep` solution you write "... comes with a high cost in in memory usage". Doesn't `t(m)` incur the same cost, because this creates a new matrix with the same number of elements as `m`? — jochen, May 21 '17 at 10:50

thelatemail · Answer 3 · 2013-06-13T06:18:19.213

4

As @Arun points out, I don't know that you'll beat your solution in terms of time efficiency. In terms of code understandability, there are other options though:

One option:

> mapply("*",as.data.frame(m),v)
      V1  V2  V3
[1,] 0.0 0.0 0.0
[2,] 1.5 0.0 0.0
[3,] 1.5 3.5 0.0
[4,] 1.5 3.5 4.5

And another:

sapply(1:ncol(m),function(x) m[,x] * v[x] )

edited Jun 13 '13 at 06:18

answered Jun 13 '13 at 06:02

thelatemail

91,185
12
128
188

I doubt this'll be faster than working on matrices (your first solution, especially). – Arun Jun 13 '13 at 06:11
1

when I check the system.time for large sample there is no difference between them and it is not faster. – rose Jun 13 '13 at 06:11
@rose - Despite providing alternatives, I agree with Arun here. I'm not sure what is wrong with the `t(t(..` solution – thelatemail Jun 13 '13 at 06:12
Is their an implementation for tensor (outer) product? I think that's what he's going for. – Tommy Levi Jun 13 '13 at 06:12

score 4 · Answer 4 · answered Mar 05 '18 at 13:06

For the sake of completeness, I added sweep to the benchmark. Despite its somewhat misleading attribute names, I think it may be more readable than other alternatives, and also quite fast:

n = 1000
M = matrix(rnorm(2 * n * n), nrow = n)
v = rnorm(2 * n)

microbenchmark::microbenchmark(
  M * rep(v, rep.int(nrow(M), length(v))),
  sweep(M, MARGIN = 2, STATS = v, FUN = `*`),
  t(t(M) * v),
  M * rep(v, each = nrow(M)),
  M %*% diag(v)
)

Unit: milliseconds
                                       expr         min          lq        mean
    M * rep(v, rep.int(nrow(M), length(v)))    5.259957    5.535376    9.994405
 sweep(M, MARGIN = 2, STATS = v, FUN = `*`)   16.083039   17.260790   22.724433
                                t(t(M) * v)   19.547392   20.748929   29.868819
                 M * rep(v, each = nrow(M))   34.803229   37.088510   41.518962
                              M %*% diag(v) 1827.301864 1876.806506 2004.140725
      median          uq        max neval
    6.158703    7.606777   66.21271   100
   20.479928   23.830074   85.24550   100
   24.722213   29.222172   92.25538   100
   39.920664   42.659752  106.70252   100
 1986.152972 2096.172601 2432.88704   100

score 2 · Answer 5 · answered Mar 05 '17 at 14:17

As done by bluegrue, a simple rep would suffice as well to perform element-wise multiplication.

The number of multiplications and summations is reduced by a wide-margin as if simple matrix multiplication with diag() is performed, where for this case a lot of zero-multiplications can be avoided.

m = matrix(rnorm(1200000), ncol=6)
v=c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5)
v2 <- rep(v,each=dim(m)[1])
library(microbenchmark)
microbenchmark(m %*% diag(v), t(t(m) * v), m*v2)

Unit: milliseconds
          expr       min        lq      mean    median        uq      max neval cld
 m %*% diag(v) 11.269890 13.073995 16.424366 16.470435 17.700803 95.78635   100   b
   t(t(m) * v)  9.794000 11.226271 14.018568 12.995839 15.010730 88.90111   100   b
        m * v2  2.322188  2.559024  3.777874  3.011185  3.410848 67.26368   100  a

score 1 · Answer 6 · answered Aug 29 '19 at 21:02

Sparse Matrices

If you're working with sparse matrices, it's likely element-wise methods (i.e., expanding v to be a dense matrix) could exceed memory constraints, so I'm going to exclude those proposed methods from this analysis.

The Matrix package provides a Diagonal method that tremendously improves the diagonal multiplication approach.

Benchmarks

library(Matrix)
library(microbenchmark)

N_ROW = 5000
N_COL = 10000

M <- rsparsematrix(N_ROW, N_COL, density=0.1)
v <- rnorm(N_COL)

microbenchmark(
  M %*% Diagonal(length(v), v),
  t(t(M) * v), 
  sweep(M, MARGIN=2, STATS=v, FUN='*'))

# Unit: milliseconds
#                                        expr        min         lq       mean     median         uq       max neval
#                M %*% Diagonal(length(v), v)   36.46755   39.03379   47.75535   40.41116   43.30241  248.1036   100
#                                 t(t(M) * v)  207.70560  223.35126  269.08112  250.25379  284.49382  511.0403   100
#  sweep(M, MARGIN = 2, STATS = v, FUN = "*") 1682.45263 1869.87220 1941.54691 1924.80218 1999.62484 3104.8305   100

Fastest way to multiply matrix columns with vector elements in R

6 Answers6

Sparse Matrices

Benchmarks

Linked

Related