3

Here's what I want to do: I have two matrices A and B of dimensions N x k1 and N x k2. I now want to pointwise multiply each column of the matrix A with B.

Implementation one does this in a for loop.

For speed optimization purposes, I considered to vectorize the entire operation - but it turns out vectorization (as I have implemented it, via kronecker products) did not improve my runtime for larger problems.

Does anyone have a suggestion how to differently implement this operation, having runtime in mind?

The code below starts with a small example, then implements a loop-based and vectorized solution, then benchmarks on a larger problem.

# toy example: 

N <- 5
k1 <- 2
k2 <- 3


A <- matrix(rnorm(N*k1), N, k1)
B <- matrix(rnorm(N*k2), N, k2)

colmat_prod <- function(x, y){
  
  k2 <- ncol(y)
  k1 <- ncol(x)
  res <- array(NA, c(N, k2 , k1))
  for(i in 1:k1){
    res[, ,i] <- x[,i] * y
  }
  res
}

colmat_prod_vec <- function(x, y){
  k1 <- ncol(x)
  res_vec <- c(x) * (rep(1, k1) %x% y)
  res_vec
}

colmat_prod(A, B)
colmat_prod_vec(A, B)

# > colmat_prod(A, B)
# , , 1
# 
# [,1]        [,2]        [,3]
# [1,]  1.95468879  0.55206339  0.24713400
# [2,] -0.02678564 -0.03762645 -0.03144102
# [3,]  0.30964437  0.26912771 -0.49451656
# [4,] -1.40719543  0.77245522 -0.47236888
# [5,] -1.71485558  0.98348809  0.16569915
# 
# , , 2
# 
# [,1]        [,2]        [,3]
# [1,]  1.60358991  0.45290242  0.20274409
# [2,] -0.21009808 -0.29513001 -0.24661348
# [3,]  0.04069121  0.03536681 -0.06498577
# [4,] -2.89562745  1.58950383 -0.97200734
# [5,] -1.59504293  0.91477425  0.15412217
# 
# > colmat_prod_vec(A, B)
# [,1]        [,2]        [,3]
# [1,]  1.95468879  0.55206339  0.24713400
# [2,] -0.02678564 -0.03762645 -0.03144102
# [3,]  0.30964437  0.26912771 -0.49451656
# [4,] -1.40719543  0.77245522 -0.47236888
# [5,] -1.71485558  0.98348809  0.16569915
# [6,]  1.60358991  0.45290242  0.20274409
# [7,] -0.21009808 -0.29513001 -0.24661348
# [8,]  0.04069121  0.03536681 -0.06498577
# [9,] -2.89562745  1.58950383 -0.97200734
# [10,] -1.59504293  0.91477425  0.15412217


# speed: 
N <- 10000
k1 <- 1000
k2 <- 9
A1 <- matrix(rnorm(N*k1), N, k1)
B1 <- matrix(rnorm(N*k2), N, k2)

library(microbenchmark)
microbenchmark(colmat_prod(A1, B1), 
               colmat_prod_vec(A1, B1), 
               times = 10)
#Unit: seconds
#expr      min       lq      mean   median uq      max neval
#colmat_prod(A1, B1) 1.981737 2.179122  2.769812  2.32343 2.680407  4.96276    10
#colmat_prod_vec(A1, B1) 9.779629 9.955576 10.291264 10.21356 10.380702 11.70494    10
Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
A.Fischer
  • 596
  • 5
  • 11

1 Answers1

3

You can try apply(A, 2, `*`, B) and to come the the same like colmat_prod use array(apply(A, 2, `*`, B), c(dim(B), ncol(A))):

identical(array(apply(A, 2, `*`, B), c(dim(B), ncol(A))), colmat_prod(A, B))
#[1] TRUE

Another option is to use rep for the columns of A:

array(A[,rep(seq_len(ncol(A)), each=ncol(B))] * as.vector(B), c(dim(B), ncol(A)))

Timings:

library(microbenchmark)
microbenchmark(colmat_prod(A1, B1), 
               colmat_prod_vec(A1, B1),
               array(apply(A1, 2, `*`, B1), c(dim(B1), ncol(A1))),
               array(A1[,rep(seq_len(ncol(A1)), each=ncol(B1))] * as.vector(B1), c(dim(B1), ncol(A1))),
               times = 10)
#Unit: milliseconds
#                                                                                            expr      min        lq      mean    median        uq       max neval  cld
#                                                                             colmat_prod(A1, B1) 831.5437  857.0305  910.5694  878.6842  999.5354 1025.0915    10   c 
#                                                                         colmat_prod_vec(A1, B1) 981.9241 1010.9482 1174.1700 1162.7004 1319.3478 1444.6158    10    d
#                                              array(apply(A1, 2, "*", B1), c(dim(B1), ncol(A1))) 716.1469  725.7862  765.4987  732.2520  789.3843  907.4417    10  b  
# array(A1[, rep(seq_len(ncol(A1)), each = ncol(B1))] * as.vector(B1),      c(dim(B1), ncol(A1))) 404.8460  406.2848  430.4043  428.2685  458.9400  462.0634    10 a   
GKi
  • 37,245
  • 2
  • 26
  • 48
  • Using double quotes for names is discouraged, use backtick quotes instead: \`*\`. – Konrad Rudolph Apr 26 '21 at 13:14
  • @KonradRudolph If you have a German keyboard, like I have, you have to press shift and than twice on the key of backtick quotes to get it. For double quotes you need only press it once. – GKi Apr 26 '21 at 13:29
  • I guess that’s why people dislike German keyboards for programming …. But I hope you’re not using the same reasoning to avoid using `{…}`. ;-) I’m still keeping my fingers crossed that R will at some point officially deprecated string quoting for names (never gonna happen though). It causes unnecessary confusion (not just) for beginners. – Konrad Rudolph Apr 26 '21 at 13:40