On a relatively large 5000x5000 data table, a for loop over columns using set() is the fastest method I could find. The other methods I tried are taken from "Multiply rows of matrix by vector". The methods below are listed from fastest to slowest, though the last two are nearly indistinguishable at this scale.
## sample data
nr = 5000
nc = 5000
set.seed(47)
raw_matrix = matrix(rpois(nr * nc, lambda = 10), nrow = nr)
vec = rpois(nc, lambda = 2)
## For loop with set
# reset the data table
x = as.data.table(raw_matrix)
t0 = Sys.time()
for (col in seq_len(ncol(x))) set(x, j = col, value = x[[col]] * vec[col])
(set_time = Sys.time() - t0)
# Time difference of 0.151 secs
## Transpose and multiply
# reset the data table
x = as.data.table(raw_matrix)
t0 = Sys.time()
# as.data.table is used here because setDT does not work on a matrix
x <- as.data.table(t(t(x) * vec))
(transpose_time = Sys.time() - t0)
# Time difference of 0.614 secs
## Sweep
# reset the data table
x = as.data.table(raw_matrix)
t0 = Sys.time()
setDT(x <- sweep(x, MARGIN = 2, vec, "*"))
(sweep_time = Sys.time() - t0)
# Time difference of 1.81 secs
## Make Matrix method
# reset the data table
x = as.data.table(raw_matrix)
t0 = Sys.time()
setDT(x <- x * matrix(vec, dim(x)[1], length(vec), byrow = TRUE))
(make_matrix_time = Sys.time() - t0)
# Time difference of 1.88 secs
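Single Sys.time() differences like the ones above can be noisy, so it may be worth repeating each measurement. The following is only a sketch of one way to do that with base R's system.time(); the helper time_method(), the wrapper functions set_loop() and transpose_multiply(), and the smaller 500x500 size are my own assumptions for illustration, not part of the original benchmark.

```r
library(data.table)

# Hypothetical smaller inputs so that repeated runs finish quickly
nr <- 500
nc <- 500
set.seed(47)
raw_matrix <- matrix(rpois(nr * nc, lambda = 10), nrow = nr)
vec <- rpois(nc, lambda = 2)

# time_method() is a hypothetical helper: it runs f on a fresh data table
# `reps` times and returns the median elapsed time in seconds
time_method <- function(f, reps = 5) {
  elapsed <- vapply(seq_len(reps), function(i) {
    x <- as.data.table(raw_matrix)      # reset outside the timed region
    system.time(f(x))[["elapsed"]]
  }, numeric(1))
  median(elapsed)
}

# Two of the methods above, wrapped as functions of the data table
set_loop <- function(x) {
  for (col in seq_len(ncol(x))) set(x, j = col, value = x[[col]] * vec[col])
  invisible(x)
}
transpose_multiply <- function(x) as.data.table(t(t(x) * vec))

timings <- c(
  set_loop  = time_method(set_loop),
  transpose = time_method(transpose_multiply)
)
print(timings)
```

The absolute numbers will differ from the 5000x5000 timings above, so this is only useful for comparing methods against each other on the same machine.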
The set() method only works if you want to modify the original data table. If you instead want to keep the original and make a modified copy, Frank's suggested method works well. It is even slightly faster than modifying the original, though it will, of course, require more memory:
## Create modified copy
z <- setDT(Map(`*`, x, vec))
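
To make the difference between the two approaches concrete, here is a small sketch on a toy table. The column names and values are made up for illustration. It checks that the Map() version leaves x untouched, and that the set() loop then changes x itself to match the copy.

```r
library(data.table)

# Hypothetical toy inputs: 2 rows, 3 columns, one multiplier per column
x <- data.table(a = c(1, 2), b = c(3, 4), c = c(5, 6))
vec <- c(10, 100, 1000)
original <- copy(x)   # deep copy, kept only for comparison

# Modified copy: x itself is not changed
z <- setDT(Map(`*`, x, vec))
unchanged_after_map <- isTRUE(all.equal(as.matrix(x), as.matrix(original)))

# In-place version: x is overwritten column by column
for (col in seq_len(ncol(x))) set(x, j = col, value = x[[col]] * vec[col])
changed_after_set <- !isTRUE(all.equal(as.matrix(x), as.matrix(original)))

# Both approaches should produce the same values
same_result <- isTRUE(all.equal(as.matrix(x), as.matrix(z)))

print(c(unchanged_after_map, changed_after_set, same_result))
```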