The existing mapply approach among the other answers looks great, but I believe we can achieve more efficiency if we use Map + list2DF instead (especially if you prefer to stay within base R).
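The snippets below assume df is a numeric data frame and pw2 is a numeric vector with one multiplier per column, as in the question; since the original objects aren't reproduced here, a hypothetical stand-in could be:

# hypothetical example data (not the question's actual df/pw2):
# a numeric data frame and one multiplier per column
set.seed(1)
df  <- data.frame(a = rnorm(100), b = rnorm(100), c = rnorm(100))
pw2 <- 2^(1:3)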
Below is a benchmark of the mapply and Map variants:
library(microbenchmark)

microbenchmark(
  "mapply1" = data.frame(mapply(FUN = `*`, df, pw2)),
  "mapply2" = as.data.frame(mapply(FUN = `*`, df, pw2)),
  "Map1" = list2DF(Map(`*`, df, pw2)),
  "Map2" = list2DF(Map(`*`, df, as.list(pw2)))
)
gives
Unit: microseconds
    expr  min    lq    mean median     uq   max neval
 mapply1 74.6 78.60 112.163  97.05 140.50 342.6   100
 mapply2 34.6 38.20  55.513  42.70  67.40 313.5   100
    Map1 23.8 25.25  33.728  27.60  41.30 113.8   100
    Map2 25.9 28.75  40.866  32.95  47.65 238.6   100
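The gap is largely because mapply (with its default SIMPLIFY = TRUE) first simplifies the result to a matrix, which data.frame()/as.data.frame() then has to convert back, while Map returns a plain list that list2DF wraps with very little overhead. As a quick sanity check (a sketch on the hypothetical data above), the variants should agree on the resulting values:

# the mapply and Map variants should produce the same data frame values
all.equal(
  as.data.frame(mapply(FUN = `*`, df, pw2)),  # "mapply2"
  list2DF(Map(`*`, df, pw2))                  # "Map1"
)
# expected to return TRUE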
Also, let the Map approach join the benchmarking party provided by @Maël:
bc <- bench::mark(
  sweep = sweep(df, 2, pw2, `*`),
  col = df * pw2[col(df)],
  "%*%" = setNames(
    as.data.frame(as.matrix(df) %*% diag(pw2)),
    names(df)
  ),
  TRA = collapse::TRA(df, pw2, "*"),
  mapply1 = data.frame(mapply(FUN = `*`, df, pw2)),
  mapply2 = as.data.frame(mapply(FUN = `*`, df, pw2)),
  Map1 = list2DF(Map(`*`, df, pw2)),
  Map2 = list2DF(Map(`*`, df, as.list(pw2))),
  apply = t(apply(df, 1, \(x) x * pw2)),
  t = t(t(df) * pw2),
  check = FALSE  # results differ in class (data.frame vs. matrix), so skip equality checks
)
and we will see that the Map approach takes second place in terms of efficiency, behind only collapse::TRA:
# A tibble: 10 × 13
   expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>
 1 sweep       201.7µs  249.2µs     3526.  101.24KB     12.6  1680     6
 2 col         174.9µs  225.6µs     3637.    9.02KB     10.4  1748     5
 3 %*%          45.4µs   52.9µs    17026.   36.95KB     12.5  8158     6
 4 TRA           3.4µs    3.8µs   226089.  905.09KB     22.6  9999     1
 5 mapply1      71.6µs   78.4µs    11958.      480B     14.7  5681     7
 6 mapply2      33.1µs   37.4µs    25339.      480B     17.7  9993     7
 7 Map1         22.5µs   26.1µs    35649.        0B     17.8  9995     5
 8 Map2         25.3µs   29.4µs    31785.        0B     19.1  9994     6
 9 apply        70.2µs   80.7µs    11684.   11.91KB     14.7  5562     7
10 t            34.8µs   40.2µs    23608.    3.77KB     14.2  9994     6
# ℹ 5 more variables: total_time <bch:tm>, result <list>, memory <list>,
#   time <list>, gc <list>
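If you prefer relative numbers, bench can also summarise the results relative to the best value in each column (a small optional addition, output omitted):

# report timings and memory relative to the fastest expression (collapse::TRA here)
summary(bc, relative = TRUE)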
Finally, autoplot(bc) shows the distribution of the individual timings for each expression.

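The plotting call itself needs ggplot2 attached, since autoplot() is a ggplot2 generic for which bench registers a method:

library(ggplot2)
# plot the distribution of raw timings per benchmarked expression
autoplot(bc)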