Most of the points have been made before, but...

1. sapply() uses lapply() and then pays a one-time cost of formatting the result using simplify2array().

2. lapply() creates a long vector, and then a large number of short (length 1) vectors, whereas the for loop generates a single long vector.

3. The sapply() as written has an extra function call compared to the for loop.

4. Using gcinfo(TRUE) lets us see the garbage collector in action, and each approach results in the garbage collector running several times -- this can be quite expensive, and not completely deterministic.
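A minimal sketch of points 1 and 4 (using a small vector for readability): sapply() is essentially lapply() followed by simplify2array(), and gcinfo(TRUE) makes each garbage collection print a message:

x <- 1:5
res <- lapply(x, exp)                            # list of length-1 numeric vectors
identical(simplify2array(res), sapply(x, exp))   # TRUE: the one-time formatting step
old <- gcinfo(TRUE)                              # report each garbage collection
invisible(sapply(numeric(10^6), exp))            # GC messages typically appear here
gcinfo(old)                                      # restore the previous setting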
Points 1 - 3 need to be interpreted in the artificial context of the example -- exp() is a fast function, exaggerating the relative contribution of memory allocation (2), function evaluation (3), and one-time costs (1). Point 4 emphasizes the need to replicate timings in a systematic way.
I started by loading the compiler and microbenchmark packages, and focused on the largest size only.
library(compiler)
library(microbenchmark)
n <- 10^7
In my first experiment I replaced exp() with simple assignment, and tried different ways of representing the result in the for loop -- a vector of numeric values, or a list of numeric vectors as implied by lapply().
fun0n <- function(n) {            # for loop filling a pre-allocated numeric vector
    Y1 <- numeric(n)
    for (j in seq_len(n)) Y1[j] <- 1
}
fun0nc <- compiler::cmpfun(fun0n) # byte-compiled version

fun0l <- function(n) {            # for loop filling a pre-allocated list
    Y1 <- vector("list", n)
    for (j in seq_len(n)) Y1[[j]] <- 1
}
fun0lc <- compiler::cmpfun(fun0l)
microbenchmark(fun0n(n), fun0nc(n), fun0lc(n), times=5)
## Unit: seconds
## expr min lq mean median uq max neval
## fun0n(n) 5.620521 6.350068 6.487850 6.366029 6.933915 7.168717 5
## fun0nc(n) 1.852048 1.974962 2.028174 1.984000 2.035380 2.294481 5
## fun0lc(n) 1.644120 2.706605 2.743017 2.998258 3.178751 3.187349 5
So it pays to compile the for loop, and there's a fairly substantial cost to generating a list of vectors. Again, this memory cost is amplified by the simplicity of the body of the for loop.
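A rough way to see that cost (sizes are approximate, for a 64-bit build): each element of the list is a length-1 vector carrying its own header, which a single long vector pays only once.

n1 <- 10^5
object.size(numeric(n1))                    # ~0.8 MB: one long vector
object.size(lapply(numeric(n1), identity))  # ~6.4 MB: 10^5 short vectors plus the list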
My next experiment explored the cost of different *apply() functions, again using a trivial function body.
fun2s <- function(n)
    sapply(raw(n), function(i) 1)
fun2l <- function(n)
    lapply(raw(n), function(i) 1)
fun2v <- function(n)
    vapply(raw(n), function(i) 1, numeric(1))
microbenchmark(fun2s(n), fun2l(n), fun2v(n), times=5)
## Unit: seconds
## expr min lq mean median uq max neval
## fun2s(n) 4.847188 4.946076 5.625657 5.863453 6.130287 6.341282 5
## fun2l(n) 1.718875 1.912467 2.024325 2.141173 2.142004 2.207105 5
## fun2v(n) 1.722470 1.829779 1.847945 1.836187 1.845979 2.005312 5
There is a large cost to the simplification step in sapply(); vapply() is more robust than lapply() (I am guaranteed the type of the return) without a performance penalty, so it should be my go-to function in this family.
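A small illustration of that guarantee, with a deliberately type-unstable function (f is a made-up example): sapply() silently coerces the result, while vapply() fails loudly:

f <- function(i) if (i > 2) "oops" else i   # returns character for i > 2
sapply(c(1, 2, 3), f)                       # silently coerces: "1" "2" "oops"
try(vapply(c(1, 2, 3), f, numeric(1)))      # error: FUN(X[[3]]) result is type 'character'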
Finally, I compared the for iteration, with the result stored as a list-of-vectors, to vapply().
fun1 <- function(n) {
    Y1 <- vector("list", n)
    for (j in seq_len(n)) Y1[[j]] <- exp(0)
}
fun1c <- compiler::cmpfun(fun1)

fun3 <- function(n)
    vapply(numeric(n), exp, numeric(1))
fun3fun <- function(n)
    vapply(numeric(n), function(i) exp(i), numeric(1))
microbenchmark(fun1c(n), fun3(n), fun3fun(n), times=5)
## Unit: seconds
## expr min lq mean median uq max neval
## fun1c(n) 2.265282 2.391373 2.610186 2.438147 2.450145 3.505986 5
## fun3(n) 2.303728 2.324519 2.646558 2.380424 2.384169 3.839950 5
## fun3fun(n) 4.782477 4.832025 5.165543 4.893481 4.973234 6.346498 5
microbenchmark(fun1c(10^3), fun1c(10^4), fun1c(10^5),
               fun3(10^3), fun3(10^4), fun3(10^5),
               times=50)
## Unit: microseconds
## expr min lq mean median uq max neval
## fun1c(10^3) 199 215 230 228 241 279 50
## fun1c(10^4) 1956 2016 2226 2296 2342 2693 50
## fun1c(10^5) 19565 20262 21671 20938 23410 24116 50
## fun3(10^3) 227 244 254 254 264 295 50
## fun3(10^4) 2165 2256 2359 2348 2444 2695 50
## fun3(10^5) 22069 22796 23503 23251 24393 25735 50
The compiled for loop and vapply() are neck and neck; the extra function call almost doubles the execution time of vapply() (again, this effect is exaggerated by the simplicity of the example). There does not seem to be much change in relative speed across a range of sizes.
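As a rough check on that last point, dividing the median timings in the table above by n gives an approximately constant per-element cost for both approaches:

c(228, 2296, 20938) / c(10^3, 10^4, 10^5)   # fun1c: ~0.23, 0.23, 0.21 microseconds per element
c(254, 2348, 23251) / c(10^3, 10^4, 10^5)   # fun3:  ~0.25, 0.23, 0.23 microseconds per element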