Which one is faster, a for loop or lapply, and why? If there is a difference, maybe it depends on the data and the functions being used. If so, how? I've checked some examples:
lista <- list(a = 1:100, b = -20:500, c = 300:1000, rep(1000, 1000))
for (i in 1:10) { lista <- c(lista, lista) } # length(lista) == 4096
To compare them, I wrote these two functions:
loopfor <- function(x, fun) {
  ret <- vector("list", length(x))
  for (i in seq_along(x)) {
    ret[[i]] <- fun(x[[i]])
  }
  return(ret)
}
lapplyfun <- function(x, fun) {
  ret <- lapply(x, fun)
  return(ret)
}
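Before timing them, it seems worth checking that the two functions really return the same thing. The sketch below uses a small made-up list (`small` is only an illustration, not the benchmark data) and repeats the two definitions so it runs on its own. As far as I can tell, the only difference is that lapply keeps the names of the input while the loop version returns an unnamed list:

```r
# Small hypothetical input, used only to compare results (not for timing)
small <- list(a = 1:10, b = c(2.5, 3.5), 100)

loopfor <- function(x, fun) {
  ret <- vector("list", length(x))
  for (i in seq_along(x)) {
    ret[[i]] <- fun(x[[i]])
  }
  ret
}

lapplyfun <- function(x, fun) lapply(x, fun)

res_loop   <- loopfor(small, sum)
res_lapply <- lapplyfun(small, sum)

# lapply() carries the names of x over to its result; the loop version does not
names(res_lapply)
names(res_loop)

# Once the names are dropped, the computed values are the same
identical(unname(res_lapply), res_loop)
```

So the benchmark compares the same computation, apart from the name handling.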
Comparing calls to loopfor and lapplyfun
For the sum function, lapply is the winner (roughly 2.3 times faster by median):
library(microbenchmark)
microbenchmark(loopfor(lista, sum), lapplyfun(lista, sum), times = 100)
Unit: milliseconds
                  expr       min        lq    median        uq      max neval
   loopfor(lista, sum) 20.496391 21.058436 21.423077 22.309260 50.80541   100
 lapplyfun(lista, sum)  8.745445  9.007782  9.342844  9.777506 15.15932   100
But for a more complex function like summary, the difference is really small:
microbenchmark(loopfor(lista, summary), lapplyfun(lista, summary), times = 10)
Unit: seconds
                      expr      min       lq   median       uq      max neval
   loopfor(lista, summary) 2.147071 2.164275 2.186433 2.228169 2.342094    10
 lapplyfun(lista, summary) 2.024157 2.099712 2.198469 2.314902 2.550751    10
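As a rough sanity check, the medians reported above can be turned into a cost per list element. This is only back-of-the-envelope arithmetic on those figures, and it assumes (my assumption, not something I measured) that the extra cost of the loop is the same no matter which function is called:

```r
n <- 4096  # number of list elements in lista

# Medians copied from the two benchmarks above, all converted to milliseconds
med_ms <- c(loopfor_sum     = 21.423077,
            lapply_sum      = 9.342844,
            loopfor_summary = 2186.433,
            lapply_summary  = 2198.469)

# Average time per element, in microseconds
per_call_us <- med_ms * 1000 / n

# Extra time per element spent by the loop version in the sum benchmark
overhead_us <- unname(per_call_us["loopfor_sum"] - per_call_us["lapply_sum"])

# That overhead as a share of one summary() call in the loop version
overhead_share <- overhead_us / unname(per_call_us["loopfor_summary"])

round(per_call_us, 2)
round(overhead_us, 2)
round(100 * overhead_share, 2)  # in percent
```

If that assumption holds, the loop costs about 3 microseconds more per element, while one summary call takes over 500 microseconds, so the loop overhead would be well under 1% of the total and easily hidden by measurement noise.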
Any explanations or ideas? Maybe loopfor should be written differently to improve its performance? :)