I'm writing a function that needs to call a function g, passed as a parameter, on each element of a list, iteratively.
I'm wondering how to make this as fast as possible. I can achieve acceptable speed using Rcpp and a specific kind of g (writing everything in C++), but I can't figure out whether I can reach similar speed when passing an R function as the argument.
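Roughly, the shape I have in mind looks like the sketch below. The names apply_g, x and g are placeholders for illustration only, and this is a plain-R version, not the Rcpp one:

```r
# Hypothetical sketch: apply_g is a placeholder name, not an existing function.
# It calls g on each element of the list x, one element at a time.
apply_g <- function(x, g) {
  out <- vector("list", length(x))  # preallocate the result list
  for (i in seq_along(x)) {
    out[[i]] <- g(x[[i]])           # one R-level function call per element
  }
  out
}

# Example call with a sign-inverting function as g
res <- apply_g(list(1, 2, 3), function(v) -v)
```

The per-element call to g inside the loop is the part I'd like to make cheap.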
I ran some tests to figure out why R is slower and found some really unexpected results. First, a helper and its vectorized version:
minus <- function(x) -x
minus_vec <- Vectorize(minus, "x")
I then tested some simple functions that invert signs:
f0 <- function(x) {
  sapply(x, minus)
}
f1 <- function(x) {
  for (i in seq_along(x)) {
    x[i] <- -x[i]
  }
  x
}
f2 <- function(x) {
  for (i in seq_along(x)) {
    x[i] <- minus(x[i])
  }
  x
}
I got the following results:
a <- 1:10^5
library(rbenchmark)
benchmark(f0(a), f1(a), f2(a), minus_vec(a), minus(a))[,c(1,4)]
          test relative
1        f0(a)  454.842
2        f1(a)   25.579
3        f2(a)  178.211
4 minus_vec(a)  523.789
5     minus(a)    1.000
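As a sanity check, the sketch below confirms that all five variants return the same values, so the timings compare equivalent work. It restates the definitions above so that it runs on its own, and it uses a smaller input; the name small is just for illustration.

```r
# Definitions restated from above so the check is self-contained
minus <- function(x) -x
minus_vec <- Vectorize(minus, "x")
f0 <- function(x) sapply(x, minus)
f1 <- function(x) {
  for (i in seq_along(x)) x[i] <- -x[i]
  x
}
f2 <- function(x) {
  for (i in seq_along(x)) x[i] <- minus(x[i])
  x
}

small <- 1:100          # illustrative input, smaller than the benchmarked one
expected <- minus(small)

# Each variant should produce exactly the same integer vector
stopifnot(
  identical(f0(small), expected),
  identical(f1(small), expected),
  identical(f2(small), expected),
  identical(minus_vec(small), expected)
)
```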
I would like some explanation on the following points:
1. Why don't f1 and f2 have the same speed? Should writing -x[i] inline and calling the function minus(x[i]) really be so different when they do exactly the same thing?
2. Why is f0 slower than f2? I always thought the apply functions were more efficient than for loops, but I never really understood why, and now I've even found a counter-example.
3. Can I make a function as fast as f1 using the function minus?
4. Why does vectorizing minus (unnecessary here since - is already vectorized, but that might not always be the case) make it so slow?