When I run this code in R, the plain loop and sapply are faster than snowfall's functions. What am I doing wrong? (I am using Windows 8.)

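library(snowfall)
library(rbenchmark)  # provides benchmark()
a <- 2
sfInit(parallel = TRUE, cpus = 4)
wrapper <- function(x) ((x * a)^2) / 3
sfExport("a")  # make `a` available on the worker processes
values <- seq(0, 100, 1)
benchmark(
  loop = for (i in seq_along(values)) wrapper(values[i]),  # index into values, not 1..n
  sapply = sapply(values, wrapper),
  sfLapply = sfLapply(values, wrapper),
  sfClusterApplyLB = sfClusterApplyLB(values, wrapper),
  replications = 100
)
sfStop()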

Elapsed time (in seconds) after 100 replications:

loop              0.05
sapply            0.07
sfClusterApplyLB  2.94
sfLapply          0.26
Daniel Gimenez

1 Answer

If the function sent to each worker node takes only a small amount of time, the overhead of parallelization (starting the workers, serializing the data, and sending the results back) makes the whole task take longer than running the job serially. Only when each job sent to the worker nodes takes a significant amount of time (at least several seconds) does parallelization show a real improvement in performance.
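
As a rough illustration of this trade-off, here is a minimal sketch that times the same computation with a cheap and an artificially slow task. It uses the base `parallel` package rather than snowfall so that it runs without extra installs. The names `cheap_task` and `slow_task`, the two-worker cluster, and the 0.2-second sleep are assumptions made for the example (the sleep is shorter than the "several seconds" mentioned above, just to keep the run short).

```r
library(parallel)  # ships with R; used here in place of snowfall

# Assumption: two local worker processes are enough to show the effect.
cl <- makeCluster(2)

# Hypothetical stand-ins for a cheap and an expensive per-element task.
cheap_task <- function(x) ((x * 2)^2) / 3
slow_task  <- function(x) {
  Sys.sleep(0.2)  # simulate real work; the duration is arbitrary
  ((x * 2)^2) / 3
}

values <- 1:8

# Cheap task: the time spent talking to the workers dominates.
t_cheap_serial   <- system.time(r1 <- lapply(values, cheap_task))[["elapsed"]]
t_cheap_parallel <- system.time(r2 <- parLapply(cl, values, cheap_task))[["elapsed"]]

# Slow task: the per-element work is large enough to outweigh the overhead.
t_slow_serial   <- system.time(r3 <- lapply(values, slow_task))[["elapsed"]]
t_slow_parallel <- system.time(r4 <- parLapply(cl, values, slow_task))[["elapsed"]]

stopCluster(cl)

# Compare the four timings; exact numbers depend on the machine.
print(c(cheap_serial   = t_cheap_serial,
        cheap_parallel = t_cheap_parallel,
        slow_serial    = t_slow_serial,
        slow_parallel  = t_slow_parallel))
```

On most machines the cheap task should be no faster (and often slower) in parallel, while the slow task should finish in roughly half the serial time with two workers. The same pattern should apply to `sfLapply` and `sfClusterApplyLB`, since snowfall also has to ship data to separate worker processes.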

Searching for "[r] parallel" will turn up at least 20 questions like yours, with more detail on what you can do to solve the problem.

Paul Hiemstra