Parallel R is slower than serial?

Question

I am using R's parallel package to run a parallel computation on my laptop:

> library(parallel)
> x = matrix(rep(1,2000), nrow=2)
> cl <- makeCluster(getOption("cl.cores", 8))
> system.time(replicate(5000, parApply(cl, x, 1, paste, collapse="-")))
   user  system elapsed 
  7.950   0.966  13.562 
> stopCluster(cl)
> system.time(replicate(5000, apply(x, 1, paste, collapse="-")))
   user  system elapsed 
  8.357   0.001   8.355

Did I make a mistake here? The only thing I am not sure about is how to use makeCluster.

Update: to reduce the overhead of parallelization, I used a much bigger matrix x and removed replicate from the benchmark. Still, the difference between parallel and serial is marginal, and sometimes parallel is slower.
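
For reference, the updated benchmark might look roughly like the sketch below. The matrix size (2 x 500,000) and the worker count of 2 are assumptions, since the update does not state them.

```r
library(parallel)

# Assumed size: 2 rows x 500,000 columns (the actual size is not given).
x <- matrix(rep(1, 1e6), nrow = 2)

cl <- makeCluster(2)  # assumed worker count
# Single call, no replicate(): the whole matrix is split by row and sent to the workers.
par_time <- system.time(par_res <- parApply(cl, x, 1, paste, collapse = "-"))
stopCluster(cl)

# Same computation in the current R process.
ser_time <- system.time(ser_res <- apply(x, 1, paste, collapse = "-"))

# Compare elapsed wall-clock time for the two variants.
print(c(parallel = par_time[["elapsed"]], serial = ser_time[["elapsed"]]))
```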

Parallel isn't always faster. It takes time to create a cluster and put it all back together. If the operation is a bunch of very small jobs, parallel processing may be slower. — Tyler Rinker, Apr 12 '13 at 03:17
Remember [Amdahl's law](http://en.wikipedia.org/wiki/Amdahl%27s_law). — Michael Hoffman, Apr 12 '13 at 03:33
@TylerRinker: as I mentioned in the update, parallelization here doesn't make a difference even for a single time-consuming job. I just want to make sure that I didn't make a mistake here and, if not, what factors could explain it. — RNA, Apr 12 '13 at 03:51
@RNA: your update _increases_ the cost of running in parallel because you're increasing the amount of data sent to each node. Running on multiple CPUs is only beneficial if the problem is CPU-intensive. — Joshua Ulrich, Apr 12 '13 at 11:21
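
To illustrate the comments above: parApply over the rows of a two-row matrix produces only two tasks, so at most two of the eight workers have anything to do, and each call still ships the data to the workers. A minimal sketch of the opposite case, where the input is tiny and the computation dominates, is below. slow_task, the input sizes, and the worker count of 2 are hypothetical, chosen only for illustration.

```r
library(parallel)

# Hypothetical CPU-heavy task: almost no data in, a long loop per call.
slow_task <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + sqrt(i)
  s
}

inputs <- rep(2e6, 8)  # eight scalar inputs, so very little is sent to each worker

cl <- makeCluster(2)   # assumed worker count; adjust to the cores available
par_time <- system.time(par_res <- parLapply(cl, inputs, slow_task))
stopCluster(cl)

ser_time <- system.time(ser_res <- lapply(inputs, slow_task))

# Both variants should return the same values; only the elapsed time should differ.
stopifnot(isTRUE(all.equal(par_res, ser_res)))
print(c(parallel = par_time[["elapsed"]], serial = ser_time[["elapsed"]]))
```

On a multi-core machine the parallel elapsed time would typically be lower here, but no speedup is guaranteed; it depends on the core count and on what else the machine is doing.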
