
On the Windows platform, it seems that a high iterator value stops foreach in its tracks. For example:

library(doParallel)
library(foreach)
cl <- makeCluster(7)
registerDoParallel(cl)
bayes_results <- foreach(n = 1:100) %dopar% { n }

This works just fine in a few seconds.

However, increasing the value to a few million stops foreach from working, even after a wait of several hours.

bayes_results <- foreach(n = 1:5000000) %dopar% { n }

What is the problem? How could it be solved?

Thank you.

  • You should parallelize CPU-intensive operations. Otherwise, the overhead removes any advantage of parallelization. Furthermore, this is still an R loop with many iterations. At best, parallelization can reduce computation time by about a factor of 7 for you. Using vectorized code instead of R loops can usually achieve orders of magnitude (see the sketch after these comments). – Roland Dec 25 '14 at 15:22
  • This isn't the same question as https://stackoverflow.com/questions/14614306. That one is concerned generally with why foreach can be slower. This one is about the specific problem that when the number of items to loop over is large, foreach can take a very long time initializing before it ever starts the parallel processing. – Andrew Schulman Apr 15 '19 at 10:11
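
To illustrate Roland's point about vectorization, here is a rough sketch (sqrt is a stand-in for the real per-element work, and timings will vary by machine):

# Vectorized: one call over the whole vector, typically a fraction of a second
system.time(x1 <- sqrt(1:5000000))

# Element-by-element R loop doing the same work: often orders of magnitude slower
system.time(x2 <- vapply(1:5000000, sqrt, numeric(1)))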

1 Answer


You usually don't want the number of tasks to be orders of magnitude larger than the number of workers unless perhaps the individual tasks take hours to run. The itertools package has functions that can help control the number of tasks. For example, you can use the isplitVector function to split the input vector so there is exactly one task per worker:

library(itertools)
cores <- getDoParWorkers()  # number of registered workers (7 here)
r <- foreach(nv = isplitVector(1:5000000, chunks = cores)) %dopar% { nv }

Of course, you'll usually have to modify the body of the loop, but now the loop runs in less than a second on my machine, as you would expect.
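
For instance, here is a sketch of that change (sqrt is a made-up body, not from the question): a per-element loop becomes a per-chunk loop whose body operates on the whole vector nv.

# Per-element version: 5,000,000 tiny tasks, dominated by setup overhead
# r <- foreach(n = 1:5000000, .combine = 'c') %dopar% { sqrt(n) }

# Per-chunk version: one task per worker; each body receives a vector
r <- foreach(nv = isplitVector(1:5000000, chunks = cores), .combine = 'c') %dopar% {
  sqrt(nv)  # vectorized over the entire chunk
}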

If the tasks take very different lengths of time to execute, you can enable better load balancing by splitting the problem into more chunks:

r <- foreach(nv = isplitVector(1:5000000, chunks = 10 * cores)) %dopar% { nv }

Splitting up the work like this may also allow you to parallelize the combining of results, which can further improve performance.
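
For example (a hypothetical reduction, assuming the goal were a sum), each worker can combine its own chunk locally, so the master only merges one value per chunk:

total <- foreach(nv = isplitVector(1:5000000, chunks = 10 * cores), .combine = '+') %dopar% {
  sum(nv)  # each worker reduces its chunk before returning it
}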

Steve Weston