
In R, I am using the package foreach with doMPI in a wrapper script to run an external model many times in parallel on a cluster. Each MPI process gets one parameter point for which to execute the model.

However, to run this, there's also a bit of pre- and post-processing -- making some folders first, and aggregating the results at the end. This is also parallelisable, but not with the same number of jobs as the main model runs.

The way I've handled it is by using several consecutive foreach loops in the script: first one that makes the folders and then, once that has finished, another that runs the model. This is where, despite consulting the documentation, I am a little green on how the doMPI package works in detail, and on how MPI works more generally, I guess: am I guaranteed that all MPI processes in loop 1 finish before any work is done in loop 2? The script logic depends on this. If not, are there any MPI commands I could use to enforce the behaviour I want? Does it even make sense to close and reopen the cluster between the loops, or is that stupid? Like,

foreach (i1=1:N1) %dopar% {
    # body of loop 1: pre-processing (make the folders)
}

# Stop the MPI cluster and start it again:
closeCluster(cl)
cl = startMPIcluster()
registerDoMPI(cl)

foreach (i2=1:N2) %dopar% {
    # body of loop 2: run the external model
}

Thanks!

  • "Am I guaranteed that all MPI processes in loop 1 finish before any work is done in loop 2?" Yes. (I realize this is pseudo-code but you are not assigning the return values of the `foreach` loops. You should do so. Not doing that hints at you not understanding `foreach`, in particular regarding the fact that there can't be side effects of the loop body, except writing to files.) – Roland May 16 '22 at 08:00
  • Thanks for your reply! Great to hear that the expected behaviour is guaranteed. I do in fact know that foreach returns values, and in my script they are assigned -- sloppy of me to not include that here. What do you mean by "there can't be side effects of the loop body"? Cheers – Jørgen Eriksson Midtbø May 16 '22 at 10:16
  • You can't plot (except to a file device), you can't assign into the global environment, ... – Roland May 16 '22 at 10:59
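
To make the points from the comments concrete, here is a minimal sketch of two consecutive loops on a single cluster, without closing and reopening it in between. It relies on `foreach` returning only once the combined result of every task has been collected, so loop 2 cannot start before loop 1 is done. The names `n_points` and `run_dirs`, the `runs` directory and the stand-in "model" (squaring the index) are assumptions made up for illustration, not part of the original script, and the script is assumed to be launched through MPI (for example with `mpirun`) on a file system shared by all workers.

```r
library(foreach)
library(doMPI)

# One cluster, reused by both loops
cl <- startMPIcluster()
registerDoMPI(cl)

n_points <- 8                                             # hypothetical number of parameter points
run_dirs <- file.path("runs", sprintf("run_%03d", seq_len(n_points)))  # hypothetical folder layout

# Loop 1: pre-processing. The only side effect is on the file system;
# each task returns a flag that the master can check afterwards.
created <- foreach(d = run_dirs, .combine = c) %dopar% {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
  dir.exists(d)
}

# This line is reached only after every task of loop 1 has returned.
stopifnot(all(created))

# Loop 2: model runs. Results come back as return values, not through
# assignments into the master's global environment.
results <- foreach(i = seq_len(n_points), d = run_dirs, .combine = rbind) %dopar% {
  value <- i^2                                            # stand-in for the external model
  writeLines(as.character(value), file.path(d, "result.txt"))
  data.frame(point = i, value = value)
}

closeCluster(cl)
# In a batch script, mpi.quit() would typically follow here to shut down MPI and exit R.
```

Both loops pass everything the body needs as iteration variables (`d`, `i`), which sidesteps any question of which objects get exported to the workers.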
