What's the difference between using the "doParallel" package with type = MPI and using doMPI directly?

library(foreach)
library(doParallel)
library(Rmpi)  # provides mpi.universe.size()
cl <- makeCluster(mpi.universe.size(), type='MPI')
registerDoParallel(cl)
system.time(foreach(i = 1:3) %dopar% {Sys.sleep(i); i})

VS

library(doMPI)
cl <- startMPIcluster(count=2)
registerDoMPI(cl)
system.time(foreach(i = 1:3) %dopar% {Sys.sleep(i); i})
correocont

1 Answer

The "doParallel" package acts as a wrapper around the "clusterApplyLB" function, which, for an MPI cluster, is implemented by calling functions from the "Rmpi" package.
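
As a rough illustration of what that wrapping amounts to, the loop from the question can be written against "clusterApplyLB" directly. This is only a sketch, not doParallel's actual source. It assumes a local two-worker socket cluster in place of the MPI cluster so that it runs without an MPI installation, and it shortens the sleep times.

```r
library(parallel)

# Assumption: a local two-worker socket cluster stands in for the MPI cluster.
cl <- makeCluster(2)

# Load-balanced apply: a worker is handed the next task as soon as it is free.
res <- clusterApplyLB(cl, 1:3, function(i) { Sys.sleep(i / 10); i })

stopCluster(cl)
print(unlist(res))
```

With a cluster created by makeCluster(..., type='MPI'), the same call should go through "Rmpi" for the communication instead of sockets.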

The "doMPI" package uses "Rmpi" functions directly and includes some features that aren't available in "clusterApplyLB":

  • supports fetching inputs and combining outputs on-the-fly to efficiently handle a large number of loop iterations;

  • supports MPI broadcast to initialize workers;

  • allows workers to be started either by mpirun or MPI spawn function.
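
A minimal sketch of how some of this looks from the calling side, assuming a working MPI installation with "Rmpi" and "doMPI" installed. The "chunkSize" value and the loop body are arbitrary choices for illustration.

```r
library(doMPI)   # as in the question, this is expected to make foreach available

cl <- startMPIcluster(count = 2)   # spawn two workers, as in the question
registerDoMPI(cl)

# doMPI-specific options are passed through .options.mpi;
# chunkSize = 2 asks for two loop iterations to be sent per task.
opts <- list(chunkSize = 2)
res <- foreach(i = 1:6, .combine = c, .options.mpi = opts) %dopar% {
  i^2
}
print(res)

closeCluster(cl)
```

When the script is launched with mpirun rather than spawning workers, startMPIcluster() is normally called without a count, and the script usually ends with mpi.quit() after closeCluster(cl).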

Steve Weston
  • So, to use MPI in High-Performance Computing, the "doMPI" package is more efficient than doParallel? – correocont May 14 '18 at 08:34
  • @correocont I benchmarked doMPI against snow/Rmpi when I was developing it, with the goal of never being slower than snow/Rmpi and of being faster in certain cases. I think I achieved that, but my guess is that many, if not most, benchmarks will be a tie. – Steve Weston May 14 '18 at 14:31