
I've used the combination of R, doMPI, and foreach on a cluster for several years now, and increasing the number of simulation iterations has usually scaled roughly linearly in execution time. Recently I've been using the nested foreach loop below, and as I increase the number of simulations (NumSim) execution slows down dramatically, and I have no idea why. Any thoughts on how to diagnose this, or where to start looking?

As a test case, with 10 cores and everything else held the same:

NumSim = 10, time = 678 seconds
NumSim = 20, time = 1856 seconds
NumSim = 30, time = 3560 seconds
NumSim = 50, time = 7956 seconds

Based on previous work I would have expected NumSim = 50 to take roughly 678 * 5 ~ 3390 seconds.

results <- foreach(j = 1:NumSim, .combine = acomb) %:%
    ## Person Single Population
    foreach(i = 1:PopSize, .combine = rbind, .packages = c("zoo")) %dopar% {
        annual <- AnnualProbInf(WatCons, CrpPerLit, 1, 1, naf)
        daily  <- AnnualProbInf(WatCons, CrpPerLit, 365, 365, khf)
        immune <- AnnualProbInfImm(WatCons, CrpPerLit, 730, 730, khf, DayNonSus)
        cbind(annual, daily, immune)
    }
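
One way I could start narrowing this down (a minimal diagnostic sketch, not a fix) is to have each inner task record its own elapsed time and return it alongside the results, so I can see whether per-task compute time grows with NumSim or whether the extra time goes into scheduling and combining on the master. The task_secs column and the rbind combine here are only illustrative; the AnnualProbInf/AnnualProbInfImm calls and their inputs are assumed to be the same as in the loop above.

library(foreach)
library(doMPI)

cl <- startMPIcluster()   # typical doMPI setup; adjust to your launch script
registerDoMPI(cl)

## Same nested loop, but each task also reports how long it took, so
## per-task compute time can be separated from scheduling/combine overhead.
timed <- foreach(j = 1:NumSim, .combine = rbind) %:%
    foreach(i = 1:PopSize, .combine = rbind, .packages = c("zoo")) %dopar% {
        t0 <- proc.time()[["elapsed"]]
        annual <- AnnualProbInf(WatCons, CrpPerLit, 1, 1, naf)
        daily  <- AnnualProbInf(WatCons, CrpPerLit, 365, 365, khf)
        immune <- AnnualProbInfImm(WatCons, CrpPerLit, 730, 730, khf, DayNonSus)
        cbind(annual, daily, immune, task_secs = proc.time()[["elapsed"]] - t0)
    }

## If task_secs stays roughly flat as NumSim grows while total wall time
## grows faster than linearly, the slowdown is in task setup, communication,
## or result combination rather than in the simulation itself.
summary(timed[, "task_secs"])

closeCluster(cl)

If doMPI is already registered elsewhere in the script, the startMPIcluster/closeCluster lines would not be needed.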
  • I've been having a similar problem, so I ran a few tests. I got some help but not a solution to the problem yet: https://stackoverflow.com/questions/41925706/why-does-foreach-dopar-get-slower-with-each-additional-node – JustGettinStarted Feb 09 '17 at 23:39
  • Although my specific example uses doParallel, I have recently switched to MPI and see the same problem. – JustGettinStarted Feb 09 '17 at 23:39
  • What is the value of PopSize? Are you using 10 workers or 10 workers per node? Does AnnualProbInf return a single numeric value or a vector? – Steve Weston Feb 12 '17 at 16:16
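
As a quick way to answer the worker-count question in the last comment, foreach can report which backend is registered and how many workers it actually sees (this assumes the doMPI backend has already been registered as in the question):

library(foreach)

getDoParName()      # should report "doMPI" if that backend is registered
getDoParWorkers()   # number of workers foreach will actually use; expected 10 here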

0 Answers