
I used to do parallel computation with doMC and foreach, and I now have access to a cluster. My problem is similar to this one, Going from multi-core to multi-node in R, but there is no answer on that post.

Basically I can request a number of tasks -n and a number of cores per task -c from my batch queuing system. I do manage to use doMPI to run parallel simulations across the number of tasks I request, but I now want to use the maxcores option of startMPIcluster so that each MPI process uses multicore functionality.

Something I have noticed is that parallel::detectCores() does not seem to see how many cores I have been allocated and returns the maximum number of cores of a node.
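For reference, here is a minimal sketch of reading the batch system's allocation from the environment instead of relying on detectCores(); it assumes a Slurm-style scheduler that exports SLURM_CPUS_PER_TASK (an assumption on my part, not something taken from my submission script):

# Sketch: read the per-task core allocation from the scheduler rather than
# from the hardware. SLURM_CPUS_PER_TASK is assumed to be set by a
# Slurm-like batch system; fall back to 1 core if it is not set.
ncore <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "1"))
if (is.na(ncore)) ncore <- 1L
ncore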

For now I have tried:

ncore = 3  # same value as passed with the -c option
library(Rmpi)
library(doMPI)
cl <- startMPIcluster(maxcores = ncore)
registerDoMPI(cl)
## now some parallel simulations
foreach(icount(10), .packages = c('foreach', 'iterators', 'doParallel')) %dopar% {
    ## here I'd like each simulation of `myfun()` to use `ncore` cores
    registerDoParallel(cores = ncore)
    myfun()
}

(myfun does indeed contain a foreach loop) but if I set ncore > 1 then I get an error:

Error in { : task 1 failed - "'mckill' failed"

thanks

EDIT

The machine I have access to is http://www-ccrt.cea.fr/fr/moyen_de_calcul/airain.htm, where it is specified (translated from French): "MPI libraries: BullxMPI, a Bull MPI distribution optimised for and compatible with OpenMPI".

  • I'm guessing you're using Slurm. Are you submitting the job via sbatch? How are you executing the R script? Are you using mpirun or srun? – Steve Weston Feb 18 '16 at 20:13
  • Yes, I've seen this word (Slurm) somewhere (I'm definitely not an expert in HPC). I'm executing the script with something like "ccc_mprun -n 10 -c 5 R CMD BATCH --vanilla myscript.R". I was using mpirun first, but then someone told me to use ccc_mprun instead. I've also tried in interactive mode, but even without this "double" parallelisation I don't manage to use the resources supplied with -c – ClementWalter Feb 19 '16 at 00:14
  • Also, I've seen your comment on another related post about Rmpi::mpi.universe.size(), which in my case only returns the value given with -n and not the product -n * -c – ClementWalter Feb 19 '16 at 00:18
  • Well, my problem is indeed a pure `doMC` problem. While I have already done this on multicore computers (not clusters), I don't understand why it doesn't work here. I've seen some related questions about why R does not run in parallel on some Linux distributions; I'll try to dig a bit in this direction and close this question. – ClementWalter Feb 19 '16 at 14:09
  • It's possible that your MPI implementation doesn't like the workers to fork processes, which is done by the mclapply function used by doMC. It seems odd that you're getting an error when mclapply tries to kill the forked child processes. I've never seen that. – Steve Weston Feb 19 '16 at 14:21
  • Well, I've moved a bit forward, and after some other test cases it appears that PSOCK clusters as well as forking do work, even though R does not see how many cores I've requested in my batch submission script. It appears that the problem comes from `myfun()`, which runs well in parallel with an MPI backend but not with a SNOW or multicore one. Is it possible that some calls to `local` or `<<-` cause trouble? – ClementWalter Feb 22 '16 at 13:23

1 Answer


You are trying to use a lot of different concepts at the same time: you are using an MPI-based cluster to launch worker processes on different computers, while also trying to use multi-core processing within each of them. This makes things needlessly complicated.

The cluster you are using is probably spread out over multiple nodes. You need some way to transfer data between these nodes if you want to do parallel processing.

In comes MPI. It is a way to easily connect workers on different machines, without needing to specify IP addresses or ports. And this is indeed why you want to launch your process using mpirun or ccc_mprun (which is probably a script that adds some extra arguments for your specific cluster).

How do we now use this system in R? (see also https://cran.r-project.org/web/packages/doMPI/vignettes/doMPI.pdf)

Launch your script using mpirun -n 24 R --slave -f myScriptMPI.R to start 24 worker processes. The cluster management system decides where to place these processes: it might launch all 24 of them on the same (powerful) machine, or it might spread them over 24 different machines, depending on things like workload, available resources, available memory, machines currently in sleep mode, etc.
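As a sanity check, here is a hedged sketch of what the top of myScriptMPI.R could print to confirm how many MPI processes were actually launched; both functions come from Rmpi, and the printed message is just an example, not part of the original script:

library(Rmpi)

# Every MPI rank runs this script. Print this rank's position in
# MPI_COMM_WORLD (communicator 0 in Rmpi) and the MPI universe size,
# so you can check that the launch matched your -n request.
cat("rank", mpi.comm.rank(0),
    "of", mpi.comm.size(0),
    "- universe size:", mpi.universe.size(), "\n")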

The above command launches 24 copies of myScriptMPI.R, possibly spread across different machines. How do they now collaborate?

library(doMPI)
cl <- startMPIcluster() 
#the "master" process will continue
#the "worker" processes will wait here until receiving work from the master
registerDoMPI(cl)
## now some parallel simulations
foreach(icount(24), .packages = c('foreach', 'iterators'), .export='myfun') %dopar% {
    myfun()
}

Your data will get transferred automatically from master to workers using the MPI protocol.

If you want more control over your allocation, including making "nested" MPI clusters for multicore vs. inter-node parallelization, I suggest you read the doMPI vignette.
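For illustration only, here is a hedged sketch of one way to keep the nested simulation structure inside a single doMPI backend, using foreach's %:% nesting operator so that every (simulation, replicate) pair becomes one MPI task; nsim and mysim() are placeholders, not names from the question or the vignette:

library(foreach)
library(iterators)
library(doMPI)

cl <- startMPIcluster()   # one worker per MPI process started by mpirun
registerDoMPI(cl)

nsim <- 10                # placeholder: number of simulations
# %:% merges the two loops into a single stream of tasks that the
# registered doMPI backend spreads over all MPI workers, so no second
# (multicore) backend is needed inside the workers.
res <- foreach(s = icount(nsim)) %:%
  foreach(r = icount(4), .combine = 'c') %dopar% {
    mysim(s, r)           # placeholder for one replicate of simulation s
  }

closeCluster(cl)
mpi.quit()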

  • Thanks for your answer, but my point was precisely to use multicore on each node. Say I want to make some simulations, and for each simulation I want to use parallel computing. I thought of using a number of nodes equal to my number of simulations, and then performing each simulation with parallel multicore – ClementWalter Apr 14 '16 at 09:57
  • In any case, cluster management systems like SLURM will never allow you to "go behind their back" and take more CPUs than your process was assigned. Use doMPI for everything; do not mix-and-match parallel processing libraries. The solution is to launch your MPI process with requirements (e.g. on 4 hosts with 4 CPUs per host) and create sub-steps using nested clusters. The doMPI vignette explains this rather well. – parasietje Apr 15 '16 at 06:37