4

I updated Open MPI to 3.0.0, reloaded Rmpi and doMPI, and now I get this error when executing startCluster on Ubuntu Linux with R 3.4.2:

Error in mpi.comm.spawn(slave = rscript, slavearg = args, nslaves = count,  : 
  MPI_ERR_SPAWN: could not spawn processes

How can I diagnose this problem?

Jim Maas
  • Start with simple MPI code. Make sure you can compile and run a simple MPI Hello World app. Maybe your Open MPI installation is "broken". – Oo.oO Oct 03 '17 at 14:50
  • Thanks mko. This is all new to me, but this works: `mpirun -np 6 mpi_hello_world` prints `Hello world from processor JAM-Home-PC, rank 1 out of 6 processors`, `... rank 5 out of 6 processors`, `... rank 2 out of 6 processors`, and so on. But this does not: `mpirun -np 7 mpi_hello_world` fails with `There are not enough slots available in the system to satisfy the 7 slots ... Either request fewer slots for your application, or make more slots available for use.` (see the note on slots after these comments) – Jim Maas Oct 03 '17 at 17:40
  • Take a look here: https://stackoverflow.com/questions/35704637/mpirun-not-enough-slots-available – Oo.oO Oct 03 '17 at 17:47
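
Regarding the "not enough slots" error in the comments above: by default, Open MPI 3.0 only offers as many slots on a host as it detects cores, so requesting more ranks than cores fails. A hedged sketch of two workarounds (my addition, not from the thread; ~/.hostfile is an arbitrary name): pass `--oversubscribe` to mpirun, or list extra slots in a hostfile, e.g.

# ~/.hostfile (assumed name): allow up to 8 ranks on this machine
localhost slots=8

and then run `mpirun --hostfile ~/.hostfile -np 7 mpi_hello_world`.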

1 Answer

2

To test your MPI installation, do the following:

/* Put this text inside a hello.c file */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    int rank;
    int world;

    MPI_Init(&argc, &argv);                 /* initialize the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &world);  /* total number of processes */
    printf("Hello: rank %d, world: %d\n", rank, world);
    MPI_Finalize();
    return 0;
}

Then compile it:

mpicc -o hello ./hello.c

and then try to run it:

mpirun -np 2 ./hello

If you get

Hello: rank 0, world: 2
Hello: rank 1, world: 2

then your MPI installation is fine and you have to look inside R; otherwise, MPI is not correctly configured and there is little chance of getting any further.
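
Since the call that actually fails inside Rmpi is mpi.comm.spawn, it is also worth testing dynamic process spawning directly; the Hello World above does not exercise it. Below is a minimal sketch of such a test (my addition, not part of the original answer; the file name spawn_test.c is arbitrary):

/* spawn_test.c (assumed name): checks MPI_Comm_spawn outside R.
 * If ./hello runs but this fails with MPI_ERR_SPAWN, the problem is in
 * Open MPI's dynamic process management rather than in Rmpi itself. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    MPI_Comm parent, children;

    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);   /* MPI_COMM_NULL unless we were spawned */

    if (parent == MPI_COMM_NULL) {
        /* We are the master: spawn two copies of this same binary. */
        MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                       0, MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);
        printf("master: spawn succeeded\n");
    } else {
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        printf("spawned child: rank %d\n", rank);
    }

    MPI_Finalize();
    return 0;
}

Compile it with `mpicc -o spawn_test ./spawn_test.c` and run it with `mpirun -np 1 ./spawn_test`. Note that spawned children count against the slot limit as well, which is exactly why the hostfile fix further down helps.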

Update

It looks like R 3.4 + Open MPI 3.0.0 + Rmpi misbehave ;)

If you try to run the slaves outside R, it works. So I guess there is some issue inside the native code of Rmpi.

> cp -r /Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rmpi ~
> cd ~/Rmpi
> mpirun -np 2 ./Rslaves.sh `pwd`/slavedaemon.R tmp needlog /Library/Frameworks/R.framework/Versions/3.4/Resources/
# if you put 
# localhost slots=25
# inside ~/.hostfile, you can acquire more resources
> mpirun --hostfile ~/.hostfile -np 4 ./Rslaves.sh `pwd`/slavedaemon.R tmp needlog /Library/Frameworks/R.framework/Versions/3.4/Resources/
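
(The paths above come from a macOS install; on Ubuntu, the Rmpi package directory typically sits under the R library path instead, for example /usr/lib/R/site-library/Rmpi (an assumed location; check the output of .libPaths() in R), so adjust the cp and mpirun lines accordingly.)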

Update with the proper fix for R 3.4 and Open MPI 3.0.0

Create the file ~/.openmpi/mca-params.conf and put inside:

orte_default_hostfile=YOUR_USER_HOME/default_host

Create the file ~/default_host with the content:

localhost slots=25
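
If you just want to try the setting before making it permanent, Open MPI also reads MCA parameters from the environment: a parameter named X maps to a variable named OMPI_MCA_X, so (as a per-session alternative, to the best of my knowledge) export OMPI_MCA_orte_default_hostfile=$HOME/default_host should have the same effect.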

Run R, load Rmpi, and run this code:

> library(Rmpi)
> mpi.spawn.Rslaves()
    4 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 5 is running on: pi
slave1 (rank 1, comm 1) of size 5 is running on: pi
slave2 (rank 2, comm 1) of size 5 is running on: pi
slave3 (rank 3, comm 1) of size 5 is running on: pi
slave4 (rank 4, comm 1) of size 5 is running on: pi
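
When you are finished, shut the slaves down cleanly with mpi.close.Rslaves() and leave R with mpi.quit() (both part of the Rmpi API); quitting without them can leave slave processes running.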

For the full story, take a look here: R3.4 + OpenMPI 3.0.0 + Rmpi inside macOS - little bit of mess ;)

Oo.oO