How do I find the number of cores available to MPI(4PY)?


Motivation

My Python program spawns MPI instances hierarchically.

The first spawn always happens and creates 4 instances. It doesn't make sense to increase this number due to the structure of my computations, so I hardcoded it.

Depending on the command-line options passed to the main program, each of the 4 instances then calls external Python software that scales almost linearly.

I call this external software using

N = 3
child = MPI.COMM_SELF.Spawn(sys.executable, args=['external.py'], maxprocs=N)

At the moment, I use N=3 so that the 4 instances of the first spawn each spawn 3 instances of the external program, which yields a total of 12 instances, matching the number of cores on my workstation.

However, for portability, I would like to do

N_avail = <MPI.N_CORES> #on my workstation: N_avail=12
N = N_avail // MPI.COMM_WORLD.Get_size() #on my workstation: N=12//4=3

so that the number of available cores needn't be hardcoded.

Is this possible, and does it make sense?


Notes

I had hoped that not specifying maxprocs would do the job, just as mpirun without -np spawns as many instances as there are available cores. However, Spawn then defaults to maxprocs=1.

The call of the external library is blocking, which is why I don't (wouldn't) subtract the 4 instances from the first spawn from N_avail.

I can't just use multiprocessing.cpu_count(), since this would only give me the cores on the current node (in a cluster setting). I am planning to run my code on a cluster using a SLURM scheduler.

Bananach

1 Answer


There is an attribute of the world communicator that might provide the total number of processes expected: MPI_UNIVERSE_SIZE. See the MPI standard: http://mpi-forum.org/docs/mpi-3.1/mpi31-report/node253.htm#Node253

MPI provides an attribute on MPI_COMM_WORLD, MPI_UNIVERSE_SIZE, that allows the application to obtain this information in a portable manner. This attribute indicates the total number of processes that are expected. ... An application typically subtracts the size of MPI_COMM_WORLD from MPI_UNIVERSE_SIZE to find out how many processes it should spawn. ...

In mpi4py, it can be printed as:

from mpi4py import MPI

version = MPI.Get_version()
print("mpi version is", version)

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
print("size is", size)

# Get_attr returns None if the MPI implementation does not set the attribute
universe_size = comm.Get_attr(MPI.UNIVERSE_SIZE)
print("universe size is", universe_size)
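To connect this to the question, here is a minimal sketch of how the universe size could be turned into a per-instance spawn count. It follows the asker's formula (universe size divided by the size of MPI_COMM_WORLD, without subtracting the already running processes, since the call to the external program is blocking). The helper name `spawn_count` and its `fallback` parameter are hypothetical and not part of mpi4py.

```python
def spawn_count(universe_size, world_size, fallback=1):
    """Return how many children each of the world_size spawners may start.

    universe_size is expected to be the value of
    comm.Get_attr(MPI.UNIVERSE_SIZE), which may be None when the
    MPI implementation does not provide the attribute.
    """
    if universe_size is None or universe_size <= 0:
        # No usable information: fall back to a fixed count.
        return fallback
    # Integer division, because maxprocs must be an integer.
    return max(fallback, universe_size // world_size)
```

Inside an MPI program, this could be called as `spawn_count(comm.Get_attr(MPI.UNIVERSE_SIZE), comm.Get_size())` and the result passed as `maxprocs` to `Spawn`. Whether the universe size reflects the real core count depends on how the job was launched.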

Following "OpenMPI mpirun universe size", this feature can be tested by running:

mpirun -np 1 -H localhost,localhost,localhost python main.py

If your MPI version is 3 or higher, the MPI_Info object MPI_INFO_ENV could help you. It features two keys that might provide useful information:

maxprocs: Maximum number of MPI processes to start.

soft: Allowed values for number of processors.

To use it in mpi4py, you could try:

soft = MPI.INFO_ENV.get("soft")
print(soft)
maxprocs = MPI.INFO_ENV.get("maxprocs")
print(maxprocs)
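The value returned by `MPI.INFO_ENV.get` is a string, or None if the key is not set, so it needs converting before it can be used in arithmetic or passed as `maxprocs`. A minimal sketch of such a conversion follows; the helper name `parse_positive_int` is hypothetical.

```python
def parse_positive_int(value):
    """Return value as a positive int, or None if it is missing or malformed."""
    try:
        number = int(value)
    except (TypeError, ValueError):
        # None (key not set) or a non-numeric string.
        return None
    return number if number > 0 else None
```

For example, `parse_positive_int(MPI.INFO_ENV.get("maxprocs"))` would give an integer that can be divided by the communicator size, or None when the launcher provided nothing usable.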
francis
  • First approach: This is able to reproduce the number of times I typed localhost. However, I would like to start `main.py` without mpirun and then figure out how many cores are available. Even if I start `mpirun -np 4 python main.py`, i.e. I don't use the `-H` flag, this approach always returns `1` – Bananach Nov 14 '17 at 07:34
  • Second approach: Both return the argument passed to `-np`, or `1` if run without `mpirun` – Bananach Nov 14 '17 at 07:35