I have written a script which I was running on an Ubuntu 14.04 LTS machine with Python 2.7 and mpi4py. Here is a snippet from the beginning:
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
print comm.Get_size()
On my old computer, if I then run mpiexec -n 3 python2.7 foo.py
I get the answer:
3
3
3
I have recently started migrating my software to a new Ubuntu 14.04 LTS server. When I run the same command there, I get the answer:
1
1
1
Clearly something is going wrong here, though I am not sure where to look, as my MPI knowledge is limited. I have tried to check the MPI version: running mpiexec --version
on the old computer returns:
HYDRA build details:
Version: 1.4.1p1
Release Date: Thu Sep 1 13:53:02 CDT 2011
CC: gcc
CXX: c++
F77: gfortran
F90: f95
Configure options: '--enable-shared' '--prefix=/opt/anaconda1anaconda2anaconda3' '--disable-option-checking' 'CC=gcc' 'CFLAGS= -O2' 'LDFLAGS= ' 'LIBS=-lrt -lpthread ' 'CPPFLAGS= -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpl/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpl/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/openpa/src -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/openpa/src -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/common/datatype -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/common/datatype -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/common/locks -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/common/locks -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/nemesis/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/nemesis/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/nemesis/utils/monitor -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/nemesis/utils/monitor -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/util/wrappers -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/util/wrappers'
Process Manager: pmi
Launchers available: ssh rsh fork slurm ll lsf sge manual persist
Topology libraries available: hwloc plpa
Resource management kernels available: user slurm ll lsf sge pbs
Checkpointing libraries available:
Demux engines available: poll select
If I run it on the new computer I get the answer:
mpiexec (OpenRTE) 1.6.5
Report bugs to http://www.open-mpi.org/community/help/
Am I running different MPI implementations here that could cause the problem? How would I tell? Or is the problem on the Python end? It seems like three processes are being started, but each one believes it is the only one (a COMM_WORLD of size 1), which I realise might happen if mpi4py and mpiexec use different MPI implementations.
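For reference, the two --version banners shown above already identify the implementations: MPICH's process manager prints "HYDRA build details", while Open MPI's prints "(OpenRTE)". A small sketch of that heuristic (the helper name identify_mpiexec is my own, not part of any library):

```python
def identify_mpiexec(banner):
    """Classify an `mpiexec --version` banner by MPI implementation.

    Heuristic sketch based on the two banners quoted in this question:
    MPICH's Hydra launcher announces itself as "HYDRA build details",
    and Open MPI's launcher as "(OpenRTE)".
    """
    if "HYDRA" in banner:
        return "MPICH"
    if "OpenRTE" in banner or "Open MPI" in banner:
        return "Open MPI"
    return "unknown"

# The banners from the old and new machines, respectively:
print(identify_mpiexec("HYDRA build details:"))   # MPICH
print(identify_mpiexec("mpiexec (OpenRTE) 1.6.5"))  # Open MPI
```

So, going by the banners alone, the old machine launches with MPICH (1.4.1p1) and the new one with Open MPI 1.6.5.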
If I run which mpiexec
on either machine it returns:
/home/pmj27/anaconda2/bin/mpiexec
Running mpi4py.get_config()
returns:
{'mpicxx': '/home/pmj27/anaconda2/bin/mpicxx', 'mpif77': '/home/pmj27/anaconda2/bin/mpif77', 'mpicc': '/home/pmj27/anaconda2/bin/mpicc', 'mpif90': '/home/pmj27/anaconda2/bin/mpif90'}
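One coarse consistency check I have considered (the helper same_install is my own sketch, not a library function) is whether the mpiexec on PATH and the mpicc recorded by mpi4py.get_config() live in the same installation directory:

```python
# Hedged sketch: a size-1 COMM_WORLD under `mpiexec -n 3` often means
# mpi4py was built against a different MPI library than the mpiexec
# that launched it. Binaries from one MPI install normally share a
# bin/ directory, so differing directories are a strong hint of a
# mismatch. Matching directories do not *prove* a match, since a
# stray mpiexec earlier on PATH can still win at launch time.
import os

def same_install(mpiexec_path, mpicc_path):
    """Return True if both binaries sit in the same directory."""
    return os.path.dirname(mpiexec_path) == os.path.dirname(mpicc_path)

# Paths from this question; in practice take mpicc from
# mpi4py.get_config() and mpiexec from `which mpiexec`.
print(same_install('/home/pmj27/anaconda2/bin/mpiexec',
                   '/home/pmj27/anaconda2/bin/mpicc'))   # True
```

Since both machines report the same anaconda2 paths here, this check alone does not settle it; I understand mpi4py also exposes MPI.get_vendor(), which reports the name and version of the MPI library it was actually linked against, and comparing that to the mpiexec --version banner would be more direct.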