
I have written a script which I was running on an Ubuntu 14.04 LTS machine in Python 2.7 using mpi4py. Here is a snippet from the beginning:

from mpi4py import MPI
comm = MPI.COMM_WORLD   # communicator spanning all launched processes
rank = comm.Get_rank()  # this process's index within the communicator
size = comm.Get_size()  # total number of processes in the communicator
print comm.Get_size()   # each rank should print the same size

If I then run mpiexec -n 3 python2.7 foo.py on my old computer, I get the output:

3
3
3

I have recently started migrating my software to a new Ubuntu 14.04 LTS server. When I run the same command there, I get:

1
1
1

Clearly something is going wrong here, though I am not sure where to look, as my MPI knowledge is insufficient. I have tried to check the MPI version: running mpiexec --version on the old computer returns:

HYDRA build details:
    Version:                                 1.4.1p1
    Release Date:                            Thu Sep  1 13:53:02 CDT 2011
    CC:                              gcc
    CXX:                             c++
    F77:                             gfortran
    F90:                             f95
    Configure options:                       '--enable-shared' '--prefix=/opt/anaconda1anaconda2anaconda3' '--disable-option-checking' 'CC=gcc' 'CFLAGS= -O2' 'LDFLAGS= ' 'LIBS=-lrt -lpthread ' 'CPPFLAGS= -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpl/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpl/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/openpa/src -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/openpa/src -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/common/datatype -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/common/datatype -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/common/locks -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/common/locks -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/nemesis/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/nemesis/include -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/nemesis/utils/monitor -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/mpid/ch3/channels/nemesis/nemesis/utils/monitor -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/util/wrappers -I/home/ilan/aroot/work/mpich2-1.4.1p1/src/util/wrappers'
    Process Manager:                         pmi
    Launchers available:                     ssh rsh fork slurm ll lsf sge manual persist
    Topology libraries available:            hwloc plpa
    Resource management kernels available:   user slurm ll lsf sge pbs
    Checkpointing libraries available:
    Demux engines available:                 poll select

If I run it on the new computer, I get:

mpiexec (OpenRTE) 1.6.5

Report bugs to http://www.open-mpi.org/community/help/

Am I running different MPI implementations here that could cause the problem? How would I tell? Or is the problem on the Python end? It seems like three processes are being started, but each one believes it is the only process in its world. I realise the latter might be caused by mpi4py and mpiexec using different MPI implementations.
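If I understand the mpi4py API correctly, MPI.get_vendor() reports which implementation mpi4py was built against, so a minimal check (I have not run this on the new machine yet) would be:

from mpi4py import MPI
# mpi4py's record of the implementation it was compiled against,
# e.g. ('MPICH2', (1, 4, 1)) or ('Open MPI', (1, 6, 5))
print MPI.get_vendor()

If this disagrees with what mpiexec --version reports, that would presumably explain three independent size-1 processes.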

If I run which mpiexec on either machine, it returns:

/home/pmj27/anaconda2/bin/mpiexec

Running mpi4py.get_config() returns:

{'mpicxx': '/home/pmj27/anaconda2/bin/mpicxx', 'mpif77': '/home/pmj27/anaconda2/bin/mpif77', 'mpicc': '/home/pmj27/anaconda2/bin/mpicc', 'mpif90': '/home/pmj27/anaconda2/bin/mpif90'}
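
Another check I could think of (a sketch, assuming a Linux machine with ldd available) is listing the shared libraries the compiled mpi4py extension links against, since the libmpi entry should reveal whether it is an MPICH or Open MPI runtime:

import subprocess
import mpi4py.MPI
# path of the compiled extension module that mpi4py loads
ext_path = mpi4py.MPI.__file__
# the libmpi line in ldd's output names the linked MPI runtime
print subprocess.check_output(['ldd', ext_path])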
