
I am trying to scatter an array of shape (3,512,512,48,2), with double-precision dtype np.float64, between 3 processes using Scatter():

# mpirun -np 3 python3 prog.py
import numpy as np
from mpi4py import MPI

if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    nproc = comm.Get_size()
    rank = comm.Get_rank()

    a = None
    a_split = np.empty([512, 512, 48, 2], dtype=np.float64)

    if rank == 0:
        a = np.zeros([3, 512, 512, 48, 2], dtype=np.float64)
        print(a.shape)

    comm.Barrier()

    print('Scattering')

    comm.Scatter([a, MPI.DOUBLE], a_split, root=0)

However, the program deadlocks. From what I have found here

mpi4py scatter and gather with large numpy arrays

and here

Along what axis does mpi4py Scatterv function split a numpy array?

for big arrays I must use the Scatterv() function. So, here is another version of the code using this function:

# mpirun -np 3 python3 prog.py
import numpy as np
from mpi4py import MPI

if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    nproc = comm.Get_size()
    rank = comm.Get_rank()

    a = None
    a_split = np.empty([512, 512, 48, 2], dtype=np.float64)

    # number of elements sent to each rank
    size = 512 * 512 * 48 * 2

    if rank == 0:
        a = np.zeros([3, 512, 512, 48, 2], dtype=np.float64)
        print(a.shape)

    comm.Barrier()

    print('Scattering')

    # counts (size, size, size) and displacements (0, size, 2*size), in elements of MPI.DOUBLE
    comm.Scatterv([a, (size, size, size), (0, size, 2 * size), MPI.DOUBLE], a_split, root=0)

This, however, also leads to a deadlock. I have also tried sending the arrays with point-to-point communication using Send() and Recv(), but this doesn't help either. It appears that the deadlock depends only on the array size: for example, if I change the size of the arrays from [512,512,48,2] to [512,10,48,2], the code works.
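
Roughly, the point-to-point attempt looked like the following (a simplified sketch, not the exact code I ran):

# mpirun -np 3 python3 prog.py
# Sketch of the point-to-point attempt: rank 0 keeps the first block and
# sends the remaining blocks with Send(); the other ranks receive with Recv().
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
nproc = comm.Get_size()
rank = comm.Get_rank()

a_split = np.empty([512, 512, 48, 2], dtype=np.float64)

if rank == 0:
    a = np.zeros([3, 512, 512, 48, 2], dtype=np.float64)
    a_split[:] = a[0]                    # rank 0 keeps the first block
    for dest in range(1, nproc):
        comm.Send([a[dest], MPI.DOUBLE], dest=dest, tag=0)
else:
    comm.Recv([a_split, MPI.DOUBLE], source=0, tag=0)

print(rank, 'received block of shape', a_split.shape)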

Can anyone please suggest what I can do in this situation?

alcauchy

2 Answers


One issue is that you mix np.float and MPI.DOUBLE. A working script could be:

# mpirun -np 3 python3 prog.py
import numpy as np
from mpi4py import MPI
comm = MPI.COMM_WORLD
nproc = comm.Get_size()
rank = comm.Get_rank()  
a = None

a_split = np.empty([512,512,48,2],dtype = np.float)
a_split[:,:,:,:] = -666

if rank==0:
    a = np.zeros([3,512,512,48,2],dtype = np.float)
    print(a.shape)

print('Scattering')
comm.Scatter(a, a_split, root = 0)

print(a_split[1,1,1,1], a_split[-1,-1,-1,-1])

I've added the last print line to show that -np 4 will work but will not entirely fill a_split, and that -np 2 fails with a truncation error. My guess is that -np 3 was intended.

If your usage of np.float and MPI.DOUBLE was on purpose, please mention it in your question and add the -np you're using to launch the program.
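
If the number of processes needs to vary, the counts and displacements for Scatterv can also be computed from the communicator size rather than hard-coded; here is a minimal sketch of that idea (splitting along the leading axis, with illustrative variable names):

# mpirun -np <nproc> python3 prog.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
nproc = comm.Get_size()
rank = comm.Get_rank()

block_shape = (512, 512, 48, 2)           # shape of one rank's block
block_size = int(np.prod(block_shape))    # elements per rank

a = None
if rank == 0:
    a = np.zeros((nproc,) + block_shape, dtype=np.float64)

counts = [block_size] * nproc                     # elements sent to each rank
displs = [i * block_size for i in range(nproc)]   # offsets into the flat send buffer

a_split = np.empty(block_shape, dtype=np.float64)
comm.Scatterv([a, counts, displs, MPI.DOUBLE], a_split, root=0)
print(rank, a_split.shape)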

[Edit] Here's also a C++ version of your script, so you can see if it is also deadlocking:

// mpic++ scat.cxx && mpirun -np <asmuchasyouwant> ./a.out

#include <iostream>
#include <vector>
#include <mpi.h>

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);

  unsigned sz = 1*512*512*48*2;
  int rank, nbproc;
  std::vector<double> a;
  std::vector<double> a_split(sz);

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &nbproc);

  if (rank == 0) {
    a.resize(nbproc * sz);
    std::fill(a.begin(), a.end(), 2.71);
  }
  else {
    std::fill(a_split.begin(), a_split.end(), -666.666);
  }  

  MPI_Scatter(a.data(), sz, MPI_DOUBLE,
              a_split.data(), sz, MPI_DOUBLE,
              0,
              MPI_COMM_WORLD
              );


  std::cout << rank << " done " << a_split[sz-1] << std::endl;

  MPI_Finalize();
}
Demi-Lune
  • Thanks for the reply! The mixing of `float` and `double` was not on purpose, so I edited the question to have the correct code. Unfortunately, the code you provided is also stuck at "Scattering" for some reason, and again with the arrays of smaller size the code works correctly. – alcauchy Nov 25 '19 at 12:25
  • My `mpi4py.__version__` is `2.0.0`, and no deadlock. Are you using another version? – Demi-Lune Nov 25 '19 at 14:47
  • Also, do you have the same deadlock outside python, with a c++ scatter? – Demi-Lune Nov 25 '19 at 14:48
  • I have tried the C++ code, and it works fine, without any deadlocks. My version of mpi4py is 3.0.2, so I think I will try to install 2.0.0 to check if it works. – alcauchy Nov 26 '19 at 12:19
  • I'd also check which libmpi.so is found by your python installation (are you under Linux or windows btw?) – Demi-Lune Nov 26 '19 at 13:14
  • I am using Linux. Could you please explain how I can check which version of libmpi Python uses? A Google search didn't give me an answer to this. – alcauchy Nov 28 '19 at 13:44
  • You could `ldd /usr/lib/python3/dist-packages/mpi4py/MPI.cpython-35m-x86_64-linux-gnu.so` (adapt the .so name, it may be located elsewhere on your computer). But that is not 100% sure (e.g. anaconda could mix up the path, or you may have an installation in your /home, a virtualenv...). The failproof way is to (1) start python3 ; import mpi4py as MPI ; open another console ; `ps ux |grep python3` to get the process id ; `grep libmpi /proc/<pid>/numa_maps` – Demi-Lune Nov 28 '19 at 15:51
  • Actually, if you have psutil installed, you can also: `import psutil ; p = psutil.Process() ; print(p.memory_maps())` to display this numa_maps info. – Demi-Lune Nov 29 '19 at 10:39
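
Putting the psutil suggestion from the last comment into a short standalone script (a sketch, assuming psutil is installed and you are on Linux):

# Print the libmpi shared objects mapped into the current Python process.
import psutil
from mpi4py import MPI  # importing MPI loads libmpi into this process

proc = psutil.Process()  # the current Python process
for m in proc.memory_maps():
    if 'libmpi' in m.path:
        print(m.path)    # shows which libmpi mpi4py is actually using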

So, in the end, the solution was quite simple: I usually don't turn off my PC, and it seems that's why it started to deadlock after a lot of computation. A simple reboot solved the problem.

alcauchy