I run the script below using mpirun -n 40 python script.py. The aim is to parallelize the function func. The pool of 40 "workers" is split into 5 blocks of 8 "workers" (each block with its own color, of course).
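To illustrate the intended grouping (just a toy sketch, not part of the actual script further down), the mapping from world rank to (color, key) that I have in mind is:

block_size = 8
# Toy sketch of the intended mapping: 40 world ranks -> 5 blocks of 8
for world_rank in range(40):
    color = world_rank // block_size   # block index, 0..4
    key = world_rank % block_size      # rank within the block, 0..7
    print(world_rank, color, key)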
I generate arguments (gen_args) for each block and flatten them into a 1D numpy array. Then I use Scatterv to scatter this flattened array to the "workers" in one block. The scattered values are received into the variable recv_args.
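Similarly, this small sketch (with a stand-in array instead of the real gen_args output, and no MPI involved) shows the 3-element chunk I expect each block rank to end up with, given the counts and displacements I build:

import numpy as np

block_size = 8
flat = np.arange(block_size * 3, dtype=float)    # stand-in for gen_args(...).flatten()
counts = np.full(block_size, 3)                  # 3 values per rank
displacements = np.arange(block_size) * 3        # rank i's chunk starts at element 3*i
# Expected per-rank chunk after Scatterv, computed locally for illustration
for rank in range(block_size):
    start = displacements[rank]
    print(rank, flat[start:start + counts[rank]])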
Everything works well (based on the print output I've seen), except for the call to block_comm.Scatterv([send_data, counts, displacement, MPI.DOUBLE], recv_args, root=0).
Somehow it sets all the elements of recv_args to zero in every rank (keep in mind that they are initialized to ones in all ranks).
What am I missing here?
Here is the MCVE:
import numpy as np
from mpi4py import MPI

def func(arg1, arg2, arg3):
    return arg1 + arg2 + arg3

def gen_args(const, n_iter):  # More complicated in reality, of course
    return const * np.arange(n_iter * 3).reshape((n_iter, 3))

if __name__ == '__main__':
    world_comm = MPI.COMM_WORLD
    world_size = world_comm.Get_size()  # normally 40
    world_rank = world_comm.Get_rank()

    block_size = 8
    blocks = int(world_size / block_size)
    color = int(world_rank / block_size)
    key = int(world_rank % block_size)
    block_comm = world_comm.Split(color, key)
    # Effectively world_comm (size=40) is now split into 5 blocks of size=8
    block_rank = block_comm.Get_rank()

    print("WORLD RANK/SIZE: {}/{} \t BLOCK RANK/SIZE: {}/{}".format(world_rank, world_size, block_rank, block_size))

    recv_args = np.ones(3)
    counts = tuple(np.ones(block_size) * 3)
    displacement = tuple(np.arange(block_size) * 3)

    if block_rank == 0:
        send_data = gen_args(color, block_size).flatten()
        print(send_data)
    else:
        send_data = None

    block_comm.Scatterv([send_data, counts, displacement, MPI.DOUBLE], recv_args, root=0)
    print(block_rank, recv_args)