
I have a C++ program that uses MPI and OpenMP and is structured roughly like this:

Some_stuff_done_by_all_ranks();
Comm.Barrier(); //wait until stuff is done by all ranks
if (rank==0)
 {
  very_slow_function(); //but efficiently parallelizable with openMP
 }
Comm.Barrier(); // all ranks wait until the function is done
Comm.Bcast(result_of_very_slow_function, 0); // bring the result of very_slow_function() from rank 0 to the other ranks
Some_more_stuff_done_by_all_ranks();

While very_slow_function() is running, all the other MPI processes are just waiting for its result, so the cores assigned to them are effectively idle. I was wondering if there is a way for rank 0 to temporarily use the resources allocated to the rest of the MPI processes so that this function runs faster.
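
For concreteness, here is a minimal compilable sketch of the pattern above, written with the C MPI API plus OpenMP (the body of very_slow_function(), the result size N, and the MPI_DOUBLE payload are placeholders, not my real code):

#include <mpi.h>
#include <omp.h>
#include <vector>

static const int N = 1000000; // placeholder result size

// Placeholder for the real work: parallelizable with OpenMP,
// but only the cores of the rank that calls it are used.
void very_slow_function(std::vector<double> &result)
{
  #pragma omp parallel for
  for (int i = 0; i < N; ++i)
    result[i] = static_cast<double>(i) * i;
}

int main(int argc, char **argv)
{
  int provided, rank;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  // Some_stuff_done_by_all_ranks();
  MPI_Barrier(MPI_COMM_WORLD);      // wait until stuff is done by all ranks

  std::vector<double> result(N);
  if (rank == 0)
    very_slow_function(result);     // every other rank just waits here

  // Receiving ranks cannot finish the Bcast before rank 0 provides the data,
  // so the second barrier is not strictly needed.
  MPI_Bcast(result.data(), N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

  // Some_more_stuff_done_by_all_ranks();
  MPI_Finalize();
  return 0;
}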

Thanks a lot,

Joan

user3091644
  • Usually, the MPI wait is spinning, so the CPU usage is still 100%. – Matthieu Brucher Nov 06 '18 at 15:32
  • Are you oversubscribing your cores with threads? Check out these two questions: https://stackoverflow.com/questions/37078753/prevent-mpi-from-busy-looping https://stackoverflow.com/questions/37915579/efficiently-gathering-scattering-tasks – Zulan Nov 06 '18 at 15:58
  • Hi all, Matthieu) that's a good point, I didn't think about it. Do you know if there is a way to actually free these resources and send a signal from rank 0 for the other ranks to grab them again once the work is done? Zulan) No I am not oversubscribing. Each MPI rank has a certain number of cores, but every once in a while most of them are just waiting, so I was wondering if those resources can be temporarily transferred to rank 0 – user3091644 Nov 06 '18 at 18:22
  • 1
    There is nothing such as signals in the MPI standard. Are you suggesting `very_slow_function()` is parallelized with OpenMP but not MPI ? An option is to use an intra-node communicator, shared memory and one sided operations so all the cores can be used. Of course, you always have the option to parallelize this function with MPI. – Gilles Gouaillardet Nov 06 '18 at 23:58
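
The intra-node approach from the last comment could look roughly like the sketch below (an illustration only, not from the original post): the ranks that share memory with rank 0 each write their slice of the result directly into an MPI-3 shared-memory window, so all node-local cores contribute instead of only rank 0's OpenMP threads. The loop body and buffer size N are placeholders.

#include <mpi.h>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);

  // Communicator of the ranks that can share memory with this rank (same node).
  MPI_Comm node_comm;
  MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                      MPI_INFO_NULL, &node_comm);
  int node_rank, node_size;
  MPI_Comm_rank(node_comm, &node_rank);
  MPI_Comm_size(node_comm, &node_size);

  // The result buffer lives in a shared-memory window owned by node rank 0.
  const MPI_Aint N = 1000000;                 // placeholder size
  double *result;
  MPI_Win win;
  MPI_Aint local_bytes = (node_rank == 0) ? N * (MPI_Aint)sizeof(double) : 0;
  MPI_Win_allocate_shared(local_bytes, sizeof(double), MPI_INFO_NULL,
                          node_comm, &result, &win);
  if (node_rank != 0) {
    // Get a pointer to node rank 0's segment so every rank writes into the same buffer.
    MPI_Aint size; int disp_unit;
    MPI_Win_shared_query(win, 0, &size, &disp_unit, &result);
  }

  // Each node-local rank computes its own slice of the "very slow" loop.
  MPI_Win_lock_all(MPI_MODE_NOCHECK, win);
  MPI_Aint chunk = (N + node_size - 1) / node_size;
  MPI_Aint begin = node_rank * chunk;
  MPI_Aint end   = (begin + chunk < N) ? begin + chunk : N;
  for (MPI_Aint i = begin; i < end; ++i)
    result[i] = (double)i * i;                // placeholder work
  MPI_Win_sync(win);                          // make local stores visible
  MPI_Barrier(node_comm);                     // wait until every node-local rank is done
  MPI_Win_sync(win);
  MPI_Win_unlock_all(win);

  // World rank 0 now holds the complete result in `result` and can
  // MPI_Bcast it to the other nodes exactly as before.

  MPI_Win_free(&win);
  MPI_Comm_free(&node_comm);
  MPI_Finalize();
  return 0;
}

Note that this only recruits the cores on rank 0's own node; ranks on other nodes would still be idle unless the function is also decomposed with MPI, as the comment points out.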

0 Answers