I have a C++ program that uses MPI and OpenMP and looks something like this:
Some_stuff_done_by_all_ranks();
Comm.Barrier(); // Wait until all ranks have finished
if (rank == 0)
{
    very_slow_function(); // Slow, but efficiently parallelizable with OpenMP
}
Comm.Barrier(); // All ranks wait until the function is done
Comm.Bcast(result_of_for_loop, 0); // Send the result of very_slow_function() from rank 0 to the other ranks
Some_more_stuff_done_by_all_ranks();
While very_slow_function() is running, all the other MPI processes are waiting for its result, so the processors assigned to them are basically idle. Is there a way for rank 0 to temporarily use the resources allocated to the other MPI processes so that this function runs faster?
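To make "efficiently parallelizable with OpenMP" concrete, here is a minimal sketch of the kind of function meant. It is a hypothetical stand-in, not the actual code: the parameter n and the loop body are assumptions chosen only to show a reduction loop that scales with the number of threads available to rank 0.

```cpp
// Hypothetical stand-in for very_slow_function(): a reduction loop whose
// iterations are independent, so OpenMP can split them across threads.
// The parameter n and the loop body are illustrative assumptions.
double very_slow_function(long n)
{
    double sum = 0.0;

    // Each thread accumulates a private partial sum; the partial sums are
    // combined when the loop ends. If the compiler is run without OpenMP
    // support, the pragma is ignored and the loop simply runs serially.
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; ++i)
    {
        sum += static_cast<double>(i);
    }

    return sum;
}
```

In the program above, this would be called only inside the if (rank == 0) block, so its speed depends entirely on how many threads rank 0 is allowed to start.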
Thanks a lot,
Joan