As I understand it, once the program has finished executing a parallel section, all threads except the master are destroyed. When the program enters another parallel section, the threads are spawned again. This repeated creation and destruction adds overhead that may significantly limit the efficiency of multiprocessing.
Consider a nested loop in which only the inner loop is parallelized:
for (int i = 0; i < outer_loops; i++) {
    // some code which does not affect the execution time noticeably
    #pragma omp parallel for
    for (int j = 0; j < inner_loops; j++) {
        // some code to run through with multiple threads
    }
}
Is there a way to avoid respawning the worker threads each time the program enters the inner loop? For example, once the inner loop is done, the master thread would take care of the outer loop while the worker threads, instead of being destroyed and spawned again, simply wait for the next run of the inner loop.
In my case, I cannot include the outer loop in the parallel section because it must run sequentially. I'm trying to minimize the overhead caused by thread spawning.
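To make the desired behavior concrete, here is a minimal sketch of what I have in mind, not a confirmed solution. It assumes the sequential part of each outer iteration can be confined to a single block. The parallel region is opened once, every thread steps through the outer loop in lockstep, `#pragma omp single` lets one thread run the sequential code, and `#pragma omp for` shares the inner loop among the threads that already exist. The names `OUTER_LOOPS`, `INNER_LOOPS`, `data`, `seq_steps`, and `run_persistent_team` are placeholders made up for this example.

```c
#include <stdio.h>

#define OUTER_LOOPS 4     /* hypothetical sizes, for illustration only */
#define INNER_LOOPS 1000

/* Runs all outer iterations inside ONE parallel region and returns a
   checksum of the results. */
long run_persistent_team(void)
{
    long data[INNER_LOOPS] = {0};  /* shared: declared outside the region */
    long seq_steps = 0;            /* shared: touched only inside "single" */
    long checksum = 0;

    #pragma omp parallel
    {
        /* Each thread has its own private i; the implicit barriers below
           keep the outer iterations in order across the whole team. */
        for (int i = 0; i < OUTER_LOOPS; i++) {
            /* Sequential part: executed by exactly one thread.
               The others wait at the implicit barrier that ends "single". */
            #pragma omp single
            {
                seq_steps++;
            }

            /* Inner loop: split among the existing threads.
               There is an implicit barrier at the end of the loop. */
            #pragma omp for
            for (int j = 0; j < INNER_LOOPS; j++) {
                data[j] += i + j;
            }
        }
    }

    /* Back in sequential code: summarize the results. */
    for (int j = 0; j < INNER_LOOPS; j++) {
        checksum += data[j];
    }
    printf("sequential steps: %ld, checksum: %ld\n", seq_steps, checksum);
    return checksum;
}
```

With GCC or Clang this would be compiled with `-fopenmp`. Without that flag the pragmas are ignored and the code runs serially with the same result. Whether this actually lowers the overhead depends on the OpenMP runtime, so it would need to be measured.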