If I have the following case:
bool cond_var;
#pragma omp parallel shared(cond_var)
{
    bool some_private_var;
    // ...
    do {
        #pragma omp single
        {
            cond_var = true;
        }
        // do something, calculate some_private_var;
        // ...
        #pragma omp atomic update
        cond_var &= some_private_var;
        // Syncing step
        // (???)
    } while(cond_var);
    // ... (other parallel stuff)
}
I want my do-while loop to have the same number of iterations for all my threads, but when I tried #pragma omp barrier as the syncing step (just before the end of the loop), I ended up with a deadlock. Printing the value of cond_var showed that some threads saw it as true while others saw it as false, so the loop finished for some threads, leaving the others deadlocked on the barrier. I then tried various combinations and orderings of barrier and flush, with no luck (with some combinations, the deadlock was merely postponed).
How do I properly combine and sync the loop condition among the threads so that all the loops have the same number of iterations?
UPDATE
I have also tried loading the value of cond_var into another private variable with #pragma omp atomic read, and testing that private variable as the loop condition. It also didn't work. Apparently, atomic read guarantees that I get a consistent value (either old or new), but doesn't guarantee that it is the latest.
UPDATE 2
Based on Jonathan Dursi's code, this is an MCVE that looks more like what I am trying to do:
#include <omp.h>
#include <cstdio>
#include <random>
#include <chrono>
#include <thread>
int main() {
    bool cond_var;
    const int nthreads = omp_get_max_threads();
    #pragma omp parallel default(none) shared(cond_var)
    {
        bool some_private_var;
        std::random_device rd;
        std::mt19937 rng(rd());
        unsigned iter_count = 0;
        /* chance of having to end: 1 in 6**nthreads; all threads must choose 0 */
        std::uniform_int_distribution<int> dice(0,5);
        const int tid = omp_get_thread_num();
        printf("Thread %d started.\n", tid);
        do {
            ++iter_count;
            #pragma omp single
            {
                // cond_var must be reset to 'true' because it is the
                // neutral element of &
                // For the loop to end, all threads must choose the
                // same random value 0
                cond_var = true;
            }
            some_private_var = (dice(rng) == 0);
            // If all threads choose 0, cond_var will remain 'true', ending the loop
            #pragma omp atomic update
            cond_var &= some_private_var;
            #pragma omp barrier
        } while(!cond_var);
        printf("Thread %d finished with %u iterations.\n", tid, iter_count);
    }
    return 0;
}
Running with 8 threads on a machine with enough logical cores to run all of them simultaneously, most runs deadlock in the first iteration, although there was one run that finished correctly on the second iteration (which is not consistent with the 1 in 1679616 (6**8) chance of all threads choosing 0).
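For comparison, here is a variant I would expect to behave better. It is only a sketch, and it rests on an assumption I have not confirmed: that the mismatch happens because a fast thread loops around and resets cond_var inside the single block before slower threads have tested it. Each thread copies the flag into a private variable between two barriers and tests only that copy. The function name run_synced_loop, its parameters, the fixed seeds, and the use of int instead of bool for the shared flag are all made up for this illustration.

```cpp
#include <cassert>
#include <random>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: runs the loop and returns one iteration count per thread.
std::vector<unsigned> run_synced_loop(int requested_threads, int sides) {
    assert(requested_threads > 0 && sides > 0);
    std::vector<unsigned> counts(requested_threads, 0u);
    int team_size = 1;   // actual team size, may be smaller than requested
    int cond_var = 1;    // shared flag (int here; assumption, the original uses bool)
    #pragma omp parallel num_threads(requested_threads) shared(cond_var, counts, team_size)
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
        #pragma omp single
        team_size = omp_get_num_threads();
#endif
        std::mt19937 rng(12345u + static_cast<unsigned>(tid)); // fixed seeds for repeatability
        std::uniform_int_distribution<int> dice(0, sides - 1);
        unsigned iter_count = 0;
        bool keep_going = true;  // private copy of the loop condition
        do {
            ++iter_count;
            #pragma omp single
            cond_var = 1;        // reset; single ends with an implicit barrier
            const int some_private_var = (dice(rng) == 0) ? 1 : 0;
            #pragma omp atomic update
            cond_var &= some_private_var;
            #pragma omp barrier  // every thread has contributed its update
            keep_going = (cond_var == 0);
            #pragma omp barrier  // every thread has read before the next reset
        } while (keep_going);
        counts[tid] = iter_count;
    }
    counts.resize(team_size);
    return counts;
}
```

Calling run_synced_loop(4, 2), for example, should return the same iteration count in every entry if the assumption above holds. The second barrier costs an extra synchronization per iteration, which is the price of keeping the reset from overtaking the test.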