
In a C program using OpenMP, I want to set a flag when any thread (I don't need to know which one) meets a condition. If the flag variable is shared by all threads, and the flag is initialized to 0 (before the multi-thread part) and any thread will set the value to 1 or to 0 (all of them always to the same value), do I need a "#pragma omp atomic" directive?

For instance, the following code snippet:

//DataStruct is a user-defined data structure
void function (DataStruct *data) {
  int i,flag=0;

  #pragma omp parallel for
  for(i=0;i<data->maxval;i++) {
    //Do stuff
    if (/*check condition*/) {
      //data->printMesage is 0 or 1, and doesn't change. It is fixed
      //before calling this function
      //data->printMesage is also an int variable
      flag=data->printMesage;
    }
  }
  //End of for loop. The code is running in
  //single thread from here
  if (flag) {
    //Print message
  }
}

Is it necessary to add the "#pragma omp atomic" directive before "flag=data->printMesage;"?

AwkMan

2 Answers


Even if the stored value is less than word size, you need to avoid the race condition of two threads reading and writing the same memory location. You will need a #pragma omp atomic write and #pragma omp atomic read pair to avoid the race condition. Because you cannot protect if(flag) {...} with the atomic construct, you will have to introduce a temp variable to read the flag into:

#pragma omp atomic read
tmp = flag;
if (tmp) { ... }

In addition, you might need to make the memory view of the threads consistent by either using the flush construct or by adding the seq_cst (sequential memory consistency) or a pair of acquire and release clauses to the atomic construct.
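For the situation in the question, where the flag is only written inside the loop and read after it, a minimal sketch of the atomic write could look like the following. The function name `any_negative_flag`, its parameters, and the "negative value" test are hypothetical stand-ins for the question's `function`, `data->printMesage`, and check condition.

```c
/* Hypothetical helper (not from the question): returns print_message if any
 * element of values is negative, otherwise 0. */
int any_negative_flag(const int *values, int n, int print_message)
{
    int flag = 0;   /* shared by all threads in the parallel region */
    int i;          /* the loop variable of a parallel for is private */

    #pragma omp parallel for shared(flag)
    for (i = 0; i < n; i++) {
        if (values[i] < 0) {   /* stand-in for the real condition */
            /* Several threads may store to flag at the same time;
             * atomic write makes each store race-free. */
            #pragma omp atomic write
            flag = print_message;
        }
    }

    /* The parallel region ends with an implicit barrier and flush, so a
     * plain read is sufficient in the sequential part. */
    return flag;
}
```

Compiled without OpenMP support, the pragmas are ignored and the function still returns the same result serially.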

Michael Klemm
  • The if(flag) at the end is outside the for loop, it is executed in single thread mode, therefore there is no race condition at read time. I'll edit the question to make it clearer – AwkMan Apr 27 '19 at 15:11
  • Ah, right. Thanks for editing. I did not see this. There's an implicit flush at the end of the parallel region, so the master thread will see the updated flag. The `atomic` construct will still be needed if multiple threads will write to `flag`. – Michael Klemm Apr 27 '19 at 15:37

Given that you only need the shared result after the parallel region, you can use a reduction instead of the atomic.

#pragma omp parallel for reduction(max:flag)
for(i=0; i<data->maxval; i++) {

Both solutions are perfectly fine though. The reduction only has a performance benefit if the flag is potentially set very often.
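As a complete sketch of the reduction variant, assuming the stored value is always 0 or 1 as stated in the question: the function name `any_negative_reduced`, its parameters, and the "negative value" test are hypothetical placeholders for the question's code.

```c
/* Hypothetical helper (not from the question): returns print_message if any
 * element of values is negative, otherwise 0.
 * Assumes print_message is 0 or 1, so max() combines the copies correctly. */
int any_negative_reduced(const int *values, int n, int print_message)
{
    int flag = 0;
    int i;

    /* Each thread gets a private copy of flag; at the end of the loop the
     * copies are combined with the original value using max. */
    #pragma omp parallel for reduction(max:flag)
    for (i = 0; i < n; i++) {
        if (values[i] < 0) {        /* stand-in for the real condition */
            flag = print_message;   /* private copy, no synchronization needed */
        }
    }

    /* flag now holds the combined result for the sequential part. */
    return flag;
}
```

The `max` reduction operator for C requires OpenMP 3.1 or later. Because each thread writes only its own copy, `flag` must not be read inside the loop with the expectation of seeing other threads' updates.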

Zulan
  • The flag is not triggered very often, and in many cases only one iteration will set it, but it is still an interesting answer. In this particular case, if only one of the threads sets the flag, the other threads will not be stalled while it is being set, as they will not enter that part of the code, right? – AwkMan Apr 27 '19 at 17:38
  • With the reduction, the other threads will not be able to check the global flag, as each thread will have a private copy of `flag` and thus will not see updates by other threads. The `reduction` will create a global view of `flag` for the sequential region, but not for the other threads. – Michael Klemm Apr 28 '19 at 06:46
  • @AwkMan I'm assuming that you **do not read `flag`** within the parallel region. And yes, with the `reduction` all threads are completely independent. The same is true for `atomic` though, except for microarchitectural details. – Zulan Apr 28 '19 at 12:58