
I am trying to understand why the barrier is required to remove the race condition.

#include<omp.h>
#include<stdio.h>

int main()
{
    int sum = 0;
    #pragma omp parallel num_threads(4)
    for(int i = 0;i < 10;i++)
    {
        #pragma omp parallel for num_threads(4)
        for(int j = 0;j < 10;j++)
        {
            #pragma omp critical
            sum += 1;
        }

        // Uncommenting this barrier removes the race condition. Right now the result is non-deterministic.
        // #pragma omp barrier
        #pragma omp single
        sum += 1;
    }
    printf("%d", sum);
}
Niteya Shah
  • Note that there is no need to use a `critical` section or a `barrier` if you use reduction(s). It can be used in nested parallelism as well, just add `reduction(+:sum)` clause to your parallel constructs. – Laci Oct 18 '22 at 17:45
  • This is more of a theoretical doubt than an implementation problem. – Niteya Shah Oct 18 '22 at 17:46
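The reduction approach mentioned in the comment above could look roughly like the sketch below. The function name `sum_with_reduction` and the `team_size` out-parameter are hypothetical, added only so the result can be checked against the number of threads the runtime actually grants. With `reduction(+:sum)` every thread updates its own private copy of `sum`, so neither `critical` nor an explicit `barrier` is needed.

```c
/* Hypothetical helper (not from the question): same loop structure,
   but using reductions instead of critical sections and a barrier.
   Stores the outer team size in *team_size and returns the final sum. */
int sum_with_reduction(int *team_size)
{
    int sum = 0;
    int team = 0;

    /* Each outer thread gets private copies of sum and team,
       which are added together at the end of the region. */
    #pragma omp parallel num_threads(4) reduction(+:sum) reduction(+:team)
    {
        team += 1;  /* one count per outer thread */

        for (int i = 0; i < 10; i++)
        {
            /* The inner reduction folds the nested team's partial sums
               back into this outer thread's private copy of sum. */
            #pragma omp parallel for num_threads(4) reduction(+:sum)
            for (int j = 0; j < 10; j++)
            {
                sum += 1;
            }

            /* Only one outer thread runs this, on its own private copy. */
            #pragma omp single
            sum += 1;
        }
    }

    *team_size = team;
    return sum;
}
```

If the runtime grants all four outer threads, the returned value should be 4 * 100 + 10; in general it is `team_size * 100 + 10`.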

1 Answer


Without `#pragma omp barrier`, the thread that executes the `single` region can update `sum` while threads in other nested teams are still updating it inside the `critical` section. The increment in the `single` region is not protected by `critical`, so the two updates can race, and the result is undefined.

The barrier forces all threads to finish the inner `for` loop before any of them reaches the `single` region, so one thread can then perform the final increment without risk of a race condition.
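As a minimal sketch of the above, here is the question's loop with the barrier enabled. The wrapper function `sum_with_barrier`, the `team` counter, and the `team_size` out-parameter are hypothetical additions, used only so the expected total can be computed even if the runtime grants fewer than four outer threads.

```c
/* Hypothetical helper (not from the question): the question's loop with
   the barrier enabled. Stores the outer team size in *team_size and
   returns the final sum. */
int sum_with_barrier(int *team_size)
{
    int sum = 0;
    int team = 0;

    #pragma omp parallel num_threads(4)
    {
        /* Count the outer threads that actually take part. */
        #pragma omp atomic
        team += 1;

        for (int i = 0; i < 10; i++)
        {
            #pragma omp parallel for num_threads(4)
            for (int j = 0; j < 10; j++)
            {
                #pragma omp critical
                sum += 1;
            }

            /* Every outer thread must finish its inner loop before
               any thread can enter the single region below. */
            #pragma omp barrier

            /* Unprotected increment, now safe: no thread is still
               inside the critical section. The implicit barrier at the
               end of single holds the others back until it is done. */
            #pragma omp single
            sum += 1;
        }
    }

    *team_size = team;
    return sum;
}
```

Each outer thread contributes 10 * 10 increments and each of the 10 outer iterations adds one more through `single`, so the result should be `team_size * 100 + 10`, which is 410 when all four outer threads are granted.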

Sven Nilsson
  • So the `critical` is not global, but only for the nested team of threads? – Victor Eijkhout Oct 18 '22 at 16:16
  • I don't think `critical` is local: if I remove both the `single` and the `sum` increment following it, the result is always 400, so `critical` should be global. – Niteya Shah Oct 18 '22 at 16:37
  • @VictorEijkhout `critical` is global, but it will not prevent another part of the program from changing the variable. The binding thread set for a critical region is all threads in the contention group. A *contention group* is an initial thread and its descendent threads. – Laci Oct 18 '22 at 16:47
  • @Laci I am not sure if I understand what you said perfectly. https://stackoverflow.com/questions/20441123/the-behavior-of-omp-critical-with-nested-level-of-parallelism#:~:text=Critical%20regions%20in%20OpenMP%20have,they%20occur%20in%20the%20code. says that it is true for all threads, not just descendant threads. – Niteya Shah Oct 18 '22 at 16:53
  • I quoted from the OpenMP specification, but it means the same (all threads in a program) – Laci Oct 18 '22 at 17:01
  • "critical" is indeed global, but since the sum increment below the barrier lacks a "critical" section we have a race condition anyway – Sven Nilsson Oct 18 '22 at 17:20
  • @SvenNilsson No, there is no race condition here, because of the `single` construct -- only one thread executes it (note also that there is an explicit barrier before and an implicit one after it). – Laci Oct 18 '22 at 17:26
  • The specification seems to indicate that the thread elected to execute the `#pragma omp single` region can do so before the other threads have started waiting at the implicit barrier. The explicit barrier is commented out. – Sven Nilsson Oct 19 '22 at 15:59