I have recently started learning OpenMP programming, but I am stuck on a piece of code that parallelizes the calculation of pi. I'm unable to understand what this line does in the program, or the meaning of the comment that follows it.

  if (id == 0) nthreads = nthrds; //Only one thread should copy the number of threads to the global value to make sure multiple threads writing to the same address don’t conflict.

The entire code is:

#include<omp.h>
#include<stdio.h>
#define NUM_THREADS 2

static long num_steps = 100000;
double step;

int main ()
{
   int i, nthreads;
   double pi, sum[NUM_THREADS];
   step = 1.0/(double) num_steps;

   omp_set_num_threads(NUM_THREADS);
   double time1 = omp_get_wtime();
   #pragma omp parallel
   {
       int i, id,nthrds;
       double x;
       id = omp_get_thread_num();
       nthrds = omp_get_num_threads();
       // Only one thread should copy the number of threads to the global
       // value, so that multiple threads writing to the same address
       // don't conflict.
       if (id == 0) nthreads = nthrds;

       for (i=id, sum[id]=0.0;i< num_steps; i=i+nthrds){
             x = (i+0.5)*step;
             sum[id] += 4.0/(1.0+x*x);
       }
   }
   double time2 = omp_get_wtime();

   for(i=0, pi=0.0;i<nthreads;i++)pi += sum[i] * step;
   printf("%lf\n",pi);
   printf("%lf\n",(time2-time1));

}

I tried running the program without the if statement: it printed 0 for pi, whereas with the statement it ran correctly (printing 3.141593). When I instead set nthreads to the total number of threads (i.e. 2) outside the parallel region, it still gave the correct value of pi. Can anybody explain why the output differs?

Thank you!!

Shaggy

2 Answers

The variable nthreads needs to be set for the summation step in the final loop

for(i=0, pi=0.0;i<nthreads;i++)pi += sum[i] * step;

Removing the assignment will break this loop. Let me try to rephrase the comment's explanation of why you cannot simply write

nthreads = nthrds;

If you write to a shared memory location from multiple threads without any protection, the value may be wrong. However, one typically uses atomic as protection. In this case a #pragma omp single nowait would be much more appropriate. I guess the idea behind setting this variable dynamically, instead of just using NUM_THREADS, is that you are not guaranteed to actually get the number of threads you requested.
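As a rough sketch of the single nowait suggestion, the question's per-thread partial sums could be kept while the team size is recorded inside a single construct. The names pi_with_single and MAX_THREADS are hypothetical, and the _OPENMP fallback stubs are an assumption added only so the sketch also builds serially without OpenMP support:

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed serial fallbacks so the sketch also builds without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
static void omp_set_num_threads(int n) { (void)n; }
#endif

#define MAX_THREADS 64  /* hypothetical upper bound for this sketch */

/* Approximates pi with per-thread partial sums, as in the question,
   but records the team size inside a single construct. */
double pi_with_single(long num_steps, int requested, int *team_size)
{
    double sum[MAX_THREADS] = {0.0};
    double step = 1.0 / (double)num_steps;
    int nthreads = 1;

    /* Keep the request within the bounds of the sum array. */
    if (requested < 1) requested = 1;
    if (requested > MAX_THREADS) requested = MAX_THREADS;
    omp_set_num_threads(requested);

    #pragma omp parallel
    {
        int id = omp_get_thread_num();
        int nthrds = omp_get_num_threads();

        /* Exactly one thread executes the assignment; nowait lets the
           other threads continue without waiting at a barrier. */
        #pragma omp single nowait
        nthreads = nthrds;

        for (long i = id; i < num_steps; i += nthrds) {
            double x = (i + 0.5) * step;
            sum[id] += 4.0 / (1.0 + x * x);
        }
    }

    /* The parallel region ends with a barrier, so nthreads is set here. */
    double pi = 0.0;
    for (int i = 0; i < nthreads; i++) pi += sum[i] * step;
    *team_size = nthreads;
    return pi;
}
```

Calling pi_with_single(100000, 2, &n) from main mirrors the setup in the question; n then holds the team size that was actually used.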

Anyway. This tutorial is highly problematic. It tries to teach OpenMP using raw primitives rather than using the proper idiomatic high level tools. This causes lots of confusion. I believe this is a bad approach to teaching OpenMP, especially if you don't follow the tutorial to the very end.

The proper way to do it is actually given later in the tutorial (with some modernization by me):

double sum = 0.0;
double step = 1.0/(double) num_steps; // must be double; an int would truncate to 0
omp_set_num_threads(NUM_THREADS);
#pragma omp parallel for reduction(+:sum)
for (int i=0; i < num_steps; i++) {
    double x = (i+0.5)*step;
    sum = sum + 4.0/(1.0+x*x);
}
double pi = step * sum;
Zulan

When you set nthreads to the total number of threads (i.e. 2) outside the parallel region, it still gave the correct value of pi because the computer gave you the number of threads you asked for (2 in your case). But if you asked for 1 million threads, the computer might not give you that many. So, to find out how many threads were actually allocated to you, you need this piece of code:

    if (id == 0) nthreads = nthrds;
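To see this in isolation, here is a minimal sketch that requests a team and reports how many threads were actually delivered, using the same id == 0 idiom. The function name delivered_threads is hypothetical, and the _OPENMP fallback stubs are an assumption added only so it also compiles without OpenMP support:

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed serial fallbacks so the sketch also builds without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
static void omp_set_num_threads(int n) { (void)n; }
#endif

/* Requests `requested` threads and returns the team size actually used. */
int delivered_threads(int requested)
{
    int nthreads = 1;

    if (requested < 1) requested = 1;
    omp_set_num_threads(requested);

    #pragma omp parallel
    {
        int id = omp_get_thread_num();
        int nthrds = omp_get_num_threads();

        /* Only thread 0 writes, so there is a single writer to nthreads. */
        if (id == 0) nthreads = nthrds;
    }
    return nthreads;
}
```

The returned value never exceeds the request but may be smaller, which is why the final summation loop should use it rather than NUM_THREADS.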