
I'm new to OpenMP and C. I tried the pi example from "Introduction to OpenMP" by Tim Mattson (Intel), but the outcome is not 3.14. I compared my code with my teacher's and they are the same, but the results are different.

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

//OpenMP example program: hello;
static long num_steps = 100000;
#define NUM_THREADS 2
double step;

int main()
{
    int nnum,i,j=0;
    step= 1.0/(double)num_steps;
    double sum[NUM_THREADS];
    double x,pi,result=0.0;

    omp_set_num_threads(NUM_THREADS);

    #pragma omp parallel
    {
        int id=omp_get_thread_num();
        int num=omp_get_num_threads();
        if(id==0) nnum = num;

        for(i=id,sum[id]=0.0;i<num_steps;i=i+num)
        {       
            x=(i+0.5)*step;
            sum[id]=sum[id]+(4.0/(1.0+x*x));
        }
    }

    while(j<nnum)
    {
        printf(" is %2.4f \n",sum[j]);
        result=result+sum[j];
        j++;
    }

    pi = step*result;
    printf("the result is %f \n",pi);

    return 0;
}
Ronny Brendel
Hoa
    I don't quite believe you've copied that faithfully, you're missing `#pragma omp for` in the line before the `for` statement. – High Performance Mark Aug 13 '15 at 15:46
  • 2
    Actually, now that I've had a closer look I think my previous comment was incorrect -- but the code your teacher gave you is horrible. It fails to use the built-in OpenMP feature of *reduction* where it would be most appropriate, and the distribution of work to threads is specified in the `for` statement -- that is a matter best left to the OpenMP run-time. Throw that code away and start afresh. – High Performance Mark Aug 13 '15 at 16:16

1 Answer

The code is wrong. The variable `i` is shared by the threads, and both increment it, so each thread skips iterations and you effectively perform only about 1/NUM_THREADS of the intended iterations (the concurrent updates to `i` are also a data race).

There are three different ways to fix it. The first is to write

#pragma omp parallel private(i)

This makes each thread use a separate copy of the variable. The second is to declare `i` inside the `#pragma omp parallel` block, which has the same effect (note how `id` is already private in your code because it is declared inside the parallel region).

The third and more interesting is to change the for statement to

#pragma omp for 
    for(i=0;i<num_steps;i++)

This makes the OpenMP compiler look at the loop and say "ok, this is a loop with num_steps iterations". It will then produce code that splits the 0..num_steps-1 range into one or more chunks and hands each chunk to one of the NUM_THREADS threads. For example, one thread will process 0 to 49999 and the other 50000 to 99999. It is important to notice that:

  • without #pragma omp for, the for loop specifies the indices for each separate thread, hence the iteration variable i must be private

  • with #pragma omp for, the for loop specifies the indices for the whole loop, and the iteration variable i does not have to be private because OpenMP will create a separate thread-private iteration variable on its own.

Paolo Bonzini