I am trying to understand how OMP treats different for loop declarations. I have:

int main()
{
   int i, A[10000]={...};
   double ave = 0.0;
   #pragma omp parallel for reduction(+:ave)
   for(i=0;i<10000;i++){
       ave += A[i];
   }

   ave /= 10000;

   printf("Average value = %0.4f\n",ave);
   return 0;
}

where {...} are the numbers from 1 to 10000. This code prints the correct value. If I replace #pragma omp parallel for reduction(+:ave) with #pragma omp parallel for private(ave), the printf prints 0.0000. I think I understand what reduction(oper:list) does, but I was wondering whether it can be replaced by private, and if so, how.

  • Short answer: No you can't. Longer one: well you could do something of this kind, but why would you want to do that? (hint: this is likely to be a bad idea) – Gilles Mar 14 '18 at 13:38
  • @Gilles The point is I saw it in someone else's code. That code produces the correct result, so it made me think. Moreover why is it printing 0? – Peter Hristov Mar 14 '18 at 13:41
  • So what do you expect as answer? An example of your code printing the right value without using the `reduction` clause? – Gilles Mar 14 '18 at 13:50
  • @Gilles More specifically one using private if possible. Thanks! – Peter Hristov Mar 14 '18 at 14:03
  • There are several ways to do a reduction by hand https://stackoverflow.com/q/35675466/2542702. I think it's a very useful exercise to go through for educational purposes. Especially in cases where operations don't commute which the `reduction` clause does not support. – Z boson Mar 14 '18 at 15:07
  • @Zboson Looks like a useful read. I will go through it. Thank you. – Peter Hristov Mar 15 '18 at 08:19

1 Answer

So yes, you can do reductions without the reduction clause. But that has a few downsides that you have to understand:

  1. You have to do things by hand, which is more error-prone:
    • declare local variables to store local accumulations;
    • initialize them correctly;
    • accumulate into them;
    • do the final reduction into the original shared variable inside a critical construct.
  2. This is harder to understand and maintain.
  3. This is potentially less efficient than what the reduction clause does for you.

Anyway, here is an example of that using your code:

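#include <stdio.h>

int main() {
   int i, A[10000]={...};
   double ave = 0.0;     /* shared: receives the final sum */
   double localAve;      /* made private below: one copy per thread */

   #pragma omp parallel private( i, localAve )
   {
       /* private copies start uninitialized, so zero them explicitly */
       localAve = 0;
       #pragma omp for
       for( i = 0; i < 10000; i++ ) {
           localAve += A[i];
       }
       /* one thread at a time adds its partial sum to the shared total */
       #pragma omp critical
       ave += localAve;
   }

   ave /= 10000;

   printf("Average value = %0.4f\n",ave);
   return 0;
}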

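This is a classic way of doing a reduction by hand, but notice that the variable that would have gone in the reduction clause is not declared private here. What becomes private is a local substitute for it, while the global variable must remain shared. With private(ave), each thread accumulates into its own uninitialized copy, and those copies are discarded at the end of the parallel region, so nothing is ever added to the original ave. That is consistent with the 0.0000 you observed.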

Gilles
  • Among the problems solved for you by reduction clause is the question of efficient accumulation of individual thread sums (for example, by a tree method to get a little parallelism). – tim18 Mar 14 '18 at 20:37
  • That makes sense. I will go through the original code (the one mentioned in previous comments) again to try and find what exactly is going on. It uses #pragma omp parallel for private... and no critical. I understand the principle of operation for each construct separately, just need to piece them together. – Peter Hristov Mar 15 '18 at 08:14