
I am trying to sum all the elements of an array that is initialized in the same program. Since every element is independent of the others, I tried to perform the sum in parallel. My code is shown below:

#include <cmath>
#include <iostream>
using namespace std;

int main(int argc, char** argv)
{
  cout.precision(20);
  double sumre = 0., Mre[11];
  int n = 11;
  for(int i = 0; i < n; i++)
    Mre[i] = 2.*exp(-10*M_PI*i/(1.*n));

  #pragma omp parallel for reduction(+:sumre)
  for(int i = 0; i < n; i++)
  {
    sumre += Mre[i];
  }
  cout << sumre << "\n";
}

which I compile and run with:

g++ -O3 -o sum sumparallel.cpp -fopenmp
./sum

respectively. My problem is that the output differs every time I run it. Sometimes it gives

2.1220129388411006488

or

2.1220129388411002047

Does anyone have an idea what is happening here?

J.J.
  • `double` only has 15–17 significant decimal digits of precision. – kangshiyin Jun 10 '16 at 15:40
  • My main concern is that I am very interested in getting good precision, so I do not know how I could manage this by using OpenMP. – J.J. Jun 10 '16 at 15:41
  • Also relevant, and covered in several SO Qs and As, is the non-associativity of floating-point arithmetic and the non-determinism of parallel floating-point arithmetic. One of my own modest contributions to the sum of knowledge -- http://stackoverflow.com/questions/32074644/fix-arithmetic-error-in-distributed-version/ -- and there are others. – High Performance Mark Jun 10 '16 at 16:43

1 Answer


Some of these comments hint at the problem, but there are two distinct issues at play here:

Double precision numbers do not have 20 decimal digits of precision

If you want to print sumre at the maximum meaningful precision, use something like this:

#include <float.h>
#include <stdio.h>

int main(int argc, char* argv[])
{
  ...

  /* DBL_DECIMAL_DIG digits are enough to round-trip a double exactly */
  printf("%.*g\n", DBL_DECIMAL_DIG, number);

  return 0;
}

Floating point arithmetic is not associative

The consequence of this property is that roundoff error depends on the order in which the terms are added. In fact, the function you have defined, a Gaussian, is especially prone to roundoff in summation. Since the distribution of loop iterations to threads in an OpenMP parallel for is unspecified, the terms may be combined in a different order on each run, which is why you see different answers. To reduce the roundoff you can use the Kahan summation algorithm. An implementation with OpenMP would look something like this:

  ...
  double sum = 0.0, c = 0.0;

  #pragma omp parallel for reduction(+:sum,c)
  for(int i = 0; i < n; i++)
  {
    double y = Mre[i] - c;   /* apply the running compensation */
    double t = sum + y;
    c = (t - sum) - y;       /* recover the low-order bits lost in t */
    sum = t;
  }

  sum = sum - c;
  ...
Gavin Portwood