While trying to parallelize some C++ code, I observed that the results change every time I run it. Investigating the code, I didn't find any issue; on the contrary, I noticed that inverting the order in which I sum some numbers changes the resulting sum.

An example is shown below, in which I take the same numbers in a different order and calculate the sum (simulating a loop).

#include <iostream>

using namespace std;

int main()
{
    // Accumulate four values in one order...
    double XX1 = 0;
    XX1 = XX1 + 364.29129656036;
    XX1 = XX1 + 364.363465344009;
    XX1 = XX1 - 364.29129656039;
    XX1 = XX1 - 364.363465344038;

    // ...and the same four values in a different order.
    double XX2 = 0;
    XX2 = XX2 - 364.29129656039;
    XX2 = XX2 - 364.363465344038;
    XX2 = XX2 + 364.29129656036;
    XX2 = XX2 + 364.363465344009;

    cout.precision(27);
    cout << XX1 << endl;
    cout << XX2 << endl;

    return 0;
}

XX1 = -5.89466253586579114198684692e-11
XX2 = -5.88897819397971034049987793e-11

Do you know:

  • Is this normal?
  • Is there a way to avoid it? Consider that, because of the parallelisation, the sum will be done in a "random" order, so there is no way to always follow the same order.

Thanks in advance.

FGP92
  • For why order can matter, see: https://stackoverflow.com/questions/588004/is-floating-point-math-broken – NathanOliver Sep 20 '22 at 18:04
  • _" the results changes every time I run the code."_ I find that claim suspicious. Do you mean that when you change the code, the results change? Or are you _really_ seeing different results with each run of this exact code? – Drew Dormann Sep 20 '22 at 18:06
  • @DrewDormann He's talking about a parallelized version that isn't shown; the assumption is that different runs of the parallelized version are like different orderings in this code. – Nelfeal Sep 20 '22 at 18:10
  • Side note: precision(27); is way too much for a double. 15 or 16 significant figures is about what you can meaningfully expect from a double. Floating point is an approximate representation, so you are dealing with approximations. You might also want to familiarize yourself with an issue known as "the small difference of large numbers". Your answers have about 2 meaningful digits; the rest of what you are displaying is meaningless. So both your answers are -5.9e-11 and match within the limits of accuracy you have. – Avi Berger Sep 20 '22 at 18:17
  • 1. Yes, this is normal, because FP operations are not associative. 2. The naive sum is not numerically stable. 3. You need to use static partitioning to get deterministic results with multiple threads. Note that large non-deterministic differences are generally due to numerical instability. The most famous case is [**catastrophic cancellation**](https://en.wikipedia.org/wiki/Catastrophic_cancellation). This is your case. A better summation algorithm like Kahan helps (see the sketch after these comments), but you should really avoid doing such subtractions. – Jérôme Richard Sep 20 '22 at 18:19
  • `cout.precision(27);` wishful thinking. – Marek R Sep 20 '22 at 18:19
  • @Nelfeal you are right. I have developed a parallelised tool that is too long to publish here. In this code, I saw that every time I build and run the same .cpp file I get some really small changes in some output variables (e.g. instead of 1.1e-15 I get 1.12e-15). Running with a single thread, instead, the result is always the same. Initially I was thinking that there was something wrong in the mutex, or in the thread management, but after hours of debugging I realised that these differences are due to something similar to what I have shown above. – FGP92 Sep 20 '22 at 18:50
  • @AviBerger thanks for the answer. I know that it makes no sense to look at digits past the 15th or 16th, but seeing these changes I thought there were errors in the thread management. Unfortunately, after checking literally everything in the code I didn't find any issue. But then I discovered that this FP operation is not associative and I realised the reason for the issue. Thanks – FGP92 Sep 20 '22 at 18:54
  • @JérômeRichard thanks a lot for the answer. Can you indicate a good way to perform this sum in a deterministic way? I would need at least to confirm that the parallelisation is working correctly. Thanks in advance. – FGP92 Sep 20 '22 at 18:55
  • Due to limited precision, there are only 15 or 16 decimal digits of precision for a double. However, the full value *as represented* in the `double` may require many more digits to express without rounding. I'd `cout.precision(80);` and `cout << std::fixed << XX1 << "\n";` (or dump it in floating hexadecimal, or in floating binary); see the printing sketch after these comments. – Eljay Sep 20 '22 at 18:55
  • I advise you to use OpenMP to do that, since it is simple. You just need something like `#pragma omp parallel for reduction(+:your_reduction_variable) schedule(static)` before the summation loop (see the sketch after these comments). As for the numerical instability, you need to split the loop into chunks and do a precise summation on each chunk (though that will not fix the catastrophic cancellation, which is the main problem). – Jérôme Richard Sep 23 '22 at 16:30
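
For reference, here is a minimal sketch of the Kahan (compensated) summation mentioned in the comments. The function name kahan_sum and the use of a vector are illustrative, not from the original post; the sample values are the four from the question. Compensated summation reduces the rounding error a naive loop accumulates, but it does not remove the catastrophic cancellation inherent in subtracting nearly equal numbers.

#include <iostream>
#include <vector>

// Kahan summation: carry a running compensation term so the low-order
// bits lost in each addition are fed back into the next one.
double kahan_sum(const std::vector<double>& values)
{
    double sum = 0.0;
    double c = 0.0;            // compensation for lost low-order bits
    for (double v : values) {
        double y = v - c;      // apply the correction from the previous step
        double t = sum + y;    // sum is large, y is small: low bits of y are lost
        c = (t - sum) - y;     // algebraically recover what was lost
        sum = t;
    }
    return sum;
}

int main()
{
    std::vector<double> values = {  364.29129656036,  364.363465344009,
                                   -364.29129656039, -364.363465344038 };
    std::cout.precision(17);
    std::cout << kahan_sum(values) << std::endl;
    return 0;
}

Note that aggressive floating-point optimization flags (e.g. -ffast-math) can optimize the compensation away, so such code should be built with standard-conforming FP settings.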
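And a small sketch of the exact-printing idea from Eljay's comment, assuming the XX1 value printed in the question; std::hexfloat is standard since C++11:

#include <iostream>
#include <iomanip>

int main()
{
    double x = -5.89466253586579114198684692e-11;
    // Print the full decimal expansion of the value actually stored
    std::cout << std::setprecision(80) << std::fixed << x << std::endl;
    // Or print the exact bits as a hexadecimal floating-point literal
    std::cout << std::hexfloat << x << std::endl;
    return 0;
}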
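Finally, a sketch of a deterministic parallel sum along the lines suggested in the comments. The helper name deterministic_sum is illustrative; compile with OpenMP enabled (e.g. -fopenmp). The built-in reduction(+:...) clause is simpler, but the order in which OpenMP combines the per-thread copies is unspecified, so here each thread fills its own slot and the partial sums are combined serially in thread-id order, which is reproducible for a fixed thread count:

#include <cstddef>
#include <iostream>
#include <vector>
#include <omp.h>

double deterministic_sum(const std::vector<double>& values)
{
    const int nthreads = omp_get_max_threads();
    std::vector<double> partial(nthreads, 0.0);

    #pragma omp parallel num_threads(nthreads)
    {
        const int tid = omp_get_thread_num();
        // schedule(static) assigns every thread the same iterations on every run
        #pragma omp for schedule(static)
        for (std::size_t i = 0; i < values.size(); ++i)
            partial[tid] += values[i];
    }

    // Combine serially, always in thread-id order
    double sum = 0.0;
    for (double p : partial)
        sum += p;
    return sum;
}

int main()
{
    std::vector<double> values = {  364.29129656036,  364.363465344009,
                                   -364.29129656039, -364.363465344038 };
    std::cout.precision(17);
    std::cout << deterministic_sum(values) << std::endl;
    return 0;
}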

0 Answers