Comparing doubles yields one less AND equal to the other

Question

So I have the following code:

 double which_min(const int i, const int j, const int nx,
                  const double step, const double local_cost,
                  double *D)
 {
      double tuple[3];

      // DIAG, LEFT, UP
      tuple[0] = (D[d2s(i-1, j-1, nx)] == NOT_VISITED) ? DBL_MAX : D[d2s(i-1, j-1, nx)] + step * local_cost;
      tuple[1] = (D[d2s(i, j-1, nx)] == NOT_VISITED) ? DBL_MAX : D[d2s(i, j-1, nx)] + local_cost;
      tuple[2] = (D[d2s(i-1, j, nx)] == NOT_VISITED) ? DBL_MAX : D[d2s(i-1, j, nx)] + local_cost;
      /*
      if (i == 83 && j == 124) printf("less? %d\n", tuple[1] < tuple[0]);
      if (i == 83 && j == 124) printf("equal? %d\n", tuple[1] == tuple[0]);
      if (i == 83 && j == 124) printf("greater? %d\n", tuple[1] > tuple[0]);
      */
      int min = (tuple[0] <= tuple[1]) ? 0 : 1;
      min = (tuple[min] <= tuple[2]) ? min : 2;

      if (i == 83 && j == 124) printf("min = %d\n", min + 1);

      return ((double) min + 1.0);
 }

For the problematic case I'm working with, when step = 1, i = 83 and j = 124, I'm getting

min = 2.

However, if I uncomment the other if statements, I get the correct value of min, but...

less? 1
equal? 1
greater? 0
min = 1

I know comparing doubles can be finicky, and I read this answer that mentions moving from 80-bit register to 64-bit memory, so I'm guessing the printf statements could be causing something like that. However,

It still feels counter intuitive that tuple[1] is both less AND equal to tuple[0], how is that possible?*
Can I change this so that it gives the same result consistently?

And by the way, this only happens in a 32-bit system, the 64-bit version doesn't give me any problems.

EDIT: I guess that the first printf results in a loss of precision so that the == comparison evaluates to 1 afterwards, but still, the main question is whether I can make the result consistent.

EDIT2: Indeed, changing the order of if statements to

 if (i == 83 && j == 124)
 {
      printf("equal? %d\n", tuple[1] == tuple[0]);
      printf("less? %d\n", tuple[1] < tuple[0]);
      printf("greater? %d\n", tuple[1] > tuple[0]);
 }

Results in

equal? 0
less? 0
greater? 0
min = 1

And the values I get with printf(%a) are

tuple[0] = 0x1.2594a8056d275p+5
tuple[1] = 0x1.2594a8056d275p+5
tuple[2] = 0x1.fffffffffffffp+1023

"how is that possible?" Because of a compiler bug/non-srandard floating point arithmetic? What are your compiler and compilation flags? — n. m. could be an AI, Sep 27 '16 at 14:25
Can you make a simple example with the exact values of tuple[1] and tuple[0]? Something like `double a = X; double b = Y; printf("%d %d %d\n", a < b, a == b, a > b);` — Art, Sep 27 '16 at 14:28
Perhaps you are using a compiler option aimed at providing faster DP arithmetic at the cost of occasional anomalies such as you describe. Several compilers offer such options. — John Bollinger, Sep 27 '16 at 14:35
As a comment to your edit. `printf` has nothing to do with this. There is no loss of precision. Printf receives an integer argument that's either 0 or 1, it is not involved in the comparison. — Art, Sep 27 '16 at 14:37
Consider also that it's a fairly serious problem for a computation to be sensitive to small differences between `double` values (relative to their magnitude). Floating-point arithmetic is *inexact* by nature. — John Bollinger, Sep 27 '16 at 14:37
Since `printf` changes the behavior of of the comparisons later (the calculation of `min`) I'm more inclined to believe that the problem is a buffer overflow of the `D` array or something like that. Floating point numbers are not magic, they behave predictably even when the compiler is allowed to do more with them than most people think and they don't behave like numbers in school. — Art, Sep 27 '16 at 14:39
I assume the `printf` causes what the other link mentions, moving a value from a register to memory... — Alexis, Sep 27 '16 at 14:42
@Alexis yeah, I was just about to take that statement back. saving the value around the call to printf to the stack would also explain this behavior. — Art, Sep 27 '16 at 14:43
The key point here is that you are seeing behavior that appears to be inconsistent with the requirements of the standard. Supposing that your program is not exhibiting undefined behavior -- from overrunning array bounds, for example -- that is not a problem that can be fixed in your code. Very likely, such a problem has to do with optimizations performed by your compiler; that's why @n.m. and I both brought up your compile options. — John Bollinger, Sep 27 '16 at 14:47
The gcc flag you're looking for is `-ffloat-store`. But it makes floating point math horribly slow. Just say that the code is only supported on `amd64` and newer machines. — Art, Sep 27 '16 at 14:47
Very well, I will investigate about the compiler options, I actually don't know the values being used (I call the C code through R). I just wanted to make sure there is nothing I can do to account for this behavior. — Alexis, Sep 27 '16 at 14:54
Ideas to narrow the problem: 1) re-iterate: review FP compiler options 2) Use `volatile double tuple[3];` to force compiler to not use extended precision intermediate results 3) Use `int lt = tuple[1] < tuple[0]; etc.` to force (encourage) the compare computation to occur before all the `printf()`s. 4) Watch for `(lt + eq + gt) != 1`. IMO, either the issue is #1 or compiler bug or UB elsewhere in code. — chux - Reinstate Monica, Sep 27 '16 at 15:36

Alexis · Accepted Answer · 2018-08-05T17:22:28.617

Indeed the results stored in tuple were being read with more precision than expected, causing some of the comparisons to yield contradicting results.

Adding the -ffloat-store flag to the compiler remedies the situation, but as mentioned in the comments, it might not be ideal, and apparently it isn't portable.

Declaring tuple as volatile seems to do the trick, but it had to be done as

volatile double *tuple = (double *)malloc(3 * sizeof(double));

And then

free((double *)tuple);

EDIT: apparently the above isn't necessary, using volatile double tuple[3]; is enough. In order to use it only in x32 compilations, I used the following typedef in C++:

#include <type_traits> // conditional
typedef std::conditional<sizeof(void*) == 4, volatile double, double>::type tuple_t;
tuple_t tuple[3];

Don't cast the result of `malloc()` and friends. It is unnecessary and can mask the absence of an appropriate prototype. — mlp, Aug 05 '18 at 20:54

Comparing doubles yields one less AND equal to the other

1 Answers1