
Look at this program:

#include <iostream>
#include <cmath>

using namespace std;

typedef pair<int, int> coords;

double dist(coords a, coords b)
{
    return sqrt((a.first - b.first) * (a.first - b.first) +
              (a.second - b.second) * (a.second - b.second));
}

int main()
{
    coords A = make_pair(1, 0);
    coords B = make_pair(0, 1);
    coords C = make_pair(-1, 0);
    coords D = make_pair(0, -1);

    cerr.precision(20);
    cerr << dist(A, B) + dist(C, D) << endl;
    cerr << dist(A, D) + dist(B, C) << endl;

    if(dist(A, B) + dist(C, D) > dist(A, D) + dist(B, C))
    {
        cerr << "*" << endl;
    }
    return 0;
}

The function dist returns the distance between two points. A, B, C, D are the corners of a square.

It should hold that dist(A, B) == dist(B, C) == dist(C, D) == dist(D, A) == sqrt(2),

and dist(A, B) + dist(C, D) == dist(A, D) + dist(B, C) == 2 * sqrt(2).

I am using GNU/Linux, i586, GCC 4.8.2.

I compile this program and run it:

$ g++ test.cpp ; ./a.out 
2.8284271247461902909
2.8284271247461902909
*

We see that the program prints equal values for dist(A, B) + dist(C, D) and dist(A, D) + dist(B, C), yet the condition dist(A, B) + dist(C, D) > dist(A, D) + dist(B, C) is true!

When I compile with -O2, it looks OK:

$ g++ test.cpp -O2 ; ./a.out 
2.8284271247461902909
2.8284271247461902909

I think it is a GCC bug, and it is not directly related to floating-point accuracy, because in this case the values dist(A, B) + dist(C, D) and dist(A, D) + dist(B, C) MUST be equal (each of them is sqrt(2) + sqrt(2)).

When I change the function dist to:

double dist(coords a, coords b)
{
    double x = sqrt((a.first - b.first) * (a.first - b.first) + (a.second - b.second) * (a.second - b.second));
    return x;
}

the program runs correctly. So the problem is not in the floating-point representation of the numbers, but in the code GCC generates.
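For what it's worth, here is a sketch of a variant that forces the rounding explicitly. The `static_cast<double>` is my addition, following the usual reading that a cast or assignment must discard extra precision, while a plain return expression may keep it:

```cpp
#include <cmath>
#include <utility>

using namespace std;

typedef pair<int, int> coords;

// Sketch: the cast should force the result to be rounded to
// double's 53-bit precision before it is returned, instead of
// travelling back in an 80-bit x87 register.
double dist(coords a, coords b)
{
    return static_cast<double>(
        sqrt(double((a.first - b.first) * (a.first - b.first) +
                    (a.second - b.second) * (a.second - b.second))));
}
```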

Edit:

A simplified example for a 32-bit compiler:

#include <iostream>
#include <cmath>
using namespace std;
int main()
{
    if (sqrt(2) != sqrt(2))
    {
        cout << "Unequal" << endl;
    }
    else
    {
        cout << "Equal" << endl;
    }
    return 0;
}

Run without optimization:

$ g++ test.cpp ; ./a.out 
Unequal

Run with -ffloat-store:

$ g++ test.cpp -ffloat-store ; ./a.out 
Equal

Solution:

Probably it is the famous "not a bug" GCC bug #323: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=323

Compiling with -ffloat-store solves the problem.
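A source-level alternative mentioned in the comments is to avoid exact equality altogether and compare with a tolerance. A minimal sketch (the name `nearly_equal` and the eps value are mine, for illustration only):

```cpp
#include <cmath>

// Sketch: relative-tolerance comparison; eps is an arbitrary
// illustrative choice, not a universal constant.
bool nearly_equal(double a, double b, double eps = 1e-12)
{
    return std::fabs(a - b) <= eps * std::fmax(std::fabs(a), std::fabs(b));
}
```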

  • The duplicate link's main answer isn't great, but it's enough for you to get the general idea and supplies links for further reading.... – Tony Delroy Nov 12 '14 at 11:11
  • I know the basics of floating point arithmetics and IEEE754 standard, I understand, why 0.1 + 0.2 != 0.3. But in this case values MUST be equal, and we see, that there is a difference in compiler behavior when using -O0 and -O2. – Denis Kirienko Nov 12 '14 at 11:17
  • (This is not a duplicate.) The CPU calculates with greater precision than `double` internally. I suspect that in the unoptimised case, *one* of the sides of the comparison gets temporarily stored in a register with less precision, possibly because only one side gets inlined, and then compared at the greater precision (you'll need to look at the assembly to be sure). If that is so, I would consider this a compiler bug. – molbdnilo Nov 12 '14 at 11:26
  • @DenisKirienko: trying to reason about this as you are is flawed... the compiler can e.g. for x86-family CPUs truncate 80-bit floating pointer register values down to 64-bit at different times depending on unspecified order of evaluation, optimisation etc.. Same basic issue. It's not a compiler bug... the Standard allows this. More specifically, each of the return values of `dist()` might or might not be truncated to 64-bit before being summed, and it could be only one side is truncated before the comparison. Storing temporary results back to memory normally causes truncation. – Tony Delroy Nov 12 '14 at 11:38
  • @DenisKirienko - `But in this case values MUST be equal` When you use floating point, there is never a "MUST be equal". – PaulMcKenzie Nov 12 '14 at 11:44
  • @tony-d: Yes, probably you are right, and I read the article (http://www.parashift.com/c++-faq-lite/floating-point-arith2.html), where the same problem is described. Thank you. – Denis Kirienko Nov 12 '14 at 11:46
  • @maxim-yegorushkin set the correct duplicate, thank you! Compiling with -ffloat-store also changes the output of the program. – Denis Kirienko Nov 12 '14 at 11:52
  • @PaulMcKenzie: "never a "MUST be equal"" - I actually argued (I believe correctly) the opposite on another question today... scenario `double d; std::istringstream iss("-1.0"); iss >> d; d == -1.0;`. That should never break, as the stream->`double` conversion has no excuse for not being exact and the values are exact well short of the epsilon level. But, not sure I'd bet my life on it without spending an hour with my nose in the Standard (and a few more with my compiler source code). – Tony Delroy Nov 12 '14 at 11:58
  • I don't see where the cited question really addresses the issue. There is a real issue of conformance here: can a compiler use extended precision when returning a value, which is by definition initialization of a variable of the return type? (The standard is clear that assignment or a cast must force the value to the required precision. But if initialization doesn't: `double x = ...; return x;` doesn't guarantee the correct precision either; you'd need `double x; x = ...; return x;` Or just `static_cast<double>(...)`. Of course, most compilers need special flags to be conformant.) – James Kanze Nov 12 '14 at 12:00
  • @TonyD At least in C: "The accuracy of the floating-point operations (+, -, *, /) and of the library functions in <math.h> and <complex.h> that return floating-point results is implementation-defined, as is the accuracy of the conversions between floating-point internal representations and string representations performed by the library functions in <stdlib.h>, <stdio.h>, and <wchar.h>. The implementation may state that the accuracy is unknown." (From a QoI point of view, of course...) – James Kanze Nov 12 '14 at 12:02
  • _"I think, it is a gcc-bug, and it is not directly related to floating-point operation accuracy, because in this case values dist(A, B) + dist(C, D) and dist(A, D) + dist(B, C) MUST BE equal"_ You just contradicted yourself in a single sentence. Congratulations! – Lightness Races in Orbit Nov 12 '14 at 12:04
  • @DenisKirienko: They "must" be equal... why?? – Lightness Races in Orbit Nov 12 '14 at 12:04
  • GCC is probably promoting the int values to plain floats before doing the computation, and then converting the result of sqrt to double. In the optimised case, it notices the two-step conversion and reduces it to a single-step conversion to double, before any computation is done. Is it really a bug, or does the standard mandate a promotion to the closest type? – didierc Nov 12 '14 at 12:05
  • I'm pretty sure this is a duplicate, but it's *not* a duplicate of the things it's been marked a duplicate of. This is a well-known and annoying behaviour of almost all compilers. – tmyklebu Nov 12 '14 at 15:00
  • possible duplicate of [Why does returning a floating-point value change its value?](http://stackoverflow.com/questions/16888621/why-does-returning-a-floating-point-value-change-its-value) – tmyklebu Nov 12 '14 at 15:03
  • @Denis Kirienko: Side remark: consider using the standard math function `hypot()` to implement `dist()`. – njuffa Nov 12 '14 at 15:50
  • use Epsilon for comparison. – CrazyC Nov 12 '14 at 17:23
  • Floating point math in C++ does not even require `a == a` to be true; let alone the more complicated expressions you have posted. It's certainly not a compiler bug. – M.M Nov 14 '14 at 23:33
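njuffa's `hypot()` suggestion above could look like this (a sketch; `std::hypot` is standard since C++11 and also avoids intermediate overflow/underflow):

```cpp
#include <cmath>
#include <utility>

using namespace std;

typedef pair<int, int> coords;

// Sketch of a hypot()-based dist(): hypot(x, y) computes
// sqrt(x*x + y*y) carefully.
double dist(coords a, coords b)
{
    return hypot(double(a.first - b.first), double(a.second - b.second));
}
```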

1 Answer


This seemingly strange behavior is due to the way the old x87 floating-point unit works: its register format is an 80-bit long double with 64 bits of precision, while a double stored in memory is 64 bits wide with a 53-bit significand. What happens is that the compiler spills one of the sqrt(2) results to memory (since sqrt returns a double, this rounds it to the 53-bit significand of that type) so that the FP register stack is clear for the next call to sqrt(2). It then compares that loaded-from-memory 53-bits-of-precision value with the unrounded 64-bits-of-precision value coming back from the other sqrt(2) call, and they come up different because they were rounded differently, as you can see from this assembler output (annotations mine; I used your second code snippet with the 2s changed to 2.0s for clarity, compiled with -Wall -O0 -m32 -mfpmath=387 -march=i586 -fno-builtin on Godbolt):

main:
    # Prologue
    leal    4(%esp), %ecx
    andl    $-16, %esp
    pushl   -4(%ecx)
    pushl   %ebp
    movl    %esp, %ebp
    pushl   %ecx
    subl    $20, %esp
    # Argument push (2.0)
    subl    $8, %esp
    movl    $0, %eax
    movl    $1073741824, %edx
    pushl   %edx
    pushl   %eax
    # sqrt(2.0)
    call    sqrt
    # Return value spill
    addl    $16, %esp
    fstpl   -16(%ebp)
    # Argument push (2.0)
    subl    $8, %esp
    movl    $0, %eax
    movl    $1073741824, %edx
    pushl   %edx
    pushl   %eax
    # sqrt(2.0)
    call    sqrt
    addl    $16, %esp
    # Comparison -- see how one arg is loaded from a spill slot while the other is
    # coming from the ST(0) i387 register
    fldl    -16(%ebp)
    fucompp
    fnstsw  %ax
    # Status word interpretation
    andb    $69, %ah
    xorb    $64, %ah
    setne   %al
    testb   %al, %al
    # The branch was here, but it and the code below was snipped for brevity's sake

Verdict: the x87 is the weirdo. -mfpmath=sse is the definitive fix for this behavior: it makes GCC define FLT_EVAL_METHOD as 0, since the SSE(2) floating-point support handles single and double precision only. The -ffloat-store switch works around it for this program as well, but it is not recommended as a general-purpose workaround: it makes your program slower due to the extra spills and fills, and it does not work in all cases. Of course, moving to a 64-bit CPU/OS/compiler combination fixes this too, because the x86-64 ABI uses SSE2 for floating-point math by default.

LThode
  • `-mfpmath=sse` is a definite fix: it makes GCC define `FLT_EVAL_METHOD` as 0 and (more or less) stick to it. `-ffloat-store` is a workaround that fixes this particular program but doesn't work in all cases. Examples of programs not fixed by it are provided in http://arxiv.org/abs/cs/0701192 (a survey written just before SSE2 became widely supported in compilers and C99 widely implemented, that shows how bad the situation was then. It has improved now) – Pascal Cuoq Nov 15 '14 at 02:38
  • Agreed that `-mfpmath=sse` is the best way to deal with this -- I just call it a workaround because it ignores the x87 unit entirely. Sidenote: turning the precision setting in the x87 control word down to 53bits *doesn't* work, because that doesn't limit the exponent range to the same as `double`, just the significand length. – LThode Nov 17 '14 at 15:03
  • And setting the x87 control word PLUS `-ffloat-store` together still do not guarantee exact IEEE 754 binary64 results because of the denormals that can be double-rounded once to 53-bit precision then the smaller effective precision, as opposed to the single rounding mandated by IEEE 754. This cannot happen with add/sub, but it makes IEEE 754 multiplication rather expensive to emulate when all you have is an x87, as shown in http://stackoverflow.com/a/18498462/139746 – Pascal Cuoq Nov 17 '14 at 15:12
  • @Ruslan -- my understanding is that the leading 1 is explicit in x87 extended precision, hence you only get 63 usable bits of precision, while it is implicit in IEEE double precision. – LThode Nov 16 '15 at 13:47