
The following code outputs different results for the variables `e` and `f` on a 32-bit x86 machine, but the same results on a 64-bit x86 machine. Why? Mathematically the same expression is being evaluated in both cases, but technically it is not.

#include <cstdio>
int main()
{
  double a,b,c,d,e,f;
  a=-8988465674311578540726.0;
  b=+8988465674311578540726.0;
  c=1925283223.0;
  d=4294967296.0;
  e=(c/d)*(b-a)+a;
  printf("%.80f\n",e);
  f=c/d;
  f*=(b-a);
  f+=a;
  printf("%.80f\n",f);
}

Note: 32-bit x86 code can be generated with `gcc -m32`. Thanks @Peter Cordes https://stackoverflow.com/users/224132/peter-cordes

See also

is boost::random::uniform_real_distribution supposed to be the same across processors?

--- update for user Madivad

64-bit output
-930037765265417043968.00000...
-930037765265417043968.00000...

32-bit output
-930037765265416519680.00000...
-930037765265417043968.00000...

The "mathematically correct" output can be produced by this Python code, which does the arithmetic exactly with rationals and rounds to double only once:

from fractions import Fraction

a = -8988465674311578540726
b = +8988465674311578540726
c = 1925283223
d = 4294967296
# exact rational arithmetic; the only rounding happens in float()
print("%.80f" % float(Fraction(c, d) * (b - a) + a))

-930037765265416519680.000...
don bright
    out of curiosity, what are the results you're getting? – Madivad Nov 19 '15 at 04:07
  • 1
    "The following code will output different results for variables 'e' and 'f' on a 32 bit machine but the same results on a 64 bit machine." -- I'll buy that for *specific* pairs of machines / machine classes, but your assertion cannot be supported as a general rule. – John Bollinger Nov 19 '15 at 04:07
  • 4
    @don: I just finished solving your RNG question, and this has the same answer: x87 FP has 80bit internal precision, while SSE FP has 64bit precision even for temporaries. I wish you'd asked this first :P I could have saved an hour wading through the integer part of the Boost mersenne PRNG looking for differences when run in 32 vs. 64bit mode. – Peter Cordes Nov 19 '15 at 04:09
  • 1
    Also, `-m32` to make 32bit binaries doesn't "simulate" a 32bit machine. It really does make 32bit binaries that you could run on a 32bit system. That's why you need separate library packages to support it. Modern 64bit x86 systems have both sets of libraries installed and can transparently run i386 and amd64 binaries. (This is called "multiarch" support, and took a couple years after the initial port of Linux distros to amd64 to really settle down. But now it's just assumed, and Windows is the same way: 32 and 64bit versions of most libraries.) – Peter Cordes Nov 19 '15 at 04:13
  • Try `gcc -Ofast`; I get the same results with `-m32` and without, because `gcc` evaluates the constants at compile time – fghj Nov 19 '15 at 04:18
  • 1
    @PeterCordes to complete the comment you should note that 32-bit code can use x87 instructions but 64-bit code *must* use SSE. – Mark Ransom Nov 19 '15 at 04:24
  • 1
    @MarkRansom: I went into much more detail in my answer on http://stackoverflow.com/questions/33789476/is-boostrandomuniform-real-distribution-supposed-to-be-the-same-across-proce. I guess I should have explicitly said that people should look there. :P – Peter Cordes Nov 19 '15 at 04:28
    Sorry Peter, I was figuring it out simultaneously with you (took me a lot longer though); thought this would be an interesting question as my original was specifically about boost – don bright Nov 19 '15 at 04:51
  • 1
    You should also note that Clang has had a bug where it did not report `FLT_EVAL_METHOD` correctly when commandline options were making it use the historical 387 instructions (http://stackoverflow.com/questions/17663780/is-there-a-document-describing-how-clang-handles-excess-floating-point-precision#comment26838410_17663780 ) and that the code it produces in this case does not conform to the description of `FLT_EVAL_METHOD=2` anyway (http://blog.frama-c.com/index.php?post/2013/07/24/More-on-FLT_EVAL_METHOD_2 ) – Pascal Cuoq Nov 19 '15 at 10:26

1 Answer


FLT_EVAL_METHOD.

C allows intermediate floating-point calculations to be carried out in wider types, depending on FLT_EVAL_METHOD. So when wider types are used and the two code paths differ, expressions that are mathematically equal can produce slightly different results.


Except for assignment and cast (which remove all extra range and precision), the values yielded by operators with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type. The use of evaluation formats is characterized by the implementation-defined value of FLT_EVAL_METHOD:

-1. indeterminable;
0. evaluate all operations and constants just to the range and precision of the type;
1. evaluate operations and constants of type float and double to the range and precision of the double type, evaluate long double operations and constants to the range and precision of the long double type;
2. evaluate all operations and constants to the range and precision of the long double type.
C11dr §5.2.4.2.2 9

[Edit]

@Pascal Cuoq has a useful comment above on the veracity of FLT_EVAL_METHOD. In any case, FP code optimized differently along various code paths may produce different results. This may occur when FLT_EVAL_METHOD != 0 or when the compiler is not strictly conforming.

Concerning a detail of the post: the operation X*Y + Z done as two operations, * and then +, can be contrasted with fma(), which "compute(s) (x × y) + z, rounded as one ternary operation: they compute the value (as if) to infinite precision and round once to the result format, according to the current rounding mode." C11 §7.12.13.1 2. Another candidate for the difference in results is the compiler applying an fma to the line e=(c/d)*(b-a)+a;.

chux - Reinstate Monica