9

My program runs on both Linux and Windows, and I have to make sure that floating-point arithmetic gives the same results on both operating systems.

Here is the code:

    for (int i = 0; i < 100000; ++i)
    {
        float d_value = 10.0f / float(i);
        float p_value = 0.01f * float(i) + 100.0f;
    }

I build the code on Linux with "g++ -m32 -c -static -g -O0 -ffloat-store" and on Windows with VS2005 using "/fp:precise /O2".

When I printf "d_value" and "p_value", "d_value" is always identical on Linux and Windows, but "p_value" sometimes differs. For example, printing "p_value" in hexadecimal format:

windows:  42d5d1eb
linux:    42d5d1ec

Why does this happen?

My g++ version is

Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-8' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.4 --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-targets=all --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.4.5 (Debian 4.4.5-8)

I use the -ffloat-store flag because of a suggestion made here: Different math rounding behaviour between Linux, Mac OS X and Windows

lehins
hdbean
  • What about using [MPFR](http://www.mpfr.org/) ? – hek2mgl May 06 '13 at 09:21
  • For which value of `i` do you observe the given output? – Basile Starynkevitch May 06 '13 at 10:35
  • The language and the compiler are not all that is in control here, and basically an approach like this isn't going to produce matching results. For example, the parser is going to use some C or other library to convert the string of digits and a decimal point into a binary floating-point number. Those libraries vary, in particular across operating systems, compilers, etc. So even if the code were identical, the values fed to the code may vary, and as a result the output varies. IEEE 754 is also nasty with respect to rounding. – old_timer May 30 '14 at 02:34
  • And as mentioned already, the rules of the language, or perhaps implementation-defined options, can result in conversions up and down in precision which affect the results. Stack Overflow is riddled with "why isn't this floating-point less-than working" questions; the answer is that the compiler did a precision conversion, and as a result of that and rounding the value is not less than. – old_timer May 30 '14 at 02:35

2 Answers

8

Use /fp:strict on Windows to tell the compiler to produce code that strictly follows IEEE 754, and gcc -msse2 -mfpmath=sse on Linux to obtain the same behavior there.

The reasons for the differences you are seeing have been discussed in various places on Stack Overflow, but the best survey is David Monniaux's article.


The assembly instructions I obtain when compiling with gcc -msse2 -mfpmath=sse are as follows. Instructions cvtsi2ssq, divss, mulss, addss are the correct instructions to use, and they result in a program where p_value contains at one point 42d5d1ec.

    .globl  _main
    .align  4, 0x90
_main:                                  ## @main
    .cfi_startproc
## BB#0:
    pushq   %rbp
Ltmp2:
    .cfi_def_cfa_offset 16
Ltmp3:
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
Ltmp4:
    .cfi_def_cfa_register %rbp
    subq    $32, %rsp
    movl    $0, -4(%rbp)
    movl    $0, -8(%rbp)
LBB0_1:                                 ## =>This Inner Loop Header: Depth=1
    cmpl    $100000, -8(%rbp)       ## imm = 0x186A0
    jge LBB0_4
## BB#2:                                ##   in Loop: Header=BB0_1 Depth=1
    movq    _p_value@GOTPCREL(%rip), %rax
    movabsq $100, %rcx
    cvtsi2ssq   %rcx, %xmm0
    movss   LCPI0_0(%rip), %xmm1
    movabsq $10, %rcx
    cvtsi2ssq   %rcx, %xmm2
    cvtsi2ss    -8(%rbp), %xmm3
    divss   %xmm3, %xmm2
    movss   %xmm2, -12(%rbp)
    cvtsi2ss    -8(%rbp), %xmm2
    mulss   %xmm2, %xmm1
    addss   %xmm0, %xmm1
    movss   %xmm1, (%rax)
    movl    (%rax), %edx
    movl    %edx, -16(%rbp)
    leaq    L_.str(%rip), %rdi
    movl    -16(%rbp), %esi
    movb    $0, %al
    callq   _printf
    movl    %eax, -20(%rbp)         ## 4-byte Spill
## BB#3:                                ##   in Loop: Header=BB0_1 Depth=1
    movl    -8(%rbp), %eax
    addl    $1, %eax
    movl    %eax, -8(%rbp)
    jmp LBB0_1
LBB0_4:
    movl    -4(%rbp), %eax
    addq    $32, %rsp
    popq    %rbp
    ret
Pascal Cuoq
  • Thanks for your answer, but it doesn't work. The difference still happens. My g++ version is "gcc version 4.4.5 (Debian 4.4.5-8)". – hdbean May 06 '13 at 09:42
  • Maybe upgrading your GCC could help. Current GCC version is 4.8. – Basile Starynkevitch May 06 '13 at 10:14
  • @hdbean `42d5d1ec` is the correct value to obtain as one of the values of variable `p_value`. I checked this by reading the assembly generated by my own GCC and making sure it was using the right instructions. Is your Visual compiler generating SSE2 instructions for floating-point? It is so difficult and expensive to generate the exact right computation without these instructions that there is no chance it would generate the right computation without using them (as explained in the article my answer links to). – Pascal Cuoq May 06 '13 at 11:55
  • @Pascal Cuoq How can I make sure my VS2005 compiler is generating SSE2 instructions for floating-point? I have used "/fp:strict" to build the code in VS2005. – hdbean May 07 '13 at 01:42
  • @Pascal Cuoq Here is my VS2005 command line: `/O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /FD /EHsc /MT /fp:strict /Yu"stdafx.h" /Fp"Release\math_clac_test.pch" /Fo"Release\\" /Fd"Release\vc80.pdb" /W4 /nologo /c /Wp64 /Zi /TP /errorReport:prompt`. Is there any problem with it? – hdbean May 07 '13 at 02:30
  • @hdbean I think this question should tell you how to obtain the generated assembly: http://stackoverflow.com/questions/34635/how-do-i-get-the-assembler-output-from-a-c-file-in-vs2005 . I do not use VS2005 myself and cannot reproduce the issue because of this, but I will add the assembly I got with GCC so that you can tell if the instructions used are of the same kind. – Pascal Cuoq May 07 '13 at 06:01
  • @hdbean Googling for words related to your question I also found this question about turning on the generation of SSE2 instructions in VS200?: http://stackoverflow.com/questions/1480916/how-do-i-enable-the-sse-sse2-instruction-set-in-visual-studio-2008-using-cmake – Pascal Cuoq May 07 '13 at 06:07
  • @Pascal Cuoq Thanks for your help. For some reason, my program can't use SSE, so I have to find another way. Fortunately, it seems to work to use `__control87_2` to set the floating-point precision to 24 bits on Windows. On Linux, I still use the `-ffloat-store` flag with gcc. – hdbean May 08 '13 at 12:20
  • @hdbean If this answer helped you, you can accept it (click on the green checkmark) to mark it as useful for future readers who have a similar problem. – Pascal Cuoq May 08 '13 at 12:40
  • What do you mean by "... mulss, addss are the correct instructions to use"? The IEEE and C/C++ standards do not specify the intermediate precision so other choices (double precision math) are quite legal. That is the source of the discrepancy. – Bruce Dawson May 30 '14 at 02:28
  • @BruceDawson Actually, C99 specifies strictly the precision of intermediate results with a compiler-defined macro `FLT_EVAL_METHOD`, defined in a header `float.h` that C++03 incorporates the definition of, so you could say that C++ specifies the intermediate precision too in the same way as C99. The word “correct” in this sentence should be interpreted as “correct for a compilation platform that offers IEEE 754 single-precision for `float`” and defines `FLT_EVAL_METHOD` as 0, which I think is clear from the context. – Pascal Cuoq May 30 '14 at 05:59
  • @PascalCuoq Thanks for the pointer to FLT_EVAL_METHOD. My understanding of FLT_EVAL_METHOD is that it documents what the compiler will do for intermediate precision. That is helpful, but it still means that compilers can give different results, documented with FLT_EVAL_METHOD, and still be correct. Why do you say that FLT_EVAL_METHOD==0 is a clear assumption from the context? Is it the recommended setting? Wikipedia says that gcc defaults to 2 for x86, and 0 for x86_64, and that matches my experience. I believe VC++ used 1 for x86, and 1 or 0 for 64-bit depending on the compiler version (0 for VS 2012+). – Bruce Dawson May 31 '14 at 19:31
  • @BruceDawson `mulss` and `addss` are the correct instructions to use in the context of this answer, for “code that strictly follows IEEE 754”. The use of “strictly” in that sense could be argued against, but then I did not get to choose the values of option `/fp:` in Visual Studio, Microsoft developers did, and `strict` is the word they chose to mean this. – Pascal Cuoq May 31 '14 at 22:49
2

The precise results of your code are not fully defined by the IEEE and C/C++ standards. That is the source of the problem.

The main problem is that while all of your inputs are floats, that does not mean that the calculation must be done at float precision. The compiler can decide to use double precision for all intermediate values if it wants to. This tends to happen automatically when compiling for x87 FPUs, but the compiler (VC++ 2010, for instance) can do this expansion explicitly if it wants to, even when compiling SSE code.

This is not well understood. I shared my understanding of this a few years ago here:

http://randomascii.wordpress.com/2012/03/21/intermediate-floating-point-precision/

Some compilers let you specify the intermediate precision. If you can force all compilers to use the same intermediate precision then your results should be consistent.

Bruce Dawson
  • I provided further information in that question where the problem of excess precision arises: http://stackoverflow.com/a/23518798/139746 . As I said there, “C99 allows extra precision for floating-point expressions, which has been in the past wrongly interpreted by compiler makers as a license to make floating-point behavior erratic”. This question specifically refers to VS2005, which is not C99-compliant, so the only viable strategy is to take what one can get and run. What one can get is a “FLT_EVAL_METHOD=0” mode. Forget about making any other intermediate precision work, it won't. – Pascal Cuoq May 30 '14 at 08:14
  • For instance C99 requires that decimal floating-point constants are evaluated to the intermediate precision, not to the precision of the type. This is what GCC does because this is what the standard requires, but Clang does not do this (Clang developers do not care much for the FLT_EVAL_METHOD=2 mode). I also doubt that this is what VS2005 does (but I don't have this compiler). – Pascal Cuoq May 30 '14 at 08:18
  • In fact I would be delighted to have the information about Microsoft compilers to complement this blog post, if you have access to some of them: http://blog.frama-c.com/index.php?post/2013/07/24/More-on-FLT_EVAL_METHOD_2 – Pascal Cuoq May 30 '14 at 08:34