
Sample Code

#include "stdio.h"
#include <stdint.h>

int main()
{
    double d1 = 210.01;
    uint32_t m = 1000;
    uint32_t v1 = (uint32_t) (d1 * m);

    printf("%d",v1);
    return 0;
}

Output
1. When compiling with the -m32 option (i.e. gcc -g3 -m32 test.c)

/test 174 # ./a.out
210009

2. When compiling with the -m64 option (i.e. gcc -g3 -m64 test.c)

test 176 # ./a.out
210010

Why do I get a difference?
My understanding "was" that m would be promoted to double, the multiplication performed in double, and the result then converted down to uint32_t. Moreover, since we are using a stdint.h fixed-width integer type, I expected this to remove any ambiguity related to architecture, etc.

I know something is fishy here, but not able to pin it down.

Update:
Just to clarify (for one of the comments): the above behavior is seen with both gcc and g++.

asked by kumar_m_kiran; edited by Rob
  • If you call gcc like that, it is compiled as C code (which it appears to be), not C++. Don't add tags for unrelated languages. – too honest for this site Apr 06 '16 at 17:28
  • @Olaf: Ok. I will update the question. It occurs for both gcc and g++ compiler. So it is not specific to c/c++. Thanks! – kumar_m_kiran Apr 06 '16 at 17:31
  • 1) Don't make this a C++ question! C and C++ are **different** languages. 1a) You should use different code in C++ with iostreams and C++ cast operators. 2) You use the wrong type-specifier in the format string. Use `inttypes.h` macros to print a fixed width integer. – too honest for this site Apr 06 '16 at 17:33
  • Possible duplicate of [Is floating point math broken?](http://stackoverflow.com/questions/588004/is-floating-point-math-broken) – too honest for this site Apr 06 '16 at 17:35
  • @Olaf, close, but not a duplicate - it doesn't address the question of -m32 and -m64 affecting the result. For example, I have no idea why it is happening. – SergeyA Apr 06 '16 at 17:37
  • Hint: Pick a paper and write down the **exact** binary representation of `210.01`. – too honest for this site Apr 06 '16 at 17:37
  • @Olaf, see above - while it is certainly related to floating-point representation, it does not answer the question as asked. And this question should not be downvoted; it is interesting why -m* has this effect. – SergeyA Apr 06 '16 at 17:38
  • @SergeyA: It very well does. The value depends on the implementation. x64 and x86 (assuming these are used by OP, but the same applies for ARM) can be seen as different CPUs, so there might very well be differences how the FPU works. Keeping the CV, but removed the DV (still not ambivalent about this). – too honest for this site Apr 06 '16 at 17:38
  • The lesson is: if you want the nearest `int`, add (for positive values) 0.5 before truncating with a type conversion. `printf` rounds, but type conversion does not. – Weather Vane Apr 06 '16 at 17:40
  • @Olaf, sorry, it is not an answer, just a speculation. *Different switches produce different code.* – SergeyA Apr 06 '16 at 17:41
  • @SergeyA: That's why I made that a comment. And of course, they generate different code. – too honest for this site Apr 06 '16 at 17:42
  • But this question is not answered in the duplicate either! So it is not a duplicate. – SergeyA Apr 06 '16 at 17:43
  • @SergeyA: The dup very well does. Just because its conclusion is that float is imprecise per se and you always have to handle minor variations in the result. – too honest for this site Apr 06 '16 at 17:44
  • @Olaf, I personally would still be very interested in the real answer - why architecture switch affects floating point arithmetics. – SergeyA Apr 06 '16 at 17:46
  • @SergeyA: Why does compiling for a different CPU architecture which uses a different FPU change the behaviour slightly? I think this is very clear. – too honest for this site Apr 06 '16 at 17:49
  • It should not be related to FPU. Since double representation remains the same (I do not think compiler changes representation based on this), size of double remains the same, the only thing which is likely to change is rounding to the next representable double - up or down. I wonder why it changes. – SergeyA Apr 06 '16 at 17:53
  • I tried your code on my Linux system: gcc -m32 test.c results in 210009. If I use -O0 with -m32, it results in 210009. If I use -O1, -O2, or -O3 with -m32, it results in 210010. gcc (SUSE Linux x86-64) 4.3.4 [gcc-4_3-branch revision 152973], Intel Xeon CPU. – ron Apr 06 '16 at 17:57
  • What is `sizeof(double)` on each system? Isn't `double` *at least* 64 bits? – Weather Vane Apr 06 '16 at 17:59
  • [This page](https://msdn.microsoft.com/en-us/library/cc953fe1.aspx) says *"Type double is a floating point type that is larger than or equal to type float, but shorter than or equal to the size of type long double."* – Weather Vane Apr 06 '16 at 18:05
  • There is no reason to downvote or CV this question; it is not even a duplicate of the "floating point math is broken" question - actually the rounding from the 32-bit code does **not** conform to the C99 and C11 standards, and in this case the correct result from a conforming C implementation with 64-bit IEEE 754 double **is** 210010, period. – Antti Haapala -- Слава Україні Apr 06 '16 at 20:16

1 Answer


I can confirm the results on my gcc (Ubuntu 5.2.1-22ubuntu2). What seems to happen is that the unoptimized 32-bit code uses the 387 FPU with the FMUL instruction, whereas the 64-bit code uses the SSE MULSD instruction (just run gcc -S test.c with the different options and compare the assembler output). And as is well known, the 387 FPU registers are 80 bits wide, with a 64-bit significand - more precision than the 53-bit significand of a 64-bit double - so it rounds differently here. The reason, of course, is that the exact value of the 64-bit IEEE double 210.01 is not that, but

 210.009999999999990905052982270717620849609375

and when you multiply by 1000, you are not actually just shifting the decimal point - after all, there is no decimal point but a binary point in the floating-point value - so the value must be rounded. With 64-bit doubles it is rounded up. In the 80-bit 387 FPU registers the calculation is more precise, and there it ends up being rounded down.

After reading about this a bit more, I believe the result generated by gcc on the 32-bit arch is not standard-conforming. Thus if you force the standard to C99 or C11 with -std=c99 or -std=c11, you will get the correct result:

% gcc -m32 -std=c11 test.c; ./a.out
210010

If you do not want to force C99 or C11 standard, you could also use the -fexcess-precision=standard switch.


However, the fun does not stop here.

% gcc -m32 test.c; ./a.out
210009
% gcc -m32 -O3 test.c; ./a.out
210010

So you get the "correct" result if you compile with -O3; this is of course because the compiler constant-folds the calculation at compile time using correctly rounded 64-bit arithmetic, instead of emitting an x87 multiply.


To confirm that extra precision affects it, you can use a long double:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
    long double d1 = 210.01;  // double constant to long double!
    uint32_t m = 1000;
    uint32_t v1 = (uint32_t) (d1 * m);

    printf("%" PRIu32 "\n", v1);
    return 0;
}

Now even -m64 rounds it to 210009.

% gcc -m64 test.c; ./a.out
210009