acos(double) gives different result on x64 and x32 Visual Studio

Question

acos(double) gives different result on x64 and x32 Visual Studio.

printf("%.30g\n", double(acosl(0.49990774364240564)));
printf("%.30g\n", acos(0.49990774364240564));

on x64: 1.0473040763868076
on x32: 1.0473040763868078

on linux4.4 x32 and x64 with sse enabled: 1.0473040763868078

is there a way to make VSx64 acos() give me 1.0473040763868078 as result?

Reproduced for me on VS2015: C:\project\ConsoleApplication1\Release>ConsoleApplication1.exe 1.04730407638680778070749965991 1.04730407638680778070749965991 C:\project\ConsoleApplication1\x64\Release>ConsoleApplication1.exe 1.04730407638680755866289473488 1.04730407638680755866289473488 — Artimosha, Aug 17 '17 at 12:20
What kind of answer are you looking for? The C++ standard allows this (only + - * / and sqrt are required to produce correctly-rounded results out to the last bit of the mantissa). So are you asking for the asm-level details of the different library implementations? Deterministic FP computation is a very hard problem, and AFAIK you generally have to avoid more complex library functions. Even then, different compilers and different architectures make it very very hard in C++. — Peter Cordes, Aug 17 '17 at 12:36
@Artimosha: why is it a problem for you? IEEE754 doesn't guarantee the most accurate result for transcendental functions. It is only guaranteed for the four basic operations and sqrt. — geza, Aug 17 '17 at 12:37
@PeterCordes: does C++ standard specify that "only + - * / and sqrt are required to produce correctly-rounded results out to the last bit of the mantissa"? — geza, Aug 17 '17 at 12:46
@PeterCordes I was guessing that acos() implementation same for x32 and x64 and was wondering if there is some tricky compiler options that can solve my problem. But if there is no garanty that they should work same, then ok... — Artimosha, Aug 17 '17 at 12:50
@geza: IEEE-754 does, so C++ implementations that use IEEE floating point do. See https://en.wikipedia.org/wiki/Rounding#Table-maker.27s_dilemma for some links and a mention of the IEEE "basic" operations. — Peter Cordes, Aug 17 '17 at 12:51
@PeterCordes: okay, it can be misunderstood from your previous comment :) — geza, Aug 17 '17 at 12:53
@Artimosha: if you use sse for x32, it is strange that they differ, though. If you want reproducible transcendentals (considering IEEE754 only), you have to write/use a library for this. This could be a good solution, because it shouldn't be much slower than the SSE implementation that the compiler gives you (if the library is good). — geza, Aug 17 '17 at 12:56
@geza: Actually I was being sloppy, so good point that C++ doesn't even require IEEE FP at all (if that's still true). IIRC, it does have some optional sections that apply to implementations that do implement IEEE FP, though. — Peter Cordes, Aug 17 '17 at 12:56
It's possibly you could find an option for 32-bit mode to make it match 64-bit mode (or [set the x87 precision mode to 64-bit instead of the default 80](https://randomascii.wordpress.com/2012/03/21/intermediate-floating-point-precision/)), if 32-bit still uses an x87 library function. (Unless there's a separate SSE library for 32-bit, then enabling 32-bit SSE2 code-gen won't change the code inside the library). Getting 64-bit to be different is unlikely, though. See that link for Bruce Dawson's *excellent* series of articles about floating point. — Peter Cordes, Aug 17 '17 at 13:00
Related: https://stackoverflow.com/questions/33887108/stdpow-produce-different-result-in-32-bit-and-64-bit-application — Gabriel Devillers, Aug 25 '22 at 13:36
Similar behavior with std::sin : `std::sin(-2.3561944901923457)` gives `-0.7071067811865469` in 32 bits and `-0.707106781186547` in 64 bits, with VS2019. — Gabriel Devillers, Aug 25 '22 at 14:25

score 3 · Accepted Answer · edited Jun 20 '20 at 09:12

TL:DR: this is normal and you can't reasonably change it.

The 32-bit library may be using 80-bit FP values in x87 registers for its temporaries, avoiding rounding off to 64-bit double after every operation. (Unless there's a whole separate library, compiling your own code to use SSE doesn't change what's inside the library, or even the calling convention for passing data to the library. But since 32-bit passes double and float in memory on the stack, a library is free to load it with SSE2 or with x87. Still, you don't get the performance advantage of passing FP values in xmm registers unless it's impossible for non-SSE code to use the library.)

It's also possible that they're different simply because they use a different order of operations, producing different temporaries along the way. That's less plausible, unless they're separately hand-written in asm. If they're built from the same C source (without "unsafe" FP optimizations), then the compiler isn't allowed to reorder things, because of this non-associative behaviour of FP math.

glibc's libm (used on Linux) typically favours precision over speed, so its giving you the correctly-rounded result out to the last bit of the mantissa for both 32 and 64-bit. The IEEE FP standard only requires the basic operations (+ - * / FMA and FP remainder) to be "correctly rounded" out to the last bit of the mantissa. (i.e. rounding error of at most 0.5 ulp). (The exact result, according to calc, is 1.047304076386807714.... Keep in mind that double (on x86 with normal compilers) is IEEE754 binary64, so internally the mantissa and exponent are in base2. If you print enough extra decimal digits, though, you can tell that ...7714 should round up to ...78, although really you should print more digits in case they're not zero beyond that. I'm just assuming it's ...78000.)

So Microsoft's 64-bit library implementation produces 1.0473040763868076 and there's pretty much nothing you can do about it, other than not use it. (e.g. find your own acos() implementation and use it.) But FP determinism is hard, even if you limit yourself to just x86 with SSE. See Does any floating point-intensive code produce bit-exact results in any x86-based architecture?. If you limit yourself to a single compiler, it can be possible if you avoid complicated library functions like acos().

You might be able to get the 32-bit library version to produce the same value as the 64-bit version, if it uses x87 and changing the x87 precision setting affects it. But the other way around is not possible: SSE2 has separate instructions for 64-bit double and 32-bit float, and always rounds after every instruction, so you can't change any setting that will increase precision result. (You could change the SSE rounding mode, and that will change the result, but not in a good way!)

See also:

Intermediate Floating-Point Precision and the rest of Bruce Dawson's excellent series of articles about floating point. (table of contents.

The linked article describes how some versions of VC++'s CRT runtime startup set the x87 FP register precision to 53-bit mantissa instead of 80-bit full precision. Also that D3D9 will set it to 24, so even double only has the precision of float if done with x87.
https://en.wikipedia.org/wiki/Rounding#Table-maker.27s_dilemma
What Every Computer Scientist Should Know About Floating-Point Arithmetic

score 2 · Answer 2 · answered Aug 17 '17 at 12:27

2

You may have reached the precision limit. Double precision is approximately 16 digits. After, there is no guarantee the digits are valid. If so, you cannot change this behavior, except changing the type double to something else, supporting higher precision.

E.g. long double if your machine and compiler supports the extended 80 bit double or 128 bit Quadruple (also machine depended see here for example).

answered Aug 17 '17 at 12:27

user1810087

5,146
1
41
76

2

The real issue is precision of the temporaries during the computation, which could be 80-bit for the 32-bit library function if it uses x87 instructions. The 64-bit version will only use 64-bit `double` at every step, because AMD64 includes SSE2 and it's used by default. Or simply doing the same operations in a different order could produce different results (by rounding intermediates differently), even if both versions are using SSE2. But that would only happen if the library was built with unsafe FP optimizations enabled. (FP math is not associative, and compilers / standards know this). – Peter Cordes Aug 17 '17 at 12:31
@PeterCordes I didn't know that 64-bit machines would only use 64-bit double. Good point, thank you. – user1810087 Aug 17 '17 at 12:41

score 1 · Answer 3 · answered Aug 17 '17 at 15:11

1

I disagree that there isn't much you can do about it.

For example, you could try changing the floating point model compiler options.

Here are my results with different floating point models (note /fp:precise is the default):

/fp:precise   1.04730407638680755866289473488
/fp:strict    1.04730407638680755866289473488
/fp:fast      1.04730407638680778070749965991

So it seems you are looking for /fp:fast. Whether that gives the most accurate result remains to be seen though.

answered Aug 17 '17 at 15:11

rustyx

80,671
25
200
267

My mistake; I didn't know that VC++ had different math library implementations for different FP settings. If my very brief testing with `calc` is correct, then ironically the `/fp:fast` result is correctly rounded and the others aren't, for *this* specific input value. – Peter Cordes Aug 17 '17 at 15:27

acos(double) gives different result on x64 and x32 Visual Studio

3 Answers3

Linked