8

I'm trying to solve a cross-platform issue that's cropping up and I'm not sure quite how to go about it. Here's a demonstration program:

#include <cmath>
#include <cstdio>

int main()
{
    int xm = 0x3f18492a;
    float x = *(float*)&xm;
    x = (sqrt(x) + 1) / 2.0f;
    printf("%f %x\n", x, *(int*)&x);
}

The output on Windows when compiled in VS2010 is:

0.885638 3f62b92a

The output when compiled with GCC 4.8.1 (ideone.com sample) is:

0.885638 3f62b92b

These small mismatches end up ballooning into a serious problem over the course of a program that needs to run identically on multiple platforms. I'm not concerned so much about "accuracy" as that the results match each other. I tried switching the /fp mode in VS to strict from precise, but that doesn't seem to fix it.

What other avenues should I look at to make this calculation have the same result on both platforms?

UPDATE: Interestingly, if I change the code like this, it matches across the platforms:

#include <cmath>
#include <cstdio>

int main()
{
    int xm = 0x3f18492a;
    float x = *(float*)&xm;
    //x = (sqrt(x) + 1) / 2.0f;
    float y = sqrt(x);
    float z = y + 1;
    float w = z / 2.0f;
    printf("%f %x %f %x %f %x %f %x\n", x, *(int*)&x, y, *(int*)&y, z, *(int*)&z, w, *(int*)&w);
}

I'm not sure it's realistic, however, to be walking through the code and changing all floating point operations like this!

aardvarkk
  • 14,955
  • 7
  • 67
  • 96
  • 2
    I'm not sure what you actually expect. Whilst if you run exactly the same instructions, you probably do get the exact same result, in this case, I expect at least `sqrt` to be implemented by the C runtime library [or get calculated at compile-time, by a function resembling that in the C runtime library], and thus not be identical between the compilers. – Mats Petersson Aug 28 '13 at 17:26
  • possible duplicate of [Is specifying floating-point type sufficient to guarantee same results?](http://stackoverflow.com/questions/17030230/is-specifying-floating-point-type-sufficient-to-guarantee-same-results) – Eric Postpischil Aug 28 '13 at 17:28
  • So are you saying that even expecting a match to be possible is a fool's errand? For whatever reason, I thought it would be possible to get identical behavior between the two. Also, I believe calculating only a `sqrt` *does* match between the two. – aardvarkk Aug 28 '13 at 17:29
  • `sqrt` is probably implemented in the C runtime library as a sequence of floating point calculations (particularly in case of using a SSE processor, as it's not good to write code that uses both FPU and SSE instructions intermingled, and modern compilers tend to prefer SSE to FPU for general speed reasons). As I'm posting below, GCC pre-calculates a constant value as the result. – Mats Petersson Aug 28 '13 at 17:35
  • 1
    I think your goal is unrealistic. Compilers have a lot of discretion for generating code, and floating point is susceptible to the most subtle of differences. – Mark Ransom Aug 28 '13 at 17:47
  • 1
    Regarding your update: Whenever you perform an assignment or explicit conversion, the language standard requires the implementation to convert the value to the nominal type instead of retaining any extra precision it had. So this sort of step-by-step work can reduce some of the precision issues. Note, however, that it may merely reduce them rather than eliminating them. One step might still be performed by using extra precision and then rounding to the nominal precision. The extra rounding step may have a different result than doing the operation originally in the nominal precision. – Eric Postpischil Aug 28 '13 at 17:51
  • @EricPostpischil Right -- so it looks like my only choice is to remove floating point calculations from the source if I expect an exact match? – aardvarkk Aug 28 '13 at 17:55
  • 2
    @aardvarkk If you keep using `/fp:strict` and force yourself to use only `double` computations you have a chance to match the results of GCC and Visual Studio (as long as your computations do not involve inf or denormals at any step, and as long as you keep using only +, -, *, /, sqrt). Alternately, if you find the switch that makes Visual Studio emit SSE2 instructions instead of 387 ones, they probably would match then. – Pascal Cuoq Aug 28 '13 at 18:09
  • @PascalCuoq Good tip. Maybe converting to `double` types is an option to try. I'll definitely switch to `/fp:strict` also and see what kind of results I get. – aardvarkk Aug 28 '13 at 18:13
  • Your edit did not say what was the matching value. I suspect the Windows value. I think the problem is that on windows, you are compiling in C++ mode and in gcc, you are compiling in C mode. Ref my post for details. – chux - Reinstate Monica Aug 28 '13 at 18:39
  • @aardvarkk How was you code compiled in Windows, as C or C++? How about with gcc? – chux - Reinstate Monica Aug 28 '13 at 18:55
  • @chux It was a .cpp file, so it would have been compiled as C++. Same goes for GCC I assume. It seems to me in this case a lot of the problem has to do with type conversion. I think GCC was probably converting to a `double` before `sqrt`, whereas Windows was calling its own special `sqrt` function for `float` types. – aardvarkk Aug 28 '13 at 19:28
  • The file name does not drive the compilation but the compiler and its options used. Windows does has a C option buried "chest deep in the maze". I concur that gcc called a `double sqrt(double)`. Windows was certainly calling `float sqrt(float)`. That is why you have 2 different outputs. Solution: step 1: compile both as C or both as C++. There are more steps too as suggested by the many posts. – chux - Reinstate Monica Aug 28 '13 at 19:39
  • @aardvarkk: Are you using "no optimisation" - as you can see in my answer, with -O2, it optimises away the entire calculation, so no computation of `sqrt` at all. – Mats Petersson Aug 28 '13 at 20:50
  • @MatsPetersson I was checking the values on a Release build, which does use `/O2`. I think `chux` is likely correct in that a part of it is probably that GCC is performing the calculation with `double` types and VS is performing it with `float` types. – aardvarkk Aug 28 '13 at 21:23
  • @aardvarkk: Do you think that GCC or MS is more "right"? [I haven't even tried to follow the calculation]. In my code, all the calculation is done by the compiler during optimisation. Whether that is done at `float` or `double` size, is pretty hard to tell unless you happen to know how to read GCC code (not the easiest!). Does MSVC actually call a `sqrt` function? – Mats Petersson Aug 28 '13 at 21:27
  • @Mats Petersson The calculation, done in `double`, and then converted to `float` gives the same answer as in your post - same as OP's "GCC" answer. The calculation, done in entirely in `float` gives the OP's "windows" answer. – chux - Reinstate Monica Aug 29 '13 at 11:41

8 Answers8

7

Summary: This is generally not supported by compilers, you will have a tough time doing it in a higher-level language, and you will need to use one math library common to all your target platforms.

The C and C++ language standards allow implementations a considerable amount (too much) of flexibility in floating-point operations. Many C and C++ floating-operations are not required to adhere to the IEEE 754-2008 standard in the way that might be intuitive to many programmers.

Even many C and C++ implementations do not provide good support for adhering to the IEEE 754-2008 standard.

Math library implementations are a particular problem. There does not exist any normal library (commercially available or widely-used open source with a known-bounded run-time) that provides correctly rounded results for all standard math functions. (Getting the mathematics right on some of the functions is a very difficult problem.)

sqrt, however, is relatively simple and should return correctly rounded results in an library of reasonable quality. (I am unable to vouch for the Microsoft implementation.) It is more likely the particular problem in the code you show is the compiler’s choice to use varying precisions of floating-point while evaluating expressions.

There may be various switches you can use with various compilers to ask them to conform to certain rules about floating-point behavior. Those may be sufficient for getting elementary operations to perform as expected. If not, assembly language is a way to access well-defined floating-point operations. However, the behavior of library routines will be different between platforms unless you supply a common library. This includes both math library routines (such as pow) and conversions found in routines such as fprintf, fscanf, strtof. You must therefore find one well-designed implementation of each routine you rely on that is supported on all of the platforms you target. (It must be well-designed in the sense that it provides identical behavior on all platforms. Mathematically, it could be somewhat inaccurate, as long as it is within bounds tolerable for your application.)

Eric Postpischil
  • 195,579
  • 13
  • 168
  • 312
  • This is very helpful, thanks. One interesting thing to note is that if I split up the operations, the results *do* seem to match. I'll post that as an update. – aardvarkk Aug 28 '13 at 17:49
4

C allows intermediate calculation to occurs at the floating point's precision or at a higher one.

The windows result matches the GCC if all calculations occurs using only using float.

The GCC calculation obtains the different (and more accurate) result when all calculations are coded as float, but are allowed to go to double or long double for intermediate results.

So even if everything is IEEE 754 compliant, controlling the allowed intermediate calculations has an effect.

[Edit] I do not think the above really answers the OP stated issue, but is a concern for general FP issues. It is the below, that I think best explains the difference.

MS dev network sqrt

I suspect the difference was that in the windows compilation, it was in C++ mode, thus the sqrt(x) called float sqrt(float). In gcc, it was in C mode and sqrt(x1) called double sqrt(double).
If this is the case, insure C code in windows is compiled in C mode and not C++.

int main() {
  {
    volatile float f1;
    float f2;
    double d1;
    int xm = 0x3f18492a;
    f1 = *(float*) &xm;
    f2 = *(float*) &xm;
    d1 = *(float*) &xm;
    f1 = sqrtf(f1);
    f1 = f1 + 1.0f;
    f1 = f1 / 2.0f;
    printf("f1  %0.17e %a %08X\n", f1, f1, *(int*)&f1);
    f2 = (sqrt(f2) + 1) / 2.0;
    printf("f2  %0.17e %a %08X\n", f2, f2, *(int*)&f2);
    d1 = (sqrt(d1) + 1) / 2.0;
    printf("d1  %0.17e %a\n", d1, d1);
    return 0;
  }

f1  8.85637879371643066e-01 0x1.c57254p-1 3F62B92A
f2  8.85637938976287842e-01 0x1.c57256p-1 3F62B92B
d1  8.85637911452129889e-01 0x1.c572551391bc9p-1
chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
4

The Visual Studio compiler tends to generate instructions that use the old x87 FPU(*), but it generates code at the beginning of the executable to set the FPU to the precision of the double format.

GCC can also generate instructions that use the old x87 FPU, but when generating x86-64 code, the default is to use SSE2. On Mac OS X, the default is to use SSE2 even in 32-bit since all Intel Macs have SSE2. When it generates instruction for the 387, GCC does not set the precision of the FPU to the double format, so that computations are made in the 80-bit double-extended format, and then rounded to double when assigned.

As a consequence:

  1. If you use only double computations, Visual Studio should generate a program that computes exactly at the precision of the type, because it is always double(**). And if on the GCC side you use -msse2 -mfpmath=sse, you can expect GCC to also generate code that computes at the precision of doubles, this time by using SSE2 instructions. The computations should match.

  2. Or if you make both GCC and Visual Studio emit SSE2 instructions, again, the computations should match. I am not familiar with Visual Studio but the switch may be /arch:SSE2.

This does not solve the problem with math libraries, which is indeed an unsolved problem. If your computations involve trigonometric or other functions, you must use the same library as part of your project on both sides. I would recommend CRlibm. Less accurate libraries are fine too as long as it's the same library, and it respects the above constraints (using only double or compiled with SSE2 on both sides).

(*) There may be a way to instruct it to generate SSE2 instructions. If you find it, use it: it will solve your particular problem.

(**) modulo exceptions for infinities and subnormals.

Pascal Cuoq
  • 79,187
  • 7
  • 161
  • 281
3

The IEEE 754 specifies that computations can be processed at a higher precision than what is stored in memory then rounded when written back to memory. This causes many issues such as the one you see. In short, the standard does not promise that the same computation carried out on all hardware will return the same answer.

If the value to be computed on is placed on a larger register then one computation is done and then the value is moved off of the register back to memory the result is truncated there. It could then be moved back on to the larger register for another computation.

On the other hand if all the computations are done on the larger register before the value is moved back to memory you will get a different result. You may have to disassemble the code to see what's happening in your case.

With float point work it's important to understand how much precision you need in the final answer and how much precision you are guaranteed to have by the precision (note the two uses of the word) of the variables you choose and never expect anymore precision than you are guaranteed.

In the end when you are comparing results (this is true of any floating point work) you can not look for exact matches, you determine the precision that you require and you check that the difference between two values is less than the precision you require.

Getting back to practicalities, the Intel processors have 80 bit registers for floating point computations that may be used even though you've specified float which is typically 32-bit (but not always).

If you want to have fun try turning on various optimizations and processor options like SSE in your compiler and see what results you get (as well as what comes out of the disassembler).

Michael Conlen
  • 1,959
  • 2
  • 17
  • 19
2

With my 4.6.3 compiler it generates this code:

main:
.LFB104:
    .cfi_startproc
    subq    $8, %rsp
    .cfi_def_cfa_offset 16
    movl    $1063434539, %esi
    movl    $.LC1, %edi
    movsd   .LC0(%rip), %xmm0
    movl    $1, %eax
    call    printf
    xorl    %eax, %eax
    addq    $8, %rsp
    .cfi_def_cfa_offset 8
    ret
    .cfi_endproc

.LC0:
    .long   1610612736
    .long   1072453413

Note that there is ZERO calculations performed in this code, just storing various constants in registers.

I haven't got a Visual stdudio compiler, so I don't know what that produces.

Mats Petersson
  • 126,704
  • 14
  • 140
  • 227
  • The code you've provided _does_ do some calculations aside from storing various constants in registers, though. I.e., `subq`, `xorl`, `addq` are calculations on data and not just bindings. Also, I believe `movsd` is using **double precision**, which may be of pertinence since that is more precision than a float allows. A float or single precision mov would be `movss` – Hawkins May 30 '18 at 17:15
1

GCC compiler implements so called strict-aliasing semantics, which relies on the fact that in both C and C++ it is generally illegal to perform type-punning through pointer conversions (with few exceptions). Your code contains multiple violations of the requirements of strict-aliasing semantics. So, it is perfectly logical to expect that combination of strict-aliasing semantics and optimizations might produce completely unexpected and seemingly illogical results in GCC (or any other compiler, for that matter).

On top of that, there would be nothing unusual in sqrt producing slightly different results in different implementations.

AnT stands with Russia
  • 312,472
  • 42
  • 525
  • 765
  • 1
    But the results are not illogical - they're within 1 ULP of each other. While aliasing might be a problem in some contexts, I don't think that's the case here. – Mark Ransom Aug 28 '13 at 18:14
1

If you have freedom to change languages, consider using Java with "strictfp". The Java language specification gives very precise rules for order of operations, rounding etc. in strictfp mode.

Exactly matching results across implementations is an objective of the Java standard for strictfp mode. It is not an objective for the C++ standards.

Patricia Shanahan
  • 25,849
  • 4
  • 38
  • 75
  • I have a question about Java with strictfp, but I should perhaps ask it as an official question: how does(did) a Java runtime targeting an Intel processor without SSE2 deal with denormals? Even when the 387 FPU is set for 53-bit precision, it keeps an oversized exponent range that (1) forces to detect underflow at each intermediate result and (2) makes it difficult to avoid double-rounding of denormals. Strategies are re-computing with emulated FP, or a permanent exponent offset along the lines of http://blog.frama-c.com/index.php?post/2013/05/09/A-63-bit-floating-point-type-for-64-bit-OCaml – Pascal Cuoq Aug 28 '13 at 19:11
  • I really should ask the above as a question. – Pascal Cuoq Aug 28 '13 at 19:13
  • @PascalCuoq In strictfp mode, and in the original definition, it did extra work to deal with the oversized exponent to get exact 64-bit IEEE 754 behavior. It had significant performance cost, so strictfp was added for programs that really need it, and the default permits the oversized exponent behavior. – Patricia Shanahan Aug 28 '13 at 19:18
  • Oh, the essay “How Java's Floating-Point Hurts Everyone Everywhere” by Kahan suddenly makes sense. I had read it but I had not realized the cost of emulating double-precision with a 387 until I read your answer. The title was still an exaggeration: not everyone was hurt, but owners of 387s were hurt doubly. – Pascal Cuoq Aug 28 '13 at 19:27
0

You want them both to be using the IEEE 754 standard.

Paul Evans
  • 27,315
  • 3
  • 37
  • 54
  • But... aren't they already using that standard? Isn't that what a `float` type is? – aardvarkk Aug 28 '13 at 17:23
  • By the looks of things, they are. Just that the last bit gets clobbered in one implementation, and not in the other. The least significant bit can be affected by for example intermediate step operations performed at higher precision and then rounded (`sqrt` is probably taking a `double` value), so the result can change there. – Mats Petersson Aug 28 '13 at 17:28