7

We are working on a library of numeric routines in C. We are not sure yet whether we will work with single precision (float) or double precision (double), so we've defined a type SP as an alias until we decide:

typedef float SP;
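
To give an idea of the style, here is a minimal sketch (not our actual code; the name is made up) of how the routines are written in terms of SP:

SP sp_dot(const SP *a, const SP *b, int n)
{
    SP acc = 0;                   /* the accumulator uses the alias too, so its width follows the typedef */
    for (int i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}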

When we run our unit tests, they all pass on my machine (a 64-bit Ubuntu) but they fail on my colleague's (a 32-bit Ubuntu that was mistakenly installed on a 64-bit machine).

Using Git's bisect command, we found the exact diff that began yielding different results between his machine and mine:

-typedef double SP;
+typedef float SP;

In other words, going from double precision to single precision yields numerically different results on our machines (about 1e-3 relative difference in the worst cases).

We are fairly certain that we are never comparing unsigned ints to negative signed ints anywhere.

Why would a library of numeric routines yield different results on a 32-bit operating system and on a 64-bit system?

CLARIFICATION

I'm afraid I might not have been clear enough: we have Git commit 2f3f671 that uses double precision, and where the unit tests pass equally well on both machines. Then we have Git commit 46f2ba, where we changed to single precision, and here the tests still pass on the 64-bit machine but not on the 32-bit machine.

lindelof
  • 34,556
  • 31
  • 99
  • 140
  • Are you sure it's the bittage of the OS and not hardware differences? – flipchart Oct 21 '11 at 09:13
  • 6
    Likely because of this: http://stackoverflow.com/questions/1076190/64-bit-floating-point-porting-issues , (the 32 bit code is using the 387 coprocessor, the 64 bit code might be using sse) – nos Oct 21 '11 at 09:29
  • nos is the only one who got this one right. The given answers are wrong. – R.. GitHub STOP HELPING ICE Oct 21 '11 at 09:52
  • In view of nos' comment you should probably look into the effect that optimization options have for the correctness of your code. Gcc has some that handle the behavior of floating point operations. Try to read the assembler that is produced and see if it differs significantly between the two architectures. – Jens Gustedt Oct 21 '11 at 10:05

2 Answers

9

You are encountering what is often called the 'x87 excess-precision "bug"'.

In short: historically, (nearly) all floating-point computation on x86 processors was done using the x87 instruction set, which by default operates on an 80-bit floating-point type, but can be set to operate (almost) in either single- or double-precision by setting some bits in a control register.
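
For concreteness, on Linux those control-register bits can be inspected and changed through the glibc-specific <fpu_control.h> interface; a minimal sketch, assuming x86 and glibc, just to illustrate what "some bits in a control register" means:

#include <fpu_control.h>   /* glibc-specific: _FPU_GETCW/_FPU_SETCW wrap the fnstcw/fldcw instructions */

static void set_x87_precision_to_double(void)
{
    fpu_control_t cw;
    _FPU_GETCW(cw);                            /* read the x87 control word */
    cw = (cw & ~_FPU_EXTENDED) | _FPU_DOUBLE;  /* clear the precision-control bits, request a 53-bit significand */
    _FPU_SETCW(cw);                            /* write it back */
    /* Note: this changes only the significand width, not the exponent range,
       which is why the result is only "almost" double precision. */
}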

If single-precision operations are performed while the precision of the x87 control register is set to double- or extended-precision, then the results will differ from what would be produced if the same operations were performed in single-precision (unless the compiler is extraordinarily careful and stores the result of every computation and reloads it to force rounding to occur in the correct place.)
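
As a rough illustration (exact behaviour depends on the compiler, optimization level, and FLT_EVAL_METHOD, so treat this as a sketch rather than a guaranteed outcome), consider an intermediate product that overflows single precision but not the wider x87 register format:

#include <stdio.h>

int main(void)
{
    volatile float a = 1e30f, b = 1e30f;   /* volatile discourages constant folding */
    float r = a * b / 1e30f;               /* a*b overflows float, but fits in the wider x87 register format */
    printf("r = %g\n", r);                 /* typically "inf" with SSE, "1e+30" when x87 keeps the intermediate in a register */
    return 0;
}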

Your code running on 32-bit is using the x87 unit for floating-point computation (apparently with the control register set for double-precision), and thus encountering the issue described above. Your code running on 64-bit is using the SSE[2,3,...] instructions for floating-point computation, which provide native single- and double-precision operations, and therefore does not carry excess-precision. This is why your results differ.

You can work around this (to a point) by telling your compiler to use SSE for floating-point computation even on 32-bit (-mfpmath=sse with GCC). Even then, bit-exact results are not guaranteed because the various libraries that you link against may use x87, or simply use different algorithms depending on the architecture.
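
One way to check which evaluation strategy a given set of flags actually buys you is FLT_EVAL_METHOD from <float.h> (C99); a small sketch, assuming GCC-style options:

/* Compile, for example, as:
 *   gcc -m32 check.c                      -> usually prints 2 (x87, excess precision)
 *   gcc -m32 -msse2 -mfpmath=sse check.c  -> usually prints 0 (SSE, no excess precision)
 *   gcc -m64 check.c                      -> usually prints 0 (SSE is the default on x86-64)
 */
#include <float.h>
#include <stdio.h>

int main(void)
{
    /* 0: float and double are evaluated at their own precision (typical of SSE)
       2: all operations are evaluated as long double (typical of x87)          */
    printf("FLT_EVAL_METHOD = %d\n", FLT_EVAL_METHOD);
    return 0;
}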

Stephen Canon
  • 103,815
  • 19
  • 183
  • 269
  • My gosh, you're right. Adding `-mfpmath=sse` (but together with `-msse`, as per the gcc manpage) makes it pass on the 32-bit machine. Thankyouthankyouthankyou – lindelof Oct 21 '11 at 10:26
  • Do you have any links to articles about why they moved from FPU to SSE? – Skizz Oct 21 '11 at 12:08
  • @Skizz: No links, but "because it's better in nearly every way." SSE does not suffer from excess-precision, can round to smaller types without requiring stores, has better support for handling NaNs, is faster, does not have as many hidden stalls, supports conversions between floating-point and integer, requires less memory traffic, ... – Stephen Canon Oct 21 '11 at 12:17
-1

IIRC, the precision of 'double' is only required to be >= the precision of 'float'. So on one implementation the actual number of bits in 'float' and 'double' may be the same, while on the other they differ. This is probably related to the 32-bit/64-bit difference in the platforms, but may not be.

Max
  • 2,121
  • 3
  • 16
  • 20
  • What's with the downvoting?! I am referring to the C LANGUAGE definition of float & double types. These do not HAVE to be mapped to the IEEE 754 single precision & double precision types. So long as the precision of double is better than or EQUAL TO the precision of float the implementation conforms to the C specification – Max Oct 21 '11 at 10:11