2

A coworker and I disagree about what can happen when comparing two floating-point numbers that have not been mathematically operated on, i.e. the numbers may have been moved around memory and/or CPU registers, but no math has been done on them. Maybe they have been put in a list and then removed, or subjected to various other operations.

My experience has led me to believe that non-arithmetic operations on floating-point numbers should never change them or subject them to the rounding errors that arithmetic operations can introduce. My coworker contends that on some modern architectures, the floating-point part of the CPU is allowed to slightly corrupt a number such that equality checks fail, even when the value is only stored, loaded, or moved.

For example, consider this C code:

#include <stdlib.h>

float* a = (float*)malloc(sizeof(float));
float* b = (float*)malloc(sizeof(float));
*a = 1.0;
*b = 1.0;
int equal = *a == *b;

Are there any circumstances where equal would not be 1?

Kai Schmidt
  • 701
  • 8
  • 14
  • 3
    I don't know of any architecture where the bit pattern could change. Still, have you considered NaNs? Generally, when a = NaN, then a != a, because of the nature of floating-point equality. If you compared bit patterns, even NaNs would appear equal. – Erik Eidt May 30 '23 at 20:15
  • @ErikEidt There *are* multiple bit patterns that correspond to NaN, but I'm more concerned with non-NaNs here. – Kai Schmidt May 30 '23 at 20:18
  • 2
    `1.0` specifically is the most round number, all mantissa bits clear, so no rounding could change it. FPUs don't randomly corrupt numbers; they just sometimes round temporaries to lower precision (or really, keep more precision than C rules demand, unless you use `gcc -ffloat-store` or something like `gcc -std=c11` instead of the default `gnu11`). See also https://randomascii.wordpress.com/2012/03/21/intermediate-floating-point-precision/ and https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/ – Peter Cordes May 30 '23 at 20:21
  • A more interesting case might be `*a = 3.141592653589793`, which has to round that `double` literal to a `float`. If you then did `*a == 3.141592653589793`, it could be false, because the left side would promote `float` to `double` to match the right side, but with the rounding already done. Depending on compiler optimizations, though, it might skip that step and propagate the original value without the rounding, so it would still compare equal. But in your case, where both sides have been assigned to `float` objects, `*a == *b` is a `float` comparison with the types already matching and both values rounded. – Peter Cordes May 30 '23 at 20:24
  • @ErikEidt's point is that with `a = NaN`, `a == a` is false in IEEE 754 semantics. The comparison result is "unordered", none of greater, less, or equal. That's one way to implement `isnan`. IEEE FP comparisons aren't based on bitwise equality. Equal bit-pattern NaNs are unequal, but `-0.0 == +0.0`. (In all other cases, it is the same as bitwise equality, unless your FPU is in denormals-are-zero mode, then all subnormals compare equal to zero and each other.) – Peter Cordes May 30 '23 at 20:29
  • 1
    Also, the only mainstream FPU architecture I'm aware of that munges FP data in surprising ways is x87, which is not modern. Modern x86 CPUs all support SSE2, and x86-64 uses that for FP math except for `long double` on ISAs that support it as an 80-bit type. Most modern compilers are configured to use SSE2 for scalar math in 32-bit builds as well, assuming that they don't need to support computers from before the early 2000s. x87 is usually `FLT_EVAL_METHOD == 2` (https://en.cppreference.com/w/cpp/types/climits/FLT_EVAL_METHOD) or a sloppy version of that, while most other ISAs are `0` – Peter Cordes May 30 '23 at 20:32
  • @KaiSchmidt, consider a = NaN; then a != a, so a NaN compares unequal to itself even though it has the same bit pattern, whereas if you used an integer compare on the bit pattern, such a NaN would equal itself. – Erik Eidt May 30 '23 at 20:33
  • I wouldn't dare assume they are sufficiently unchanged in a non-typesafe language which might do implicit promotions to other types. I can't give an example where it can happen, but I'm just saying I wouldn't make that assumption. – Simon Goater May 30 '23 at 21:00
  • 2
    "Are there any circumstances where equal would not be 1?" --> No. yet you are using a simplistic case. – chux - Reinstate Monica May 31 '23 at 04:45
  • 1
    Related (not really a duplicate): https://stackoverflow.com/questions/59710531/if-i-copy-a-float-to-another-variable-will-they-be-equal/59732251#59732251 – chtz May 31 '23 at 07:51
  • 1
    Floating-point operations can be surprising or difficult to understand, but they are never random. No properly-designed floating-point unit will ever "slightly corrupt" a value for no reason. The only time a value can change simply due to having been moved around is if it's moved to an object of lower precision. This *can* be a problem, especially if you didn't realize it — for example, if you didn't realize that the source object was of higher-than-usual precision. – Steve Summit May 31 '23 at 11:14
  • @SteveSummit I agree that storing/loading cannot change a floating-point value (unlike integers on implementations with negative zeros), but here conversions are also involved (possibly what you implied in your last sentence). See [my answer](https://stackoverflow.com/questions/76368223/equality-of-floating-point-numbers-after-storing-loading-moving/76389573#76389573) for details. – vinc17 Jun 02 '23 at 11:24

3 Answers

3

What you wrote is equivalent to

float a = 1.0;
float b = 1.0;
int equal = a == b;

(using pointers does not change anything as far as the C standard is concerned). So, for the variable a, 1.0 is interpreted in some evaluation format F (depending on FLT_EVAL_METHOD, see ISO C17 5.2.4.2.2p9), then converted to float and stored in a. Ditto for b. As a general rule, storing/reading values must not change them, unless explicitly stated (for instance, ISO C17 explicitly says in 6.2.6.2p3 that on implementations that support negative integer zeros, a negative zero may become a normal zero when stored).
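
As a quick check, FLT_EVAL_METHOD from <float.h> tells you which evaluation format F your implementation uses (a minimal sketch):

#include <float.h>
#include <stdio.h>

int main(void) {
    /* 0: evaluate in the nominal type; 1: float math done in double;
       2: everything done in long double (classic x87) */
    printf("FLT_EVAL_METHOD = %d\n", FLT_EVAL_METHOD);
    return 0;
}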

To answer the question, first consider the conversion of the 1.0 constant (as a string) to F on both lines. ISO C17 says in 6.4.4.2p5: "All floating constants of the same source form shall convert to the same internal format with the same value." So you'll get the same value (in the evaluation format F) in both cases. But note that if you had 1.0 and 1.00 respectively, you might get different values (unlikely, in particular because 1.0 is exactly representable and simple enough, but not forbidden by the C standard).

Then consider the conversion of the obtained value to float. ISO C17 6.3.1.5p1 says: "When a value of real floating type is converted to a real floating type, if the value being converted can be represented exactly in the new type, it is unchanged." This is the case here if 1.0 was interpreted as the value 1, which is exactly representable in a float, so equal will be 1. But if 1.0 was interpreted as some other value, not representable in a float, I think that equal could be 0 (the C standard does not require that converting some value to some type always yield the same result; in particular, this is not the case when the rounding mode changes).
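
To make both cases concrete, here is a minimal sketch (the pi literal is mine, echoing a comment above; the second result assumes strict IEEE semantics with FLT_EVAL_METHOD == 0, and a sloppy x87 build might differ):

#include <stdio.h>

int main(void) {
    float a = 1.0;                /* 1 is exactly representable: unchanged */
    float b = 3.141592653589793;  /* this double literal must round to float */

    printf("%d\n", a == 1.0);                /* 1: exact round trip */
    printf("%d\n", b == 3.141592653589793);  /* 0: b promotes back to double,
                                                but keeps the rounded value */
    return 0;
}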

vinc17
  • 2,829
  • 17
  • 23
2

Your coworker’s belief might have origins in C and C++ rules.

C and C++ allow floating-point expressions to be evaluated using more precision than the nominal types of the operands.

For example, given all float variables, d = a + b - c; may be evaluated using double arithmetic. The rules require that, when the result is stored in d, it be converted back to the nominal type float. However, the intermediate calculations can use double. For example, if a is 2^30, b is 1, and c is 2^30, this expression would produce 0 if computed using float arithmetic, because adding 2^30 and 1 would produce 2^30 (2^30+1 is not representable in the format commonly used for float, so it is rounded to a representable value), and then subtracting 2^30 yields 0. However, if double arithmetic is used, then the addition produces 2^30+1, and subtracting 2^30 yields 1.
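
Here is a minimal sketch of that difference (the casts force each intermediate into the stated type; the variable names are illustrative):

#include <stdio.h>

int main(void) {
    float a = 1073741824.0f;  /* 2^30, exactly representable in float */
    float b = 1.0f;
    float c = 1073741824.0f;

    /* Rounding the sum to float first loses the +1: 2^30 + 1 rounds
       back to 2^30, so the subtraction yields 0. */
    float d_float = (float)(a + b) - c;

    /* Carrying the intermediates in double keeps 2^30 + 1 exactly,
       so the subtraction yields 1. */
    float d_double = (float)((double)a + (double)b - (double)c);

    printf("%g %g\n", d_float, d_double);  /* prints: 0 1 */
    return 0;
}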

This is not dissimilar from the fact that, given all unsigned char variables, an expression is evaluated using int arithmetic and may produce results different from those of pure unsigned char arithmetic. For example, with a = 100 and b = 3, d = 3 * a / b; will yield 100, since int arithmetic is used. If unsigned char arithmetic were used, 3 * a would produce 44 (300 − 256), and dividing by 3 would produce 14. So it is not solely a floating-point issue; it is an issue of how the programming language evaluates expressions.
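
The same contrast in code (a sketch; the cast simulates pure unsigned char arithmetic):

#include <stdio.h>

int main(void) {
    unsigned char a = 100, b = 3;

    /* The usual arithmetic conversions promote to int: 300 / 3 == 100. */
    unsigned char d = 3 * a / b;

    /* Forcing the product back into unsigned char wraps 300 to 44,
       and 44 / 3 == 14. */
    unsigned char t = (unsigned char)(3 * a);
    unsigned char d8 = t / b;

    printf("%d %d\n", d, d8);  /* prints: 100 14 */
    return 0;
}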

The C and C++ standards also require that casts and assignments round their operands to the target type.

This license by the standards does not allow implementations to change floating-point values that are merely copied, including by an assignment to the same type that performs no arithmetic operation.
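
A minimal sketch of that guarantee (assuming an ordinary value, not a signaling NaN):

#include <stdio.h>
#include <string.h>

int main(void) {
    float a = 3.14f;
    float b = a;  /* same-type assignment: no arithmetic, no rounding */

    /* The copy is bit-identical, so it also compares equal. */
    printf("%d %d\n", memcmp(&a, &b, sizeof a) == 0, a == b);
    return 0;
}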

Eric Postpischil
  • 195,579
  • 13
  • 168
  • 312
  • It may also be evaluated using `long double` arithmetic; that's `FLT_EVAL_METHOD == 2` (https://en.cppreference.com/w/cpp/types/climits/FLT_EVAL_METHOD), which is what you get on 32-bit x86 if the x87 precision-control bits are left at the default full 64-bit mantissa precision, rather than rounding to 53-bit mantissas like `double`. https://randomascii.wordpress.com/2012/03/21/intermediate-floating-point-precision/ has some info about historical MSVC behaviour for x87. Fortunately most modern ISAs (including x86-64) can efficiently do `FLT_EVAL_METHOD == 0` (eval as the C type). – Peter Cordes Jun 02 '23 at 20:25
1

I'm assuming you're referring to IEEE754 binary floating point numbers, as these would be used by most conventional CPUs these days.

In that case, then yes: just moving these values around will not cause them to change. When doing operations on the values, it sounds like your intuition might need refining. Specifically, the result of any operation should be the same as if the FPU performed the operation at infinite precision and then rounded the result once to fit the destination format (e.g. a 32-bit binary float). Hence, for a CPU that implements IEEE754 floats, any rounding that occurs is deterministic; this is the reason behind the famous 0.1 + 0.2 != 0.3 example.
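
For instance, the following is deterministic on any IEEE754 implementation (a minimal sketch):

#include <stdio.h>

int main(void) {
    /* Each literal rounds once to the nearest double, and the addition
       rounds once more; the result reliably differs from the rounding
       of 0.3 itself. */
    double sum = 0.1 + 0.2;
    printf("%d\n", sum == 0.3);            /* prints: 0 */
    printf("%.17g vs %.17g\n", sum, 0.3);  /* shows the two values */
    return 0;
}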

Note that by using C you've complicated this via two mechanisms: (1) C doesn't require IEEE754 semantics; (2) C will automatically convert between float and double values (i.e. IEEE754 32-bit and 64-bit binary floats on common architectures). These two features can mean that code might not do what a naive reader expects.
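
The second mechanism is the easier one to trip over; in this sketch no value is corrupted, only converted:

#include <stdio.h>

int main(void) {
    float f = 0.1f;  /* 0.1 rounded to a 24-bit significand */
    double d = 0.1;  /* 0.1 rounded to a 53-bit significand */

    /* The comparison promotes f to double, but the two roundings of
       0.1 are different values, so this prints 0. */
    printf("%d\n", f == d);
    return 0;
}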

As a more complete example, I put the following code into Godbolt's compiler explorer:

#include <math.h>

int explicit_nan(float * restrict a, float * restrict b, float val) {
    *a = val;
    *b = val;
    return isnan(val) ? 1 : *a == *b;
}

int implicit_nan(float * restrict a, float * restrict b, float val) {
    *a = val;
    *b = val;
    return *a == *b;
}

The restrict keyword is needed; otherwise, Clang compiles the code defensively, assuming it might be called like:

char data[sizeof(float) + 1];
explicit_nan((float*)(data), (float*)(data+1), 1.f);

which would cause the write to b to change the value of a. Presumably you don't care about this case, but the C compiler can't be so lenient.

The explicit_nan function can be seen to compile to code that always returns 1 (by setting EAX), while implicit_nan just compares the passed parameter against itself, so NaNs are handled correctly in a small amount of code.

Sam Mason
  • 15,216
  • 1
  • 41
  • 60
  • C specifies that rounding (of over-precise temporaries if any https://randomascii.wordpress.com/2012/03/21/intermediate-floating-point-precision/) to the actual C type (`float` or `double`) happens at expression boundaries, but in practice compilers are sloppy about that. e.g. GCC targeting 32-bit x86 with x87 FP math keeps extra precision across statements when optimizing, unless you use `gcc -ffloat-store` or `gcc -std=c99` or `c11` or whatever (instead of the default `gnu99`). So this answer is a bit optimistic in its assumptions. – Peter Cordes Jun 01 '23 at 21:03
  • But yes, just copying data around doesn't change it if it can already be represented as a `float`. Round-tripping a `double` or `long double` through `float` might change the value if it wasn't already a round number. – Peter Cordes Jun 01 '23 at 21:05
  • @PeterCordes sorry, I've not had to think about an x86 FPU's 80bit floats for a while! I'm also interpreting OP's "modern architectures" comment to mean x86-64/ARM64 (and maybe RISC-V/Power) rather than non-754 DSPs but the question is somewhat vague – Sam Mason Jun 01 '23 at 22:11
  • Yeah, as discussed in comments under the question, all modern architectures have FLT_EVAL_METHOD == 0, using stuff like SSE2 for scalar math, not x87, making it a near-total non-issue except when mixing different types in the C source. But I think the kind of effects the OP and their colleague have heard about are the ones caused by x87's FLT_EVAL_METHOD == 2. – Peter Cordes Jun 01 '23 at 23:07