Extra Precision
This is false. The answer you link to states that “floating-point” is deterministic and clarifies that “The same floating-point operations, run on the same hardware, always produces the same result.” However, the C and C++ standards do not require C and C++ implementations to always use “The same floating-point operations.” So float
arithmetic is not deterministic in this sense.
A += B
should behave like A = A + B
when A has no side-effects.
Yes, A += B
behaves like A = A + B
except that the lvalue A
is evaluated only once. However, the C standard does not require that A = A + B
have a deterministic behavior. In particular, A = A + B*C;
is allowed to give two different results in two different instances.
Specifically, C 2018 5.2.4.2.2 10 says:
… the values yielded by operators with floating operands … and of floating constants are evaluated to a format whose range and precision may be greater than required by the type.
Consider target[i] += g * src[i];
. The C implementation may implement this by performing a float
multiplication of g
by src[i]
and a float
addition of that product to target[i]
. However, due to the permission above, it is also allowed to implement this by performing a double
multiplication of g
by src[i]
and a double
addition of that product to target[i]
. Or it could implement it with an effectively infinitely precise multiplication and an infinitely precise addition.
The standard leaves it up to the implementation to make this choice. It provides a method for the implementation to report some information about the choice it makes; the value of FLT_EVAL_METHOD
defined in <float.h>
is one of:
- 0, meaning all floating-point operations are performed in their nominal types,
- 1, meaning all
float
and double
operations are performed in double
, and long double
is performed in long double
,
- 2, meaning all floating-point operations are performed in
long double
, or
- −1, meaning the implementation does not assert how floating-point operations are performed.
In your case, it appears the compiler used a fused multiply-add instruction for target[i] += g * src[i];
, which is equivalent to performing the operations with infinitely precise arithmetic and rounding the result to float
.
Note that the extended precision cannot be carried indefinitely. A fuller quote from 5.2.4.2.2 10 is:
Except for assignment and cast (which remove all extra range and precision), the values yielded by operators with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type.
Thus, whenever an assignment or cast is performed, the value must be converted to its nominal type.
Thus, for your float x = g * src[i];
followed by a + x
, the compiler cannot use a fused multiply-add, because the g * src[i]
product must be converted to the float
precision when assigned to x
; the extra precision cannot be carried through to a + x
.
The C++ standard has substantially equivalent text.
Contraction
C 2018 6.5 8 says:
A floating expression may be contracted, that is, evaluated as though it were a single operation, thereby omitting rounding errors implied by the source code and the expression evaluation method…
This means that even if FLT_EVAL_METHOD
, discussed above, is 0 and float
arithmetic is performed only with float
precision, the processor may perform a = b + c*d
using a fused multiply-add instruction, which calculates as if there were no rounding error in c*d
when it is added to b
. 6.5 8 allows other forms of contractions, but fused multiply-add and fused multiply-subtract are the most common.
C 2018 7.12.2 specifies an FP_CONTRACT
pragma, so that, if a translation unit has #include <math.h>
and #pragma STDC FP_CONTRACT OFF
, the compiler should not use contractions. (Note that getting a compiler to obey this and other rules of the C standard may require using certain switches, such as using -std=c18
with GCC or Clang to request better conformance to the standard than their default behavior.)
An alternate way to avoid contractions and extra precision is to use assignments or casts with single floating-point operations, such as:
float t0 = c*d;
a = b + t0;
or:
a = b + (float) (c*d);
Technically, the wording in the standard still permits the above to be performed with extra precision and then rounded to float
. This can cause double-rounding errors. However, it would be inefficient for a compiler to calculate a single operation using double
arithmetic and then use another instruction to round it to float
, so a compiler with optimization enabled should use only float
arithmetic to evaluate such expressions.
Also note the use of contractions forces a sort of non-determinism on the evaluation of floating-point arithmetic. In y = a*b + c*d
, the compiler can use a fused multiply-add to add one of the products without rounding, but it cannot do that for both. Which one will it choose? The answer is likely deterministic for any particular expression, but when a*b + c*d
appears as a subexpression in a larger expression, we generally cannot be sure which one the compiler will choose to incorporate with a fused multiply-add.
Lack of Accuracy
C 2018 5.2.4.2.2 7 says:
The accuracy of the floating-point operations (+
, -
, *
, /
) and of the library functions in <math.h>
and <complex.h>
that return floating-point results is implementation-defined,…
It is not clear this passage allows any form of non-determinism. Maybe it means that, while the accuracy is implementation-defined, it must be some specific accuracy that is always reproduced by the C implementation. Modern hardware largely conforms to IEEE 754, which the notable exception of handling of subnormal numbers. So this passage may be largely a nod to two things:
- When the compiler evaluates floating-point expressions at compile time, it might use more precision than at run time.
- Software implementations of floating-point arithmetic, intended to support C floating-point on hardware without floating-point instructions, might provide less than correctly rounded (the term for rounding exactly as specified by the IEEE 754 standard) accuracy.
In any case, the sentence is vague enough that we cannot be sure floating-point expressions in C will always return the same results given the same inputs.