The question is: what exact C++ rules determine the type in which an expression is evaluated, and how do they lead to this unfortunate result?
Let's inspect the line:
ul += d;
where d has type double and ul has type unsigned long.
From 7.6.19 Assignment and compound assignment operators:
The behavior of an expression of the form E1 op= E2 is equivalent to E1 = E1 op E2 except that E1 is evaluated only once
So ul += d is equivalent to ul = ul + d.
From 7.6.6 Additive operators:
The additive operators + and - group left-to-right.
The usual arithmetic conversions are performed for operands of arithmetic or enumeration type.
So the usual arithmetic conversions are applied to both ul and d in ul + d.
From 7.4 Usual arithmetic conversions:
[...] This pattern is called the usual arithmetic conversions, which are defined as follows:
[...] Otherwise, if either operand is double, the other shall be converted to double. [...]
So ul is converted to double in ul + d.
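The result type chosen by the usual arithmetic conversions can be checked at compile time. The following is only a minimal sketch, assuming C++17; the alias sum_type is a hypothetical helper introduced for illustration and is not part of the original code.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper (not from the original post): the type that the usual
// arithmetic conversions select for a binary + applied to an A and a B.
template <class A, class B>
using sum_type = decltype(std::declval<A>() + std::declval<B>());

// unsigned long + double is evaluated in double, so ul is converted first.
static_assert(std::is_same_v<sum_type<unsigned long, double>, double>,
              "unsigned long + double should have type double");

// The same rule applies one step lower: with float, the integer becomes float.
static_assert(std::is_same_v<sum_type<unsigned long, float>, float>,
              "unsigned long + float should have type float");
```

If the translation unit compiles, both assertions hold, which matches the conclusion that ul + d is computed as a double.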
From 7.3.11 Floating-integral conversions (emphasis mine):
A prvalue of an integer type or of an unscoped enumeration type can be converted to a prvalue of a floating-point type.
The result is exact if possible.
If the value being converted is in the range of values that can be represented but the value cannot be represented exactly, it is an implementation-defined choice of either the next lower or higher representable value.
If the value being converted is outside the range of values that can be represented, the behavior is undefined.
So if the value of ul cannot be represented exactly in double, it is implementation-defined whether the next lower or the next higher representable value is used.
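Whether a given integer survives the conversion to double unchanged can be tested without relying on the rounding direction. This is a rough sketch, assuming a double with a 53-bit significand (IEEE 754 binary64) and using std::uint64_t to stand in for a 64-bit unsigned long; is_exact_in_double is a hypothetical name chosen for illustration.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper (not from the original post): reports whether v is
// converted to double without any rounding.
bool is_exact_in_double(std::uint64_t v)
{
    // May round to the next lower or next higher representable value.
    const double d = static_cast<double>(v);

    // 2^64 is a power of two, so it is exactly representable in double.
    const double two_pow_64 = std::ldexp(1.0, 64);

    // Guard first: converting a double >= 2^64 back to a 64-bit unsigned
    // integer would itself be undefined behavior.
    return d < two_pow_64 && static_cast<std::uint64_t>(d) == v;
}
```

Under those assumptions, is_exact_in_double(1ull << 53) is true, while is_exact_in_double((1ull << 53) + 1) and is_exact_in_double(UINT64_MAX - 10) are false, whichever neighbour the implementation picks.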
Then, after the addition, the double result is converted back to unsigned long in the assignment to ul, so again from Floating-integral conversions (emphasis mine):
A prvalue of a floating-point type can be converted to a prvalue of an integer type.
The conversion truncates; that is, the fractional part is discarded.
The behavior is undefined if the truncated value cannot be represented in the destination type.
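One way to stay clear of this undefined behavior is to check the range before converting. The sketch below is only an illustration, assuming C++17 and a binary double; to_ulong_checked is a hypothetical helper, not something the standard library provides.

```cpp
#include <cmath>
#include <limits>
#include <optional>

// Hypothetical helper (not from the original post): converts d to
// unsigned long only if the truncated value is representable.
std::optional<unsigned long> to_ulong_checked(double d)
{
    // 2^N, where N is the number of value bits of unsigned long. As a power
    // of two it is exactly representable, unlike the maximum value itself.
    const double upper =
        std::ldexp(1.0, std::numeric_limits<unsigned long>::digits);

    // Values in (-1, 0) truncate to 0, which is representable.
    // NaN fails both comparisons and is rejected as well.
    if (d > -1.0 && d < upper) {
        return static_cast<unsigned long>(d);
    }
    return std::nullopt;
}
```

With such a helper, the assignment could be written as something like `if (auto r = to_ulong_checked(ul + d)) ul = *r;`, leaving the out-of-range case to be handled explicitly.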
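The question asks why the output of this code is 0 for all values n < 1024. Assuming a 64-bit unsigned long and an IEEE 754 double with round-to-nearest, ul is converted to 2^64 for those n, ul + d is still 2^64, and that value cannot be represented in unsigned long, so the conversion back has undefined behavior.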
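The threshold of 1024 follows from the spacing of double values just below 2^64. The following sketch assumes an IEEE 754 binary64 double with the default round-to-nearest, ties-to-even mode, and uses std::uint64_t in place of a 64-bit unsigned long; both function names are hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: distance between 2^64 and the largest double below it.
double gap_below_two_pow_64()
{
    const double two_pow_64 = std::ldexp(1.0, 64);
    return two_pow_64 - std::nextafter(two_pow_64, 0.0);
}

// Hypothetical helper: true if (2^64 - 1 - n) is converted to exactly 2^64,
// i.e. to a value that no longer fits into a 64-bit unsigned integer.
bool rounds_up_to_two_pow_64(std::uint64_t n)
{
    const std::uint64_t v = std::numeric_limits<std::uint64_t>::max() - n;
    return static_cast<double>(v) == std::ldexp(1.0, 64);
}
```

Under these assumptions the gap is 2048, so every integer within 1024 of 2^64 rounds up to 2^64: rounds_up_to_two_pow_64(n) holds for n up to 1023 and stops holding at n = 1024, where the value rounds down to 2^64 - 2048 instead.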
The GCC documentation states that it follows C99 Annex F when converting between floating-point and integer types (see the GCC 11.1.0 docs, implementation-defined behavior, 4.6 Floating point). As far as I can see, C99 Annex F leaves the resulting value unspecified in this case, but requires a floating-point exception to be raised. The following code, with a function copied from cppreference feexceptflag,
#include <cfenv>
#include <cstdio>
#include <iostream>
#include <limits>

// Prints the floating-point exception flags that are currently raised.
void show_fe_exceptions()
{
    std::printf("current exceptions raised: ");
    if (std::fetestexcept(FE_DIVBYZERO)) std::printf(" FE_DIVBYZERO");
    if (std::fetestexcept(FE_INEXACT)) std::printf(" FE_INEXACT");
    if (std::fetestexcept(FE_INVALID)) std::printf(" FE_INVALID");
    if (std::fetestexcept(FE_OVERFLOW)) std::printf(" FE_OVERFLOW");
    if (std::fetestexcept(FE_UNDERFLOW)) std::printf(" FE_UNDERFLOW");
    if (std::fetestexcept(FE_ALL_EXCEPT) == 0) std::printf(" none");
    std::printf("\n");
}

int main()
{
    unsigned long n = 10ul;
    unsigned long ul = std::numeric_limits<unsigned long>::max() - n;
    double d = 1.;
    show_fe_exceptions();
    // ul is converted to double, d is added, and the double result is
    // converted back to unsigned long.
    ul += d;
    show_fe_exceptions();
    std::cout << ul << std::endl;
}
outputs the following on godbolt, confirming that the exception is raised:
current exceptions raised: none
current exceptions raised: FE_INEXACT FE_INVALID
0