Strange behavior when casting an int to float in C

Question

I have a question about the output of the following C program. I compiled it with both Visual C++ 6.0 and MinGW32 (gcc 3.4.2).

#include <stdio.h>

int main() {
    int x = 2147483647;
    printf("%f\n", (float)2147483647);
    printf("%f\n", (float)x);
    return 0;
}

The output is:

2147483648.000000
2147483647.000000

My question is: why do the two lines differ? When you convert the integer value 2147483647 to the IEEE 754 single-precision floating-point format, it is rounded to 2147483648.0. So I expected both lines to print 2147483648.000000.

EDIT: The value "2147483647.000000" can't be a single-precision floating-point value, since the number 2147483647 can't be represented exactly in the IEEE 754 single-precision floating-point format without loss of precision.

Looks compiler-specific. ideone gives equal numbers. MinGW GCC 4.5.2 gives the same result as yours. — Eugene Sh., Nov 24 '14 at 20:03
I think this is compiler dependent. For gcc 4.8.2, it is showing 2147483648.000000 in both the cases. — MrTambourineMan, Nov 24 '14 at 20:04
An optimization bug. Is it the same output on both compilers, mingw/gcc-3.4.2 and vs6? BTW: Both are **old**. — Deduplicator, Nov 24 '14 at 20:04
@Deduplicator: yes, same output (btw: in gcc 4.1.2, the same thing happens). But in gcc 4.2.1 and in CompileGround's C compiler (gcc 4.8.3), which are more recent, the output is correct and both lines are equal. It really seems to be a bug in older versions, because it's inconsistent, and also because 2147483647 can't be represented exactly in single-precision floating-point format. — favq, Nov 24 '14 at 20:32
As others said, this seems to be a compiler/optimization bug. gcc 4.9 with `-O2` or `-O3` produces `2147483648.000000` for both, and without any optimization it produces output similar to yours in the post. — P.P, Nov 24 '14 at 20:39
In both cases, the argument is converted from `int` to `float` (by the cast), and then from `float` to `double` (because `printf` is a variadic function). In principle, both calls should print `2147483648.000000`, given the characteristics of the implementations you're using. Some things to try: Examine the generated assembly code (the compiler might have optimized it, collapsing the two conversions into one). Try storing the two expressions `(float)2147483647` and `(float)x` into `float` objects and printing their values, and/or examining their representations. — Keith Thompson, Nov 24 '14 at 22:46
Could you test `printf("%.8f\n", (float)x)` for a more precise result? — chux - Reinstate Monica, Nov 24 '14 at 23:08
@chux: The value of `FLT_EVAL_METHOD` in this case is defined as `__FLT_EVAL_METHOD__` in `float.h`. I can't find where this `__FLT_EVAL_METHOD__` is defined, but, printing its value, I get `2`. And I tried `printf("%.8f\n", (float)x)` in gcc 3.4.2 and the output is `2147483647.00000000`. — favq, Nov 25 '14 at 12:23

chux - Reinstate Monica · Accepted Answer · 2014-11-25T17:25:34.560

12

In both cases, the code converts an integer to float and then to double. The conversion to double occurs because a float value is passed to a variadic function.

Check your setting of FLT_EVAL_METHOD; I suspect it has a value of 1 or 2 (OP reports 2 with at least one compiler). This allows the compiler to evaluate float "... operations and constants to the range and precision" greater than float.

Your compiler appears to have optimized (float)x by converting directly from int to double, skipping the intermediate rounding to float. This saves work at run time.

(float)2147483647 is evaluated at compile time, where performance is not an issue, so the compiler performed the full int to float to double sequence and the value was rounded to float accuracy.

[Edit2] It is interesting that the C11 spec is more specific than the C99 spec with the addition of "Except for assignment and cast ...". This implies that C99 compilers were sometimes allowing the int to double direct conversion, without first going through float and that C11 was amended to clearly not allow skipping a cast.

With C11 formally excluding this behavior, modern compilers should not do this, but older ones, like OP's, might; by C11 standards that is a bug. Unless some other C99 or C89 specification is found to say otherwise, this appears to be allowable compiler behavior.

[Edit] Taking comments together by @Keith Thompson, @tmyklebu, @Matt McNabb, the compiler, even with a non-zero FLT_EVAL_METHOD, should be expected to produce 2147483648.0.... Thus either a compiler optimization flag is explicitly over-riding correct behavior or the compiler has a corner bug.

C99dr §5.2.4.2.2 8 The values of operations with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type. The use of evaluation formats is characterized by the implementation-defined value of FLT_EVAL_METHOD:

-1 indeterminable;

0 evaluate all operations and constants just to the range and precision of the type;

1 evaluate operations and constants of type float and double to the range and precision of the double type, evaluate long double operations and constants to the range and precision of the long double type;

2 evaluate all operations and constants to the range and precision of the long double type.

C11dr §5.2.4.2.2 9 Except for assignment and cast (which remove all extra range and precision), the values yielded by operators with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type. The use of evaluation formats is characterized by the implementation-defined value of FLT_EVAL_METHOD

-1 (Same as C99)

0 (Same as C99)

1 (Same as C99)

2 (Same as C99)

edited Nov 25 '14 at 17:25

answered Nov 24 '14 at 22:16

chux - Reinstate Monica


I never quite understood how folks read out of that paragraph that it's OK for the compiler to elide assignments and casts, even though they remove all extra range and precision. – tmyklebu Nov 24 '14 at 22:26
@tmyklebu It is easy to agree that even with `FLT_EVAL_METHOD == 1`, both methods should generate `2147483648`. My assertion is that, because of `FLT_EVAL_METHOD == 1`, the compiler did the `(double)((float) x)` as `(double) x` and rendered 2147483647. As to whether this is legitimate, I leave that to language lawyers. – chux - Reinstate Monica Nov 24 '14 at 22:35
The paragraph starts off with "Except for assignment and cast" , but the code we are discussing is *cast*, so the rest of this paragraph doesn't apply (in particular, the relevance of FLT_EVAL_METHOD) – M.M Nov 24 '14 at 22:36
@Matt McNabb Your assertion that "we are discussing is cast, so the rest of this paragraph doesn't apply" is likely correct, and the code should not have produced 2147483647. I am not claiming the compiler generated correct code; I am asserting that it used 2 different conversion paths because `FLT_EVAL_METHOD` was non-zero. The fact that the outputs differ is likely due to incorrectly following the "Except for ... cast .." clause, as you point out. – chux - Reinstate Monica Nov 24 '14 at 22:43
This answer is wrong. No matter what the value of `FLT_EVAL_METHOD` is, an explicit cast to `float` throws away excess precision. This is required by the C standard. If MSVC is not doing that, it's a bug in the compiler. – R.. GitHub STOP HELPING ICE Nov 25 '14 at 04:49
@R The question is "why are both lines different". Certainly the 2147483647 output is due to direct `int` to `double` conversion, omitting the in-between `float`. This omission occurs when `FLT_EVAL_METHOD` is non-zero. Yet it is not expected per C11 when a cast occurs. C99 has different wording which may explain things or this could be caused by an old compiler bug as you suggest or an unposted compiler option over riding normal compilation behaviors. – chux - Reinstate Monica Nov 25 '14 at 05:11
@R..: Where in C89, which was the only extant C standard at the time VC++6 was released, is this forbidden? – tmyklebu Nov 25 '14 at 05:17
@R.. In reviewing C99, I was unable to find support for "explicit cast to float throws away excess precision. This is required by the C standard". OTOH it is explicit in C11, which certainly came after OP's Visual C++ 6.0. That added phrase in C11, "Except for assignment and cast", certainly applies here. – chux - Reinstate Monica Nov 25 '14 at 14:13
About the difference between standards: please have a look at http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_290.htm against C99: the text in original C99 was not precise enough, and C11 (or C99+TC3) are better worded; but there is no change in interpretation: because of rest of the standard about assignments and casts, compilers claiming C99-conformance should round to float precision in any case (thus 2147483647 is "impossible".) C89 was likewise, by the same reading. – AntoineL Mar 01 '15 at 23:16
@tmyklebu: in 3.3.4 (the cast operator.) Same text can be found at the corresponding place in K&R1, BTW. – AntoineL Mar 01 '15 at 23:31
Whatever FLT_EVAL_METHOD is, compilers must perform a conversion to the requested type when there is an explicit cast. gcc seems to have some bugs in that area. For example, if you have double x, y; then x + y may be evaluated in extended precision, but (double) (x + y) _must_ convert this to double precision. gcc apparently removes the cast in the front end, assuming that no conversion is needed when a value that is "officially" double is cast to double, even if the value is actually extended precision. – gnasher729 Oct 17 '15 at 12:30

score 7 · Answer 2 · answered Nov 24 '14 at 22:58

This is certainly a compiler bug. From the C11 standard we have the following guarantees (C99 was similar):

Types have a set of representable values (implied)
All values representable by float are also representable by double (6.2.5/10)
Converting float to double does not change the value (6.3.1.5/1)
Casting int to float, when the int value is in the set of representable values for float, gives that value.
Casting int to float, when the magnitude of the int value is less than FLT_MAX and the int is not a representable value for float, causes either the next-highest or next-lowest float value to be selected, and which one is selected is implementation-defined. (6.3.1.4/2)

The third of these points guarantees that the float value supplied to printf is not modified by the default argument promotions.

If 2147483647 is representable in float, then (float)x and (float)2147483647 must give 2147483647.000000.

If 2147483647 is not representable in float, then (float)x and (float)2147483647 must either give the next-highest or next-lowest float. They don't both have to make the same selection. But this means that a printout of 2147483647.000000 is not permitted¹, each must either be the higher value or the lower value.

¹ Well - it's theoretically possible that the next-lowest float was 2147483646.9999999... so when the value is displayed with 6-digit precision by printf then it is rounded to give what was seen. But this isn't true in IEEE754 and you could easily experiment to discount this possibility.

Would the next lower `float` be more like `2147483520.0` and not `2147483646.9999999...`? The next lower `double` may be `2147483646.9999998...`. — chux - Reinstate Monica, Nov 24 '14 at 23:08
C99 is from 1999. C89 would be the only reasonable C standard to refer to if you're trying to call this a bug in VC++6. (I'm not convinced it is a VC++6 bug; it's just an old compiler from before it was impressed upon C and C++ implementors that butchering people's floating-point code is bad.) — tmyklebu, Nov 24 '14 at 23:44
@tmyklebu I don't have a copy of C89 text so I'm assuming that this is generally unchanged — M.M, Nov 24 '14 at 23:45
Concerning "C11 standard we have the following guarantees (C99 was similar):", I do not find the explicit guarantee in C99 as C11. — chux - Reinstate Monica, Nov 25 '14 at 14:15

score 2 · Answer 3 · answered Nov 24 '14 at 20:27

On the first printf, the conversion from integer to float is done by the compiler. On the second one, it is done by the C runtime library. There is no particular reason why they should produce answers identical at the limits of their precision.

Lee Daniel Crocker


Compiler can perform the conversion for the second one too, not necessarily leave it to C runtime. – P.P Nov 24 '14 at 20:31
Well, one particular reason is that one would hope that an optimisation wouldn't change an observable result... – Oliver Charlesworth Nov 24 '14 at 20:31
@OliverCharlesworth: But it may, if the behavior is unspecified. (Is it, in C89?) – Deduplicator Nov 24 '14 at 20:34
If your code depends on things like this being equal, it's bad code. – Lee Daniel Crocker Nov 24 '14 at 20:35
What I'm getting at is "what leeway does the compiler have, based on the constraints of the standard?". – Oliver Charlesworth Nov 24 '14 at 20:37
IEEE-754 float (single-precision, 32 bits) guarantees conversions in and out to 9 digits. You have 10 here. – Lee Daniel Crocker Nov 24 '14 at 20:42
Well, it guarantees in terms of bits not decimal digits, but I know what you mean. But what do we have in terms of rounding/truncation (which is what must be going on here)? Is the compiler allowed to elide the rounding in `(double)(float)myInt`? – Oliver Charlesworth Nov 24 '14 at 21:08
Frankly, I'm not interested enough to look it up. I will never write code where it matters. – Lee Daniel Crocker Nov 24 '14 at 21:13
@LeeDanielCrocker: You have weird ideas of what constitutes "bad code." If you cannot rely on floating-point numbers working the way they're supposed to, you can't really write numerical code that's both fast and maintainable. – tmyklebu Nov 24 '14 at 22:29
Fortunately I can't think of the last time I needed to write floating point code of any kind, but if I did, I wouldn't expect ten digits of precision from a format that only promises nine. – Lee Daniel Crocker Nov 24 '14 at 22:33
@LeeDanielCrocker: Do you know what a floating-point number is? Do you understand why it makes no sense to talk about "digits of precision" here? Do you know what `2147483648` is and why it's representable as an IEEE binary32 `float`? Do you know why `2147483647` isn't representable as an IEEE binary32 `float`? – tmyklebu Nov 24 '14 at 22:35
Yes, yes, I know. But I still think these results are well within acceptable limits of the format. Actually, I can remember the last time I wrote FP code: the Ziggurat function of ojrandlib. Feel free to check my math. :-) https://github.com/lcrocker/ojrandlib/blob/master/source/library/ziggurat.c – Lee Daniel Crocker Nov 24 '14 at 22:40
@LeeDanielCrocker: You think it's acceptable for a widening conversion to behave nondeterministically? Would it be cool with you if `(double)42` evaluated to `42.375`? – tmyklebu Nov 25 '14 at 05:18
Is that really the level of argument you want to descend to? I'm done. Clearly you have no interest in honest debate. – Lee Daniel Crocker Nov 25 '14 at 16:28

score 0 · Answer 4 · answered Nov 24 '14 at 22:24

Visual C++ 6.0 was released last century, and I believe it predates standard C++. It is wholly unsurprising that VC++ 6.0 exhibits broken behaviour.

You'll also note that gcc-3.4.2 is from 2004. Indeed, you're using a 32-bit compiler. gcc on x86 plays rather fast and loose with floating-point math. This may technically be justified by the C standard if gcc sets FLT_EVAL_METHOD to something nonzero.

tmyklebu


"VC6 is old and I'm not surprised" doesn't really provide much interesting information or explanation here. – Jason C Nov 25 '14 at 01:26
@JasonC: I guess not, but it makes no claim to follow the standard that was released a couple months after it was. Tough to blame it for all its insanity. – tmyklebu Nov 25 '14 at 05:16
Note: How about "Visual C++ 6.0 was released last millennium"? – chux - Reinstate Monica Nov 25 '14 at 05:18

score -1 · Answer 5 · answered Nov 24 '14 at 22:28

Some of you said this is an optimization bug, but I somewhat disagree. I think it's a reasonable floating-point precision error and a good example of how floating point works.

http://ideone.com/Ssw8GR

Maybe the OP could paste my program, compile it with their own compiler, and see what happens. Or try:

http://ideone.com/OGypBC

(with explicit float conversion).

Anyway, if we calculate the relative error, it is about 4.656612875245797e-10, which should be considered pretty precise.

It could also be related to how printf formats the value.

This is not how floating-point works. It doesn't decide to round or not round things willy-nilly. It has a well-defined semantics that ought to be followed. — tmyklebu, Nov 24 '14 at 22:30
