24

Ignoring why I would want to do this, the IEEE 754 floating-point standard doesn't define the behavior for the following:

float h = NAN;
printf("%x %d\n", (int)h, (int)h);

Gives: 80000000 -2147483648

Basically, regardless of what value of NAN I give, it outputs 80000000 (hex) or -2147483648 (dec). Is there a reason for this and/or is this correct behavior? If so, how come?

I'm producing the different NaN values as described here: How can I manually set the bit value of a float that equates to NaN?

So basically, are there cases where the payload of the NaN affects the output of the cast?

Thanks!

Chris

3 Answers

18

The result of a cast of a floating point number to an integer is undefined/unspecified for values not in the range of the integer variable (±1 for truncation).

Clause 6.3.1.4:

When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.

If the implementation defines __STDC_IEC_559__, then for conversions from a floating-point type to an integer type other than _Bool:

if the floating value is infinite or NaN or if the integral part of the floating value exceeds the range of the integer type, then the "invalid" floating-point exception is raised and the resulting value is unspecified.

(Annex F [normative], point 4.)

If the implementation doesn't define __STDC_IEC_559__, then all bets are off.
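
Since the cast itself gives no usable guarantee for NaN, infinities, or out-of-range values, one option is to test the value before converting. The following is only a minimal sketch: the helper name `double_to_int_checked` is made up for illustration, and it assumes `int` is narrow enough (32 bits, for example) that `INT_MIN - 1` and `INT_MAX + 1` are exactly representable as `double`.

```c
#include <limits.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: convert x to int only when the result is well defined.
   Returns false and leaves *out untouched for NaN, infinities and values
   whose truncated integral part does not fit in an int. */
bool double_to_int_checked(double x, int *out)
{
    if (isnan(x))
        return false;

    /* Assumption: both bounds are exact as doubles (true for a 32-bit int).
       Infinities fail these comparisons and are rejected here. */
    if (!(x > (double)INT_MIN - 1.0 && x < (double)INT_MAX + 1.0))
        return false;

    *out = (int)x;  /* in range, so the cast simply truncates toward zero */
    return true;
}
```

A caller would check the return value and only use `*out` when it is true; what to do with a rejected NaN is left to the application.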

Daniel Fischer
  • 1
    Given the fact that the behavior is undefined, is the result I got the common one for this undefined behavior? i.e. is anyone aware of a system where I would get different behavior than this? The 754 spec says the behavior of NaN operations is that the payload should be carried through. – Chris Apr 28 '12 at 19:00
  • I'm not aware of an implementation that does otherwise, but I'm not familiar with anything beyond a bit of gcc. gcc produces `INT_MIN` for all out-of-range conversions to `int`, as far as I know (but that's also only very little). – Daniel Fischer Apr 28 '12 at 19:05
  • 1
    I'm pretty sure you mean *gcc on x86*. There's no reason to assume the result should be the same everywhere else; this is likely an artifact of the fpu's behavior. – R.. GitHub STOP HELPING ICE Apr 28 '12 at 22:30
  • 3
    Oh, um, *blush*, I sure do. But of course, non-x86 hardware is a myth invented by apple to sell more Macs. (Thanks for the correction, @R..) – Daniel Fischer Apr 28 '12 at 23:39
  • 1
    I thought it was a myth invented by Google to sell phones. ;-) – R.. GitHub STOP HELPING ICE Apr 29 '12 at 00:00
  • GCC itself will give you `0` when converting NAN to int at compile-time (rather than run-time, where you get INT_MIN). So even on a single platform you can get two different values, depending on whether the compiler was able to determine your NAN as a constant value. – John Zwinck Aug 11 '14 at 05:06
  • The result is not undefined, but one gets an unspecified value (see ISO C17, F.4). With GCC 4.6 to the trunk (10.0.0 snapshot), under Linux/x86_64 (Debian/unstable), I get for conversions from `volatile double` NAN to `int`, `unsigned int`, `long`, `unsigned long`: `INT_MIN`, 0, `LONG_MIN`, `LONG_MAX`+1 respectively (the last two values have the same representation, but this is not the case for the `int` and `unsigned int` results). – vinc17 Jan 27 '20 at 16:19
  • @vinc17 Thanks for the heads-up. Turns out that was already the case in C11, but I hadn't read the annex and went by just what was stated in 6.3.1.4. – Daniel Fischer Jan 27 '20 at 16:51
  • To remind: "defines `__STDC_IEC_559__`" does not mean that the implementation conforms to the specifications in the annex F. It means that it _may_ (or has an intent to) conform to the specifications in the annex F. – pmor Mar 28 '22 at 12:02
12

There is a reason for this behavior, but it is not something you should usually rely on.

As you note, IEEE-754 does not specify what happens when you convert a floating-point NaN to an integer, except that it should raise an invalid operation exception, which your compiler probably ignores. The C standard says the behavior is undefined, which means not only do you not know what integer result you will get, you do not know what your program will do at all; the standard allows the program to abort or get crazy results or do anything. You probably executed this program on an Intel processor, and your compiler probably did the conversion using one of the built-in instructions. Intel specifies instruction behavior very carefully, and the behavior for converting a floating-point NaN to a 32-bit integer is to return 0x80000000, regardless of the payload of the NaN, which is what you observed.

Because Intel specifies the instruction behavior, you can rely on it if you know the instruction used. However, since the compiler does not provide such guarantees to you, you cannot rely on this instruction being used.
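
To observe this on a particular machine, one can build NaNs with different payloads and force the conversion to happen at run time. This is a rough sketch, not something the C standard backs: the helper names `nan_with_payload` and `runtime_cast` are invented for illustration, `float` is assumed to be a 32-bit IEEE-754 binary32, and the value returned by the cast is whatever the target happens to produce.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a quiet NaN carrying the given 22-bit payload.
   Assumes float is a 32-bit IEEE-754 binary32. */
float nan_with_payload(uint32_t payload)
{
    /* Exponent all ones plus the quiet bit, then the payload in the low bits. */
    uint32_t bits = 0x7fc00000u | (payload & 0x003fffffu);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical helper: convert through a volatile so the compiler cannot
   fold the cast at compile time. The result is not guaranteed by C. */
int runtime_cast(float f)
{
    volatile float v = f;
    return (int)v;
}
```

Printing `runtime_cast(nan_with_payload(p))` with `%x` for a few values of `p` would, on the x86 setup described above, be expected to show 80000000 each time. Other targets or compilers may print something else.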

Eric Postpischil
  • 2
    It may be true that Intel processors convert NAN to a 32-bit int as `0x80000000`, but this won't help you if your NAN is a constant value as determined by your compiler. In such cases you may see values other than INT_MIN, because the conversion is done at compile time rather than runtime, so Intel's x86 semantics never come into play. For example, when GCC converts NAN to int at compile time, it gives 0. – John Zwinck Aug 11 '14 at 05:00
  • https://www.felixcloutier.com/x86/cvttsd2si is the instruction in question. x86's "integer indefinite" value is MSB=1, rest = 0, i.e. INT_MIN or INT64_MIN. As you say, a different usage of the instruction can have different results, e.g. float -> uint32_t on x86-64 will often convert to int64_t and take the low half because that's basically free in asm, and x86 (before AVX-512) doesn't provide FP -> unsigned conversions directly. (C doesn't define the behaviour of negative FP -> unsigned; the modular reduction is only for wide integral type -> unsigned). – Peter Cordes Apr 12 '21 at 00:01
  • As you say, other ISAs can be different, e.g. [unsigned conversion in C works as expected on x86 but not ARM](https://stackoverflow.com/q/60925860) – Peter Cordes Apr 12 '21 at 00:01
3

First, a NAN is everything not considered a float number according to the IEEE standard. So it can be several things. In the compiler I work with there is NAN and -NAN, so it's not about only one value.

Second, every compiler has its isnan set of functions to test for this case, so the programmer doesn't have to deal with the bits himself. To summarize, I don't think peeking at the value makes any difference. You might peek the value to see its IEEE construction, like sign, mantissa and exponent, but, again, each compiler gives its own functions (or better say, library) to deal with it.

I do have more to say about your testing, however.

float h = NAN;
printf("%x %d\n", (int)h, (int)h);

The cast you did truncates the float when converting it to an int. If you want to get the integer represented by the float's bits, do the following:

printf("%x %d\n", *(int *)&h, *(int *)&h);

That is, you take the address of the float, then refer to it as a pointer to int, and eventually take the int value. This way the bit representation is preserved.
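
One caveat: reading a `float` through an `int *` breaks C's strict aliasing rules, so an optimizing compiler is not obliged to make it work. Copying the bytes with `memcpy` is the well-defined alternative, and the same approach works for writing a bit pattern back. A minimal sketch, with made-up helper names and assuming a 32-bit `float`:

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "assumes a 32-bit float");

/* Hypothetical helper: read the bit pattern of a float. */
uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* byte copy, no aliasing violation */
    return bits;
}

/* Hypothetical helper: turn a bit pattern back into a float. */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With these, the bits of `h` can be printed with `printf("%" PRIx32 "\n", float_to_bits(h));` (using `<inttypes.h>`), and a specific NaN can be produced with something like `h = bits_to_float(0x7fc00001u);`.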

Israel Unterman
  • Hi @Israel, `printf("%x %d\n", *(int *)&h, *(int *)&h);` is a good way to get the bit representation from an address. Is there any way to write the bit representation back to an address? Say, write 0x7ff8000000000000 to `&h`? – hukeping Jan 17 '20 at 07:17
  • @hukeping Are you aware that `0x7ff8000000000000` is a 64 bit integer value while a `float` is stored with 32 bits only? – Scheff's Cat Jul 12 '22 at 13:05