69

It seems that the IEEE 754 standard defines 16,777,214 32-bit floating point values as NaNs, or 0.4% of all possible values.

I wonder what the rationale is for reserving so many useful values when only two are essentially needed: one for a signaling NaN and one for a quiet NaN.

Sorry if this question is trivial, I couldn't find any explanation on the internet.
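The count quoted in the question can be checked with a few lines of arithmetic. The sketch below assumes the usual IEEE 754 binary layouts (1 sign bit, then exponent bits, then fraction bits); the helper names `nan_count` and `nan_fraction` are made up for this illustration.

```python
def nan_count(fraction_bits: int) -> int:
    """Number of bit patterns that encode a NaN in an IEEE 754 binary format."""
    # A NaN has an all-ones exponent and a non-zero fraction field.
    # The sign bit is free, hence the factor of 2.
    return 2 * (2**fraction_bits - 1)


def nan_fraction(exponent_bits: int, fraction_bits: int) -> float:
    """Share of all bit patterns of the format that are NaNs."""
    total_bits = 1 + exponent_bits + fraction_bits
    return nan_count(fraction_bits) / 2**total_bits


# (name, exponent bits, fraction bits) for three standard formats
formats = [("binary16", 5, 10), ("binary32", 8, 23), ("binary64", 11, 52)]
for name, e, f in formats:
    print(f"{name}: {nan_count(f):,} NaNs, {nan_fraction(e, f):.4%} of all patterns")
```

For binary32 this gives 16,777,214 NaNs, and the share shrinks as the format gets wider because it is roughly 1 / 2^(exponent bits).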

Pascal Cuoq
leventov
  • 1
    Note that the percentage of possible values that NaN values represent goes down as the size of the format increases, as the number of bits allocated to the exponent is proportionally lower in larger IEEE 754 binary formats. – Pascal Cuoq Nov 06 '13 at 06:53
  • 3
    @Pacerier: the question is correct; there are 2^24 - 2 NaNs in binary32. (The missing two are positive and negative infinity.) – Nick Matteo Jun 18 '16 at 16:55
  • 2
    So dynamic language implementors can use that space for all their other non-float objects. – Alex Shroyer Jan 23 '20 at 15:18

3 Answers

43

The IEEE-754 standard defines a NaN as a number with all ones in the exponent, and a non-zero significand. The highest-order bit in the significand specifies whether the NaN is a signaling or quiet one. The remaining bits of the significand form what is referred to as the payload of the NaN.
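As a rough illustration of that layout, the following sketch classifies a raw binary32 bit pattern using integer masks only. It assumes the common convention that a set highest-order fraction bit means "quiet" (some older architectures used the opposite meaning); `describe_binary32` and the mask names are hypothetical names chosen for this example.

```python
EXP_MASK = 0x7F800000      # the 8 exponent bits
FRAC_MASK = 0x007FFFFF     # the 23 fraction (trailing significand) bits
QUIET_BIT = 0x00400000     # highest-order fraction bit
PAYLOAD_MASK = 0x003FFFFF  # the remaining 22 fraction bits


def describe_binary32(bits: int) -> str:
    """Classify a 32-bit pattern as finite, infinity, or a kind of NaN."""
    if (bits & EXP_MASK) != EXP_MASK:
        return "finite"
    if (bits & FRAC_MASK) == 0:
        return "infinity"
    # All-ones exponent and non-zero fraction: this is a NaN.
    kind = "quiet" if bits & QUIET_BIT else "signaling"
    return f"{kind} NaN, payload=0x{bits & PAYLOAD_MASK:06x}"


for pattern in (0x3F800000, 0x7F800000, 0x7FC00000, 0x7F800001, 0xFFC12345):
    print(f"0x{pattern:08X}: {describe_binary32(pattern)}")
```

Working on the integer pattern rather than on a float value avoids any conversion that might alter the NaN bits.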

Whenever one of the operands of an operation is a NaN, the result is a NaN, and the payload of the result typically equals the payload of one of the NaN operands (the standard recommends this rather than requiring it). Payload preservation is important for efficiency in scientific computing, and at least one company has proposed using NaN payloads for proprietary uses.
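Propagation can be observed with a small experiment on binary64 values. This is only a sketch: it assumes CPython on a platform with IEEE 754 doubles, the helper names are invented here, and whether the payload actually survives the addition depends on the hardware, so the code only relies on the result being a NaN.

```python
import math
import struct

QUIET_NAN_BITS = 0x7FF8000000000000  # all-ones exponent plus the quiet bit
PAYLOAD_MASK = 0x0007FFFFFFFFFFFF    # the low 51 fraction bits


def double_from_bits(bits: int) -> float:
    """Reinterpret a 64-bit integer as a binary64 value."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def bits_from_double(x: float) -> int:
    """Reinterpret a binary64 value as a 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


tagged = double_from_bits(QUIET_NAN_BITS | 0x1234)  # quiet NaN with payload 0x1234
result = tagged + 1.0                               # any arithmetic with a NaN operand

print(math.isnan(result))                            # the result is still a NaN
print(hex(bits_from_double(result) & PAYLOAD_MASK))  # payload after the operation, platform-dependent
```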

In more basic terms, a NaN doesn't carry any useful numerical information, and the entire 32 bits must be reserved anyway, so the unused bits in the significand would be otherwise wasted if there were not a payload defined in the standard.

muru
Robert Harvey
  • 1
    1) As far as I understand, it is proposed to provide tools for utilizing the NaN payload. That looks like an attempt to reduce the harm from wasted values, not something meaningful in itself that was kept in mind when the IEEE-754 standard was developed. – leventov Nov 05 '13 at 23:09
  • 2) I don't understand the last passage of your answer – leventov Nov 05 '13 at 23:10
  • 1
    The standard is intentionally vague, describing those bits *only* as a payload. The last paragraph merely states that if you don't use those bits, they simply go unused. Those bits *have* to be there; might as well make it possible to use them for something. – Robert Harvey Nov 05 '13 at 23:10
  • 27
    “The IEEE-754 standard defines a NaN as a number with all ones in the exponent, and a non-zero significand… the entire 32 bits must be reserved anyway” gives this answer an “it is so because it is so” color (but then again, this is a “Why…?” question). Instead of `MAX_FLOAT` being defined as `0x1.fffffep127`, it could have been defined as `0x1.fffff0p128`, with the last couple of bit patterns only being used for `inf` and `NaN`, instead of an entire exponent value being sacrificed to them. – Pascal Cuoq Nov 06 '13 at 07:04
  • 9
    Speculation as to why this other representation wasn't chosen may include simplicity of hardware implementations, and the fact that rigorous mathematical proofs about floating-point behavior require enough special cases already even as representable normal values are available in whole binades. The payload thing really looks like an afterthought, especially since the standard does not even specify which of the two payloads ends up in the result when a binary operation is applied to two NaNs (a symmetric choice would have been to specify the bitwise or of the two payloads, for instance). – Pascal Cuoq Nov 06 '13 at 07:10
  • 1
    @PascalCuoq: It does indeed appear to be an afterthought, which is why I refrained from speculation about its purpose. – Robert Harvey Nov 06 '13 at 16:30
  • 5
    One thing that should be noted is that the "preservation of NaN payloads" is a "should" not a "shall" provision of the standard. Section 6.2 reads in part: "To facilitate propagation of diagnostic information contained in NaNs, as much of that information as possible should be preserved in NaN results of operations". – njuffa Nov 06 '13 at 17:23
  • 1
    @RobertHarvey, even though IEEE 754-**1985** already had multiple NaN representations, did the concept of signaling and quiet NaNs exist before the IEEE 754-**2008** revision? – Pacerier Aug 22 '14 at 00:11
  • I used to think that the square roots of different negative numbers would give different NaNs, but, at least with my JVM installation on Windows, all the values I've tried give the same NaN: 7FF8000000000000. – Alonso del Arte May 27 '23 at 03:42
  • @AlonsodelArte: Why would you expect a different NaN from a number that is already imaginary? What would you propose as a unique placeholder for each square root of a negative number, and how would that be useful? – Robert Harvey May 27 '23 at 14:44
  • @RobertHarvey I thought maybe the algorithm would reveal some numbers from an intermediate step in the calculation. I wasn't expecting the NaNs to be unique and distinct, I was expecting them to be just different enough to be interesting, e.g., the NaN for sqrt(−150994944.0) might be the same as the NaN for sqrt(−2304.0) but they'd be different from the NaN for sqrt(−2415.02). I'm not making any proposals. – Alonso del Arte May 29 '23 at 00:48
33

According to this series of notes by William Kahan, one of the designers of the IEEE-754 format, multiple NaNs were intended to let hardware or software fill in information about what triggered the NaN in the first place. A computation that would end up resulting in NaN could then run to completion, and the programmer could afterwards write code to analyze what had gone wrong:

IEEE 754's specification for NaN endows it with a field of bits into which software can record, say, how and/or where the NaN came into existence. That information would be extremely helpful for subsequent “Retrospective Diagnosis” of malfunctioning computations, but no software exists now to employ it. Customarily that field has been copied from an operand NaN to the result NaN of every arithmetic operation, or filled with binary 1000...000 when a new NaN was created by an untrapped INVALID operation. For lack of software to exploit it, that custom has been atrophying.

So this appears to have been intentional, with the details left unspecified so that different systems could handle things differently. In retrospect it never really caught on, but it seems like a reasonable idea!
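The "retrospective diagnosis" idea could look roughly like the sketch below, where software stores an origin identifier in the payload of a binary64 quiet NaN and reads it back later. Everything here is an assumption made for illustration: the site numbering, the names `tagged_nan`, `nan_origin` and `checked_sqrt`, and the reliance on CPython with IEEE 754 doubles. It is not an API of any existing library.

```python
import math
import struct
from typing import Optional

QUIET_NAN_BITS = 0x7FF8000000000000  # all-ones exponent plus the quiet bit
PAYLOAD_MASK = 0x0007FFFFFFFFFFFF    # the low 51 fraction bits


def tagged_nan(site_id: int) -> float:
    """Build a quiet NaN whose payload records where it was created."""
    if not 0 < site_id <= PAYLOAD_MASK:
        raise ValueError("site_id must fit in the 51-bit payload and be non-zero")
    return struct.unpack("<d", struct.pack("<Q", QUIET_NAN_BITS | site_id))[0]


def nan_origin(x: float) -> Optional[int]:
    """Return the payload of a NaN, or None if x is not a NaN."""
    if not math.isnan(x):
        return None
    return struct.unpack("<Q", struct.pack("<d", x))[0] & PAYLOAD_MASK


def checked_sqrt(x: float, site_id: int) -> float:
    """Square root that tags invalid inputs instead of raising an error."""
    if math.isnan(x):
        return x  # keep whatever tag the incoming NaN already carries
    if x < 0:
        return tagged_nan(site_id)
    return math.sqrt(x)


value = checked_sqrt(-2.0, site_id=42)
print(nan_origin(value))  # payload read back from the NaN
```

Whether such a tag survives later arithmetic depends on the platform, which is one reason the technique never became portable.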

templatetypedef
  • 7
    Especially reasonable in the days of mainframes where you submitted a job as a stack of punch-cards, and later went to pick up a print-out of the result. Interactive debugging devalues this significantly for most use-cases. – Peter Cordes Mar 28 '19 at 02:04
2

There is likewise a payload for 64-bit floating point numbers, with ~10^15 possible values. Unfortunately, implementations diverge as to how the payload should be transferred between 32- and 64-bit floating point numbers and back again, i.e. whether the most significant or the least significant bits are preserved. Since payload treatment is machine-specific, you need different code to deal with payloads on different machines.
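The two conventions can be contrasted at the level of raw payload integers. The sketch below assumes a 51-bit payload for binary64 quiet NaNs and a 22-bit payload for binary32 quiet NaNs; the function names are hypothetical and no particular machine's behavior is claimed.

```python
F64_PAYLOAD_BITS = 51  # fraction bits below the quiet bit in binary64
F32_PAYLOAD_BITS = 22  # fraction bits below the quiet bit in binary32


def narrow_keep_high(payload64: int) -> int:
    """Keep the most significant payload bits, dropping the low ones."""
    return payload64 >> (F64_PAYLOAD_BITS - F32_PAYLOAD_BITS)


def narrow_keep_low(payload64: int) -> int:
    """Keep the least significant payload bits, dropping the high ones."""
    return payload64 & ((1 << F32_PAYLOAD_BITS) - 1)


# A payload with only its top bit and its bottom bit set.
payload = (1 << 50) | 1

print(hex(narrow_keep_high(payload)))  # only the top bit survives
print(hex(narrow_keep_low(payload)))   # only the bottom bit survives
```

A tag stored in the low bits is lost under the first convention and one stored in the high bits is lost under the second, which is why portable code cannot assume either.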

I wouldn't worry too much about which NaN payload is propagated after a binary operation. NaNs are exceptional values that occur with low probability, and getting two of them as operands of the same operation is less likely still.

Reality Pixels
  • Computers are, however, deterministic, not probabilistic. So if a NaN is meant to occur, it will. – Rainb Aug 14 '20 at 05:51
  • @Rainb Non-deterministic computers exist. Take a look at a run-of-the-mill CPU with multiple threads or GPU. – Thomas Eding Jan 11 '22 at 00:39