
Just another one of those things in programming that have been bothering me a lot.

My Problem

So I have been toying around with an IEEE-754 floating-point converter for a while now, and I noticed the following: when the exponent is set to its maximum value, a mantissa of zero results in positive or negative Infinity (depending on what the sign is set to), while every non-zero mantissa value results in NaN (Not a Number).
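For reference, this behaviour can be checked directly in Python (a small sketch; `struct` is used to reinterpret raw 32-bit patterns as single-precision floats):

```python
import math
import struct

def f32_from_bits(bits):
    """Reinterpret a 32-bit integer as an IEEE-754 single-precision float."""
    return struct.unpack('>f', struct.pack('>I', bits))[0]

# Exponent field all ones (0xFF), mantissa zero -> infinity
print(f32_from_bits(0x7F800000))              # inf
print(f32_from_bits(0xFF800000))              # -inf

# Exponent field all ones, any non-zero mantissa -> NaN
print(math.isnan(f32_from_bits(0x7F800001)))  # True
print(math.isnan(f32_from_bits(0x7FFFFFFF)))  # True
```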

However, is there any specific reason for there being so many NaN values? Honestly, it just looks like an unnecessary waste of range to me. I get why NaN exists, but it's just a singular value and I've never seen a differentiation between "different kinds" of NaN anywhere.

My Idea

How about this: If both the exponent and the mantissa are at their maximum values, treat the value as Infinity (whether positive or negative still depends on the sign). This would get rid of the NaN values and almost double the range of the float and double types.

But now, where do we put the NaN value? Well, there is one other thing in IEEE-754 floating-point types that has two distinct representations: zero. There is +0 and -0. They are always treated the same way, so why not discard -0 and use its encoding for NaN? Of course, that would require the processor to always clear the sign bit whenever something results in 0, but I think that would make close to no difference in performance.
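The two encodings of zero can be inspected with a quick Python sketch (note that Python raises `ZeroDivisionError` for `1.0 / -0.0` rather than returning `-inf`, so `math.copysign` and `math.atan2` are used here to expose the sign bit instead):

```python
import math
import struct

pos, neg = 0.0, -0.0
print(pos == neg)                    # True: +0 and -0 compare equal
print(struct.pack('>f', pos).hex())  # 00000000
print(struct.pack('>f', neg).hex())  # 80000000 -- only the sign bit differs

# ...but the sign is still observable through some operations:
print(math.copysign(1.0, neg))       # -1.0
print(math.atan2(0.0, pos))          # 0.0
print(math.atan2(0.0, neg))          # 3.141592653589793 (pi)
```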

I know the standard will most likely not change anytime soon, but is there any specific reason why this system wouldn't be better? And to come back to the original question: Is there a reason for such a large chunk of values to be considered NaN?

Nyde
  • `-0` and `+0` are used to differentiate various things like n/-0.0 and n/+0.0 which are -inf and +inf respectively. Negative denormals will become -0 if denormals are not supported – phuclv Nov 24 '20 at 02:33
  • @phuclv Except, +inf and -inf are not differentiated using n/+0.0 and n/-0.0, but using +n/0.0 and -n/0.0 (at least in the programming languages I know). – Nyde Nov 24 '20 at 09:17
  • then those languages don't use IEEE-754, because [IEEE-754 requires `1/-0.0 = -∞` and `1/0.0 = +∞`](https://en.wikipedia.org/wiki/Signed_zero). Demo for [Java](https://tio.run/##y0osS9TNSsn@/7@gNCknM1khOSexuFjBNzEzT6GaixMqWFySWAKkyvIzUxRygVIawSVFmXnp0bEKiUXpxZoglZzBlcUlqbl6@aUlegVAyZKcPA1DfV0DPQNNa1yy2lDZWq7a//8B), [C++](https://tio.run/##Sy4o0E1PTv7/XzkzLzmnNCVVwSYzv7ikKDUx144rM69EITcxM09DU6Gai7O4JMXKKjm/tETBxkbBUF/XQM8AxFIHQrCANlQArC41LyXHmqv2/38A). See also [Why is negative zero important?](https://softwareengineering.stackexchange.com/q/280648/98103) – phuclv Nov 24 '20 at 09:29
  • Demo for [C#](https://tio.run/##Sy7WTc4vSv3/v7Q4My9dIbiyuCQ115orOSexuFghoCg/vSgxV6Gai7O4JLEkM1mhLD8zRcE3MTNPo7ikCKghOlYhsSi9WBOkhNM5P684PydVL7wosyTVJzMvVcNQX9dAz0DTGoekNlSylqv2/38A), [JavaScript](https://tio.run/##y0osSyxOLsosKNEts/j/v6AoM69Ew1Bf10DPQJMLxtMG8f7/BwA), [PowerShell](https://tio.run/##K8gvTy0qzkjNydFNzi9K/f/fUF/XQM@Ay1BfG0j9/w8A), [Pascal](https://tio.run/##K0gsTk7M0U0rSP7/Pyk1PTOPi7O8KLMkVcNQX9dAz0DTGsHXhvBT81L0/v8HAA) – phuclv Nov 24 '20 at 09:36
  • @phuclv I'm pretty sure C# uses IEEE-754. – Nyde Nov 24 '20 at 09:38
  • see the C# demo above. 1/-0.0 and 1/+0.0 still results in the correct infinity – phuclv Nov 24 '20 at 09:39
  • @phuclv Oh, sorry, I didn't see your second comment there ':D I was just confused because in the Single and Double CLR structs, +inf is defined as 1D / 0D and -inf as -1D / 0D. Thanks for clearing that up. :D – Nyde Nov 24 '20 at 09:43
  • In fact negative zero is important for producing the most correct result when *graceful underflow* is disabled or not available – phuclv Nov 24 '20 at 09:46
  • Does this answer your question? [Why does IEEE 754 reserve so many NaN values?](https://stackoverflow.com/questions/19800415/why-does-ieee-754-reserve-so-many-nan-values) – Steve Summit Mar 01 '23 at 13:08

2 Answers


NaN is not a singular value.

First, at the level of floating-point data (Level 2 in Table 3.1 of IEEE 754-2008), there are two NaNs, a quiet NaN and a signaling NaN. These also appear at the level of representations of floating-point data (Level 3).

Second, as bit strings that encode floating-point data (Level 4), the first bit of the significand field encodes whether the NaN is quiet (1) or signaling (0), and the remaining bits may be used for diagnostic information (as long as they are not all zero for a signaling NaN, as that would represent +∞), per IEEE 754-2008 6.2.1:

When encoded, all NaNs have a sign bit and a pattern of bits necessary to identify the encoding as a NaN and which determines its kind (sNaN vs. qNaN). The remaining bits, which are in the trailing significand field, encode the payload, which might be diagnostic information (see above).

For example, the bits could be set to the address of the instruction that generated the NaN (or a hash of it). Or an array of floating-point objects might be initialized to NaNs with particular significand bits, and then any NaNs in program results could indicate whether they came from this initial data or were generated during execution.
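To illustrate, here is a sketch in Python that builds a quiet double-precision NaN carrying an arbitrary payload and reads it back. The payload survives this bit-level round trip, though it is not guaranteed to be preserved through arithmetic:

```python
import math
import struct

def quiet_nan_with_payload(payload):
    """Build a quiet 64-bit NaN: sign 0, exponent field all ones (0x7FF),
    quiet bit (top significand bit) set, payload in the low 51 bits."""
    assert 0 <= payload < 1 << 51
    bits = (0x7FF << 52) | (1 << 51) | payload
    return struct.unpack('>d', struct.pack('>Q', bits))[0]

def payload_of(x):
    """Extract the low 51 payload bits of a double's significand field."""
    bits = struct.unpack('>Q', struct.pack('>d', x))[0]
    return bits & ((1 << 51) - 1)

n = quiet_nan_with_payload(0xBEEF)
print(math.isnan(n))       # True
print(hex(payload_of(n)))  # 0xbeef
```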

I vaguely recall one of the IEEE-754 committee members had some other creative use for the payload field, but I do not recall what it was.

Eric Postpischil
  • [What uses do floating point NaN payloads have?](https://stackoverflow.com/q/33967804/995714) – phuclv Nov 24 '20 at 02:39
  • Okay, thanks a lot, this perfectly answers my question. However, I still wonder... Is any of this actually used in practice? Because, judging from the top answer on the question linked by @phuclv I assume that this is not the case because the NaN values are not really "stable", as in they might change unpredictably throughout operations. – Nyde Nov 24 '20 at 09:22

It is pretty astonishing, isn't it? In IEEE-754 single precision, there are 16,777,214 distinct NaN values, and in double precision, there are no fewer than 9,007,199,254,740,990, or 9 quadrillion of them.
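Those counts follow directly from the encoding: per sign, every bit pattern with the exponent field all ones is special, and exactly one of them (fraction zero) is an infinity. A quick check:

```python
def nan_count(fraction_bits):
    """Number of NaN encodings: two signs times (2**fraction_bits - 1)
    non-zero fraction patterns under an all-ones exponent field."""
    return 2 * (2**fraction_bits - 1)

print(nan_count(23))  # 16777214 -- single precision
print(nan_count(52))  # 9007199254740990 -- double precision
```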

This is speculation, but I've assumed that part of the reason was simply: ease of implementation. IEEE-754 floating-point is already hard to implement, especially due to the presence of subnormals. If you take one entire exponent value (one whole binade) and call it "special", you can probably deal with the whole infinity-versus-NaN-versus-normal-versus-subnormal scheme significantly more efficiently — fewer clock cycles, fewer gates — than if you tried to have that highest binade be mostly normal, but with just a couple of special values carved out for infinity and NaN.

One thing to remember is that when you're designing the hardware (or the microcode, or the firmware, or the software) to do this, you have to handle all the cases, and for every operation. Infinities and NaNs can occur both on input, and on output, of just about any operation. There are lots of special cases already, and adding even more can cause things to snowball out of control. (Just ask Intel, who once managed to get floating-point division quite significantly wrong, for some fraction of inputs.)

It's also a useful property of IEEE-754 floating-point that, for values of a given sign, the underlying bit patterns increase monotonically along with the corresponding values. The bit pattern for zero is 1 less than the bit pattern for the smallest subnormal, and the bit pattern for infinity is 1 more than that of the largest normal number. So an "improved" scheme would probably want to place infinity a few steps below the top of the last binade, so that NaN (or some small number of distinct NaNs) could sit above it (that is, not down near 0 as you had suggested).
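This ordering property is easy to verify in Python (a sketch for double precision; bit patterns are obtained by reinterpreting doubles as 64-bit integers):

```python
import math
import struct

def bits_of(x):
    """Raw 64-bit pattern of an IEEE-754 double."""
    return struct.unpack('>Q', struct.pack('>d', x))[0]

smallest_subnormal = 5e-324            # 2**-1074
largest_normal = (2 - 2**-52) * 2**1023

print(bits_of(0.0))                    # 0
print(bits_of(smallest_subnormal))     # 1: one step above zero
print(hex(bits_of(largest_normal)))    # 0x7fefffffffffffff
print(bits_of(math.inf) - bits_of(largest_normal))  # 1: one step above
```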

Finally, it may be worth pointing out that other floating-point formats typically wasted some fraction of their values, too. For example, I believe the DEC floating-point formats (PDP-11 and VAX) essentially expended an entire binade just on zero.

Steve Summit