I am not asking "why does this calculation result in NaN", I am asking "Why does NaN exist at all, rather than resulting in an exception or error?"

I've been wondering about this for a while, and have occasionally discussed it with people.

The only answers I've gotten have been "Well you don't want to Try-Catch every divide op, do you?", or "There are scenarios where NaN is a valid result".

That being said, I've never received a concrete example of NaN being a valid result. Assuming NaN can never be a valid result, I do not understand why it exists at all. If it ever appears, to my knowledge, you have a bug. Period.

You want the program to crash and die then and there so that you can easily find where it went wrong, rather than letting the program run amok, possibly write corrupt data, possibly send corrupt data, or do all kinds of nasty stuff before inevitably crashing. (As "The Pragmatic Programmer" says: "Crash, Don't Trash".)

Now, I believe the IEEE 754 designers were vastly more intelligent than me, which leads me to believe there HAS to be a reason for its existence. What is this reason?

Peter - Reinstate Monica
Harald Kanin
  • 3
    https://stackoverflow.com/a/10059796/1644522 – Taras Shcherban Dec 11 '18 at 08:46
  • 3
    For one, exception handling is often very slow, so the option to not bulk up the code with it to handle some simple `NaN` cases is welcomed. Also, there is no exception handling in C. – Blaze Dec 11 '18 at 08:50
  • 1
    The reason can be found here: https://stackoverflow.com/questions/2618059/in-java-what-does-nan-mean. It is used for values that are arithmetically undefined in IEEE 754, such as the result of 0/0. – Tetsuya Yamamoto Dec 11 '18 at 08:50
  • I was answering, but the question was closed: - It was invented before most languages had exceptions. - It can be a valid result; I have used it. I would check whether I got a division by zero and do the appropriate thing (yes, division by zero is not defined in the general case, but if you know the situation, you can do the right thing). - Exceptions should only be used for programming errors; therefore trying to operate on a NaN should throw an exception, not the creation of it. But what counts as an operation? (Add invariants and pre/postconditions to your code.) – ctrl-alt-delor Dec 11 '18 at 08:57
  • One reason given in the wikipedia article is: "The propagation of quiet NaNs through arithmetic operations allows errors to be detected at the end of a sequence of operations without extensive testing during intermediate stages." – Peter - Reinstate Monica Dec 11 '18 at 08:58
  • 2
    @πάνταῥεῖ I'm not sure how this can be too broad. It's a clear question about a design decision in IEEE 754 that is debatable. Five minutes of googling have only resulted in a single, half-assed argument for me. The reason "there weren't exceptions in 1985" is not convincing; there were signals. – Peter - Reinstate Monica Dec 11 '18 at 09:00
  • 1
    @Blaze exception handling was not common in 1985 anywhere, if I'm not mistaken. But one could always have trapped. – Peter - Reinstate Monica Dec 11 '18 at 09:10
  • 2
    "rather than resulting in an exception" is the key; exceptions are controversial. They were a disaster in the previous century: mixing libraries that expect exceptions to be enabled with ones that expect them to be disabled was quite a nightmare. The only real way to get ahead was to just disable them and let NaN do its job. You can override that choice if you're courageous. – Hans Passant Dec 11 '18 at 09:12
  • 1
    Bold part of [this answer](https://stackoverflow.com/a/23666623/1312382) (and initial part of same paragraph) - same question as cited by @Shcherban... – Aconcagua Dec 11 '18 at 09:45
  • On most platforms, you can turn off silent handling of exceptions (note: "exception" means a different thing in this context, not the usual high-level language feature). So, when an exception occurs (an invalid operation that results in NaN; there are other exceptions too), the CPU can generate a trap (interrupt). So the real question is: why did disabled traps become the default? Note that there are platforms where turning on traps actually works and can be used to catch errors early. For example, on Linux this mostly works (meaning that libraries are mostly exception free). – geza Dec 11 '18 at 09:49
  • 1
    Note that the mere existence of NaN is not different from other types: For pointers, there is a null pointer which does not point to any object. For integers, there may be trap representations. In Swift, there are Optionals. So it is not the presence of a non-value value in a type that is controversial. Rather, it is the propagation of NaNs through operations and/or the `NaN != NaN` property. – Eric Postpischil Dec 11 '18 at 12:17
  • 1
    Re “If it ever appears, to my knowledge, you have a bug. Period.”: This is false for several reasons, but one I want to point out is that fixed-precision arithmetic is, by logical necessity, only an approximation to real arithmetic. We **desire** to do mathematics on real numbers, but **physics and practicality** limit us to finite precision. We implement mathematical algorithms using the machines we have. In consequence, some calculations produce incorrect results. Limiting these errors is very difficult. Circumstances will arise where an input to a function is outside its domain because… – Eric Postpischil Dec 11 '18 at 12:26
  • … previously calculated results have errors. Therefore, it is **not necessarily an error in design** when `sqrt` is called with a negative value or `acos` is called with a value greater than one. It is part of a design compelled by logic and physics. Providing a NaN result in such situations is part of a strategy affording flexible options to the designers of programs. One option would be to request exceptions in such circumstances or to test for them. Another is to insert a NaN and continue. The latter is beneficial for some situations. – Eric Postpischil Dec 11 '18 at 12:29
  • I may delete and reword the above when I have time, but I want to clarify this point: You may have an algorithm that is perfect when implemented with real mathematics—it never evaluates a function with an input outside the function’s domain—but that does encounter domain errors when implemented with fixed-precision arithmetic. For such a program, domain errors are a consequence of logic and physics, not of incorrect design. They are something the program is pushed toward against the will of the designer, which the designer must then deal with. – Eric Postpischil Dec 11 '18 at 12:36
  • Another point to make is the IEEE-754 standard is designed to provide options for different users, not a one-size-fits-all solution. The standard provides for traps if somebody wants to interrupt their code or silent exception recording and results substitution (not just with NaNs but also with zeros for underflow, infinities for overflow, and approximate results for inexact operations) for somebody who wants to continue. The practical availability of these options or lack thereof is a result of subsequent language, software, and hardware design and economics. – Eric Postpischil Dec 11 '18 at 13:36
  • @EricPostpischil: have you encountered a problem, where NaN was useful? I mean, as a result of a calculation. I ask this, because all of the time, I avoid invalid operations. For example, I don't let sqrt be called on a negative number. If sqrt(negative) happens, it could mean two things: a) bug (in this case, NaN is not useful) b) from a calculation, I got a slightly less than zero number (which would be impossible with "real" numbers, as mathematically the result >=0, but with floats it happens), and this is the input to sqrt. In this case, the solution is usually to use zero, not NaN. – geza Dec 11 '18 at 16:12
  • 1
    @geza: The issue is not whether NaN is useful (whether it provides some information about some value that was desired) but whether it is useful to have NaNs (whether they serve purposes in computation). The answer to the latter is yes, as has been mentioned elsewhere and above. Many people think in single-threaded single-step terms: If you have an invalid operation, you must do something about it. When working with arrays of millions of elements and doing millions of operations on them, no, you do not want to stop for each piddly exception. You want to process them in bulk. – Eric Postpischil Dec 11 '18 at 16:54
  • @EricPostpischil: I'm not debating whether it is useful to have NaNs or not. :) I'm just asking for an example, where one actually uses them as part of an algorithm. To see, why it is useful to have them. I understand, that in theory, they can be useful. But is there an actual, real world example you've encountered? (I mean, for example, "yes, I've implemented a large matrix inversion routine, and NaNs was useful, because "). I ask this, because in all my algorithms, I needed to actually avoid any operations resulting in NaN, because otherwise I got useless results. – geza Dec 11 '18 at 17:35
  • @geza: I am not an end-user of such routines; I do not write code that needs to have a transform performed, et cetera. I have written such routines. NaNs are critical just to be able to write such routines. When given some large operation to perform, it is not feasible to stop the routine in the middle and return to the caller to patch things up. It is simply standard practice to complete the operation and let the caller deal with NaNs as they need to for their application. Without NaNs, any such interface between different software would be excessively burdened. – Eric Postpischil Dec 11 '18 at 17:50
  • @EricPostpischil: okay, so you basically say that one use case of NaNs is to support GIGO (garbage in,garbage out): if the input is bad, and causes invalid operations, then the output will be garbage (NaN), instead of reporting the exact cause of the problem. I've never encountered such a case when this is useful, but I can accept that there are some use cases of this. – geza Dec 11 '18 at 18:07
  • 1
    @geza: No, the use of NaNs I described is not to support garbage-in, garbage-out. The use of NaNs I described is to **enable the bulk computing of the good results**. That is, it enables us to **ignore** the results we cannot handle (in one piece of software) so that we can process the other results in the rest of the array. The “garbage out” is not the desired information; the good data in the rest of the array is. Additionally, one piece of information the NaNs carry is where in the array the results are not good. – Eric Postpischil Dec 11 '18 at 18:14
  • 1
    To give a crude square-peg round-hole example, suppose you have a circular set of points from some sensor, but of course we like to structure our data in rectangular arrays. We might well fill a circle in the array with sensor data and put NaNs in the rest. Then we can go to town on processing the data, using bulk computing in various forms (GPU, SIMD, threads, whatever), happily ignoring the NaNs and processing all the good data. An alternative might put zeros outside the circle but that is wrong—we have **no data** there, not zeros. – Eric Postpischil Dec 11 '18 at 18:17
  • Additionally, if any filters are used that mix neighboring pixels, NaN propagation helpfully tracks the borders as they move. – Eric Postpischil Dec 11 '18 at 18:18
  • "Now, I believe the IEEE 754 designers were vastly more intelligent than me, which leads me to believe there HAS to be a reason for its existence. What is this reason?" Not exactly. IEEE 754 dates from a time (1985 afaik) when programming didn't look like it does nowadays. It was even before the C language was standardized (1989 afaik). Fancy stuff like try-catch error handling, user-friendly debuggers etc. was rare, if already invented at all. You may want to look at how C (and PHP too) returns false or -1 when a function call fails instead of throwing an error. – SOFe Dec 20 '18 at 08:31
  • In addition, floating point operations are performed at very low levels (probably in the CPU), and the CPU doesn't have exceptions (at least not the way you think about them, if performance is to be considered). The floating point unit of the CPU just takes two operands from two registers, computes the result and writes the result to another register. The register *has to contain something*. So a better question is to ask why it exists in programming languages. But NaN does not always come from division. It can come from string parsing. Or the developer used it intentionally to void the expr. – SOFe Dec 20 '18 at 08:34
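
Two ideas from the comments above, NaN as a "no data" marker and quiet propagation through bulk operations, can be sketched in a few lines of Python. This is only an illustration under assumptions: the grid, `NO_DATA`, `scale` and `mean_of_valid` are hypothetical names, and plain lists stand in for the GPU/SIMD arrays mentioned in the comments.

```python
import math

# Hypothetical sensor grid: NaN marks cells where there is no data at all
# (as opposed to a measured value of zero).
NO_DATA = math.nan
grid = [
    [NO_DATA, 1.0, NO_DATA],
    [2.0, 3.0, 4.0],
    [NO_DATA, 5.0, NO_DATA],
]


def scale(values, factor):
    """Apply the same operation to every cell; NaN cells simply stay NaN."""
    return [[v * factor for v in row] for row in values]


def mean_of_valid(values):
    """Average only the cells that hold real data."""
    valid = [v for row in values for v in row if not math.isnan(v)]
    return sum(valid) / len(valid)


# The bulk operation runs over the whole grid without stopping anywhere.
scaled = scale(grid, 2.0)

# Afterwards the NaNs still record exactly where there was no data.
nan_cells = [
    (r, c)
    for r, row in enumerate(scaled)
    for c, v in enumerate(row)
    if math.isnan(v)
]
```

The caller decides at the end what to do with the NaN cells, for example skipping them as `mean_of_valid` does, instead of the bulk routine having to stop in the middle.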

1 Answer

Whenever I write something about math, I am afraid of being knocked out with a metal bar by a real mathematician, but here we go, confronting our fears:

"Why does NaN exist at all, rather than resulting in an exception or error?"

Because it is neither an exception nor an error. It is a perfectly valid result for a calculation. There are several cases in mathematics where you get the equivalent of "NaN", i.e., something that cannot be measured. Think of calculating the intersection of two parallel lines. Or calculating the mass of a photon.
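
As a rough sketch of the parallel-lines case, assuming lines given in slope-intercept form (the function name `intersect` and its parameters are made up for this illustration), NaN can be returned as the honest answer when there is no unique intersection point:

```python
import math


def intersect(m1, b1, m2, b2):
    """Intersection of the lines y = m1*x + b1 and y = m2*x + b2.

    Returns (nan, nan) when the slopes are equal, because then there is
    no unique intersection point to report.
    """
    if m1 == m2:
        return (math.nan, math.nan)
    x = (b2 - b1) / (m1 - m2)
    return (x, m1 * x + b1)


crossing = intersect(1.0, 0.0, -1.0, 2.0)   # two lines that do cross
parallel = intersect(2.0, 1.0, 2.0, 5.0)    # same slope, different offsets
```

Here the NaN pair is not a sign of a bug in `intersect`; it is the answer to the question that was asked.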

In these cases, where you are going for "the math side of life" in your code (I can imagine this applies mainly to scientific software), we have the following situation:

  • They are not errors: the variables have the values they need to have, the calculations have been done, and this is the result. Sadly, it is not a real number (a complex one, maybe? Maybe an indeterminate form that can be resolved using other mathematical methods?), but you still have the answer to the calculation.
  • They are not exceptions: nothing is wrong with your code, this is not an anomalous condition, this is the answer: "NaN" (were you expecting 42?). There is no need to stop the flow of your program here: inform the user that the calculation has no solution or an indeterminate one, and let them be happy with it (note: I lie a bit, read the last paragraph).

You want the program to crash and die then and there so that you can easily find where it went wrong.

You are totally right: you may still handle it as an exception or as an error if NaN is not a valid result in the context of your program. Imagine calculating the necessary diameter of a column for a football stadium and getting NaN. Ugh... that would for sure be an error; I want to build that stadium, give me a diameter! You are also totally wrong: please don't tell me you will "easily" find where it went wrong just because you trigger an exception after NaN. I knew some people debugging meteorological models who would like to have a word with you (these equations are not fun).
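
A minimal sketch of that choice, where NaN is treated as an error because the program cannot use it. The function `column_diameter` and its simple load/stress formula are assumptions made up for this example, not real structural engineering:

```python
import math


def column_diameter(load_newtons, allowable_stress_pa):
    """Diameter of a round column whose cross-section carries the load.

    Hypothetical formula for illustration: area = load / stress,
    diameter = sqrt(4 * area / pi).
    """
    area = load_newtons / allowable_stress_pa
    diameter = math.sqrt(4.0 * area / math.pi)
    # In this program NaN is never an acceptable answer, so fail loudly
    # at the boundary instead of handing NaN to the caller.
    if math.isnan(diameter):
        raise ValueError("column diameter is NaN; check the inputs")
    return diameter
```

The check sits where the result leaves the calculation, so a NaN that crept in upstream (for instance from a missing measurement) is reported before anything is built on it.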

Nowadays there are many libraries and implementations that make opinionated decisions about what is an error, what is an exception, etc. The IEEE designers left the decision up to you. And language designers passed that power along to you. Use it wisely.
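
Python's built-in `float` is one example of such opinionated decisions, sketched below: it raises for `0.0 / 0.0` instead of returning NaN, yet lets other invalid operations such as infinity minus infinity quietly produce NaN. The helper `try_divide` is a name introduced only for this illustration.

```python
import math


def try_divide(a, b):
    """Divide a by b, reporting Python's exception instead of crashing."""
    try:
        return a / b
    except ZeroDivisionError:
        return "raised ZeroDivisionError"


# Python chooses to raise here rather than return NaN.
zero_over_zero = try_divide(0.0, 0.0)

# Other invalid operations quietly yield NaN.
inf_minus_inf = math.inf - math.inf

# NaN compares unequal to everything, including itself.
nan_is_unequal = inf_minus_inf != inf_minus_inf
```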

And if you have read this far, let me tell you I was lying a bit for the sake of simplification. The IEEE standard defines two kinds of NaNs, quiet ones and signaling ones. I have been talking about the quiet ones, the good guys. The signaling ones raise the invalid-operation exception when they are used as operands, which can trap in your software if traps are enabled.
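
Python's binary `float` gives no convenient access to signaling NaNs, so the sketch below uses the standard `decimal` module as a stand-in, on the assumption that its quiet/signaling distinction is close enough to illustrate the idea. It shows a quiet NaN passing through arithmetic, a signaling NaN raising when the trap is enabled, and the same operation continuing with a flag set when the trap is disabled.

```python
from decimal import Decimal, InvalidOperation, localcontext

quiet = Decimal("NaN")
signaling = Decimal("sNaN")

# Trap enabled: arithmetic on a signaling NaN raises.
with localcontext() as ctx:
    ctx.traps[InvalidOperation] = True
    quiet_result = quiet + 1          # quiet NaN just propagates
    try:
        signaling + 1
        signaling_trapped = False
    except InvalidOperation:
        signaling_trapped = True

# Trap disabled: the operation continues, the result is a quiet NaN,
# and the context records that an invalid operation happened.
with localcontext() as ctx:
    ctx.traps[InvalidOperation] = False
    ctx.clear_flags()
    untrapped_result = signaling + 1
    flag_was_set = bool(ctx.flags[InvalidOperation])
```

Whether the invalid operation stops the program or only sets a flag is a setting of the context, not a property of the NaN itself.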

Moreno