70

I was looking through some things in the source of java.lang.Math, and I noticed that while Math.min(int, int) (or its long counterpart) is implemented this way:

public static int min(int a, int b) {
   return a <= b ? a : b;
}

And this makes complete sense to me and it is the same as what I would do. However, the double/float implementation is this:

public static float min(float a, float b) {
   if (a != a) {
      return a;
   } else if (a == 0.0F && b == 0.0F && (long)Float.floatToRawIntBits(b) == negativeZeroFloatBits) {
      return b;
   } else {
      return a <= b ? a : b;
   }
}

I'm completely dumbfounded. Comparing a to itself? What's the second check even for? Why isn't it implemented in the same way as the int/long version?

Lino
  • 19,604
  • 6
  • 47
  • 65
Yonatan Avhar
  • 853
  • 7
  • 6
  • 51
    You do not seem to have read the documentation or at least you do not refer to it. While looking at code is nice, looking at documentation is important too. – NoDataDumpNoContribution May 06 '21 at 08:06
  • 28
    @Trilarion: while this behaviour is explained in the JavaDoc, it's very possible that it's the first time that many people will see `NaN` and `-0.0`. And if you don't know what they are and why they should be handled differently, then the JavaDoc doesn't really illuminate the issue. In other words: the JavaDoc is a good reference here, but not a good teacher. – Joachim Sauer May 06 '21 at 08:10
  • 2
    related: [What is the instruction that gives branchless FP min and max on x86?](https://stackoverflow.com/q/40196817) - the hardware instruction implements the simple ternary (like C++ `std::min`), not the always-NaN-propagating min(float, float) that's like C/C++ `fmin`). – Peter Cordes May 06 '21 at 19:07
  • 1
    @PeterCordes: that's a fascinating read. For anyone interested in what Java *actually executes*: I'm almost sure that every overload of `Math.min` is an intrinsic in OpenJDK and as thus will have dedicated native implementations (that likely will map to the correct hardware instructions, if available), but that's definitely outside the scope of this question. – Joachim Sauer May 06 '21 at 19:23
  • 4
    Related: [How can a Java variable be different from itself?](https://stackoverflow.com/q/19416644) and [How can a primitive float value be -0.0? What does that mean?](https://stackoverflow.com/q/6724031). Note that one variable is called "negativeZero...", from where one might reasonably try to Google what "negative zero" is (in Java or elsewhere). – Bernhard Barker May 06 '21 at 19:53
  • 5
    @Yonatan: Based on the code formatting, I suspect you may have been looking at the output of a decompiler, which would lack the comments (both javadocs and inline) that everyone else is referring to, rather than [the actual JDK source](https://github.com/openjdk/jdk/blob/jdk-16-ga/src/java.base/share/classes/java/lang/Math.java#L1610-L1635). If you are using an IDE, you should look up how to get the source attached properly so that you can see the JDK source code as it was originally written. – Miles May 07 '21 at 02:50

4 Answers4

103

Floating-point numbers are way more complicated than integer values.

For this specific case two distinctions are important:

  • NaN is a valid value for float and double which represents "not a number" and behaves weirdly. Namely, it doesn't compare equal to itself.
  • Floating point numbers can differentiate between 0.0 and -0.0. A negative zero could conceivably be useful when you're calculating the limit of some function. Distinguishing whether a limit approaches 0 from the positive or the negative direction could be beneficial.

So this part:

if (a != a) {
      return a;
}

ensures that NaN is returned if a is NaN (if a is not NaN, but b is, then the "normal" check later on will return b, i.e. NaN, so no explicit check is needed for this case). This is a common pattern: when calculating anything where one input is NaN, the output will also be NaN. Since NaN usually represents some error in the calculation (such as dividing 0 by 0), it's important that it "poisons" all further calculations to ensure the error isn't silently swallowed.

This part:

if (a == 0.0F && b == 0.0F && (long)Float.floatToRawIntBits(b) == negativeZeroFloatBits) {
      return b;
}

ensures that if you compare two zero-valued floating point numbers and b is negative zero then that negative zero is returned (since -0.0 is "smaller" than 0.0). Similarly to NaN the normal check will correctly return a if it's -0.0 and b is 0.0.

Joachim Sauer
  • 302,674
  • 57
  • 556
  • 614
  • 13
    +1 but nitpick: "some error in the calculation (such as dividing a value by 0)". 0/0 is NaN, x/0 for other numbers is usually +/- Infinity. – cyco130 May 06 '21 at 16:39
  • 15
    Might want to note that if it's `b` that's a NaN, the exact shape of the conditional in the last branch (`a <= b ? a : b`) matters. `b` has to be on the falsy side to be returned for a NaN there. Similarly if `a = -0`, and `b = +0`, the last conditional compares equal and returns `a`. Both together mean that only those two explicit tests are needed, not the full four. – ilkkachu May 06 '21 at 17:54
  • 2
    @EricDuminil: yes, it's a bit of an oversimplification, as ikkachu noted. I'll edit. – Joachim Sauer May 06 '21 at 19:12
  • 2
    I think `a == 0.0F && b == 0.0F` can be optimized to `a == b`. Maybe the idea is that `Float.floatToRawIntBits` is expensive, so they're trying to short-circuit it away in more cases? Perhaps `a==b && b == 0.0F`, then, to maybe avoid needing a constant in the common not-equal case. If that `==` is false, then on machines like x86, FLAGS will already be set according to an `a compare b` when the ternary is reached. (Which isn't helpful if you're going to use a branchless `minss` instruction for that...) – Peter Cordes May 07 '21 at 07:13
  • 1
    @PeterCordes: I don't think this question/answer should focus on the optimal implementation, but just explain *why* it's doing what it's doing. Besides, as I commented on the question itself: it's very likely that this code is never actually implemented, because there are intrinsics in place. – Joachim Sauer May 07 '21 at 07:24
  • s/never actually implemented/never actually executed/ in my previous comment. – Joachim Sauer May 07 '21 at 09:52
  • Not all floating-point formats support NaN. IEEE-754 does, and most systems nowadays typically support the IEEE-754 floating-point formats, but even today there are processors with their own floating-point formats, some of which do not support encoding NaNs. If you're curious about how floating-point number representations work, this might be a good place to start: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html – George May 07 '21 at 14:58
  • @George: all of that is true and interesting, but also irrelevant. This is about Java and that specifies that NaN exists and how it behaves. – Joachim Sauer May 07 '21 at 15:50
  • 1
    @PeterCordes: I suspect that `floatToRawIntBits` may be expensive on some platforms. If it weren't, I think one could simply convert `a` and `b` to `long` values `aa` and `bb`, and then compare `aa ^ ((aa >> 63) >>> 1)` and `bb ^ ((bb >> 63) >>> 1)` without any special handling for NaN, negative zero, etc. – supercat May 07 '21 at 19:01
  • @supercat: Interesting, that could almost be competitive with using SSE packed FP compare and blend to implement `fmin`, at least for 32-bit float. For `double`, you'd need SSE4.2 `pcmpgtq` to efficiently implement that sign-bit broadcast since x86 doesn't actually have XMM 64-bit arithmetic right shifts. Or simply SSE4.1 `blendvps` / `pd` to select based on a sign bit between `aa` and `aa ^ 0x7fff...`. Or of course a naive implementation would just `movd` or `movq` to an integer reg. Fast-ish on x86, but slower on legacy x87 fst/mov, and on ARM can stall the pipeline to mov NEON to int. – Peter Cordes May 07 '21 at 19:15
  • @Joachim Sauer: Not irrelevant at all. Java implements a subset of IEEE-754, so all the IEEE-754 parts are relevant, and the parts are about systems that don't have IEEE-754-compliant floating-point units and formats are relevant when interfacing with such systems, whether over the network or when calling native code. – George May 07 '21 at 20:26
  • @supercat No. The IEEE-754 standard requires NANs to compare unequal to everything, including themselves. As such, you absolutely need a special cases to handle NANs in both operands. You can use your bit fiddling for the rest. – cmaster - reinstate monica May 08 '21 at 15:47
  • @cmaster-reinstatemonica: The purpose of `Math.min` is to perform a non-broken comparison, which is why I was suggesting that comparing the representations as `Long`, after some adjustment, would avoid the need for the special cases associated with `NaN`. Since it's not using broken IEEE floating-point comparisons, it wouldn't need special cases to deal with that brokenness. – supercat May 08 '21 at 16:58
  • @cmaster-reinstatemonica: BTW, I wonder if anyone proposed defining things so that `NaN <= NaN` and `NaN >= NaN` would both be false, but `NaN == NaN` would be true, and also perhaps have `-0 < +0` and `-0 == +0` both be true. That would allow -0 and NaN to be identified using two normal comparisons, but reduce the number of situations where code would have to jump through hoops to deal with quirks in ordinary comparison operators. – supercat May 08 '21 at 17:04
  • @supercat Well, the implementation given in the question returns a `NAN` whenever one of its operands is NAN. For a `min()` implementation, that means that `NAN`s effectively sort lower than `-inf`. Since `NAN`s can have either 0 or 1 as their sign bit, and also have a nonzero mantissa, it's not that straight forward to massage all possible `NAN`s into a single block of values at the low end of the integer range. And, no, defining `-0.0`, `+/- inf`, `NAN`, and using them appropriately in arithmetic and comparison operations is not a bug, it's a feature. – cmaster - reinstate monica May 08 '21 at 17:37
  • @cmaster-reinstatemonica: I don't know what bit patterns Java implementations generate for operations that produce NaN. I think Java's canonical NaN has its sign bit set, and some implementations probably set it for all operations that generate NaN. As for bug vs feature, I'd regard the lack of any convenient means of testing equivalence as a bug. The only advantage the specified == behavior has over regarding any *particular* NaN as equal to itself (but allowing different operations that yield NaN to yield NaNs that may or may not compare equal to each other) is that... – supercat May 08 '21 at 20:16
  • Folks, I'm afraid this discussion is veerying very far from the original question and doesn't help get anywhere related to the question, maybe move it to chat? – Joachim Sauer May 08 '21 at 20:21
  • ...code may test for NaN using `x!=x` rather than `!(x<=x || x>=x)`, though there may have been better ways of specifying relational operators which would require a slightly different recipe. – supercat May 08 '21 at 20:21
40

I recommend carefully reading the documentation for Math.min and also the numeric comparison operators on floating points. Their behaviours are quite different.

Relevant parts from Math.min:

If either value is NaN, then the result is NaN. Unlike the numerical comparison operators, this method considers negative zero to be strictly smaller than positive zero.

and from JLS §15.20.1 "Numerical Comparison Operators <, <=, >, and >="

The result of a floating-point comparison, as determined by the specification of the IEEE 754 standard, is:

  • If either operand is NaN, then the result is false.

  • Positive zero and negative zero are considered equal.

If any argument is NaN, Math.min picks that one, but if any operand is NaN, <= evaluates to false. This is why it has to check if a not equal to itself - this would mean a is NaN. If a is not NaN but b is, the last case would cover it.

Math.min also considers -0.0 to be "less than" +0.0, but the numeric comparison operators think they are equal. This is the purpose of the second check.

Sweeper
  • 213,210
  • 22
  • 193
  • 313
  • 1
    JLS §15.21.1. Numerical Equality Operators == and !=: "the test x!=x is true if and only if the value of x is NaN." – hongsy May 09 '21 at 08:38
18

Just for completeness/clarity, let's draw up a table of all possible outcomes:

  • Either of a and b can be either

    • NaN,
    • −0,
    • 0 (i.e. +0), or
    • some other non-NaN non-zero value, marked as "(other)".

    Writing out all combinations of these for completeness, and distinguishing between positive and negative numbers for clarity in some cases, gives the 20 rows in the table below, though most of them are straightforward and unproblematic.

  • The column titled "Correct min" is the correct value that is supposed to be returned according to the IEEE 754 standard and Java documentation of Math.min, and the column titled "Naive min" is the value that would have been returned if Math.min had been implemented as return a <= b ? a : b; instead.

a b Correct min Naive min Notes on naive min Naive min wrong?
NaN NaN NaN NaN b, as NaN comparison gives false.
NaN −0 NaN −0 b, as NaN comparison gives false. Wrong
NaN 0 NaN 0 b, as NaN comparison gives false. Wrong
NaN (other) NaN (other) b, as NaN comparison gives false. Wrong
−0 NaN NaN NaN b, as NaN comparison gives false.
−0 −0 −0 −0 a, as −0 ≤ −0.
−0 0 −0 −0 a, as −0 ≤ 0.
−0 (other>0) −0 −0
−0 (other<0) (other<0) (other<0)
0 NaN NaN NaN b, as NaN comparison gives false.
0 −0 −0 0 a, as "0 ≤ −0" per IEEE 754. Wrong
0 0 0 0 a, as 0 ≤ 0.
0 (other>0) 0 0
0 (other<0) (other<0) (other<0)
(other) NaN NaN NaN b, as NaN comparison gives false.
(other<0) −0 (other<0) (other<0)
(other>0) −0 −0 −0
(other<0) 0 (other<0) (other<0)
(other>0) 0 0 0
(other) (other) (other) (other)

[The "(other)" in the last row for "Correct min" and "Naive min" means the correct minimum, in the straightforward sense without any confusion because of NaN or −0.]

So you see there are four rows in the table above in which the naive function would give a wrong answer:

  • three of them are the case when a is NaN, but b is not. This is what the first check in the function is for.

  • the other is the case where Math.min(0, -0) is documented by Java as returning −0, even though IEEE 754 treats 0 and −0 as equal for comparison (and therefore the comparison "0 ≤ −0" evaluates as true). This is what the second check in the function is for.

user3840170
  • 26,597
  • 4
  • 30
  • 62
ShreevatsaR
  • 38,402
  • 17
  • 102
  • 126
  • 1
    (This question may be simple enough to reason about without drawing up such an elaborate table, but writing [such tables](https://hillelwayne.com/decision-tables/) can be good practice and can sometimes help clarify one's thinking.) – ShreevatsaR May 07 '21 at 19:28
11

I can help you on the first comparison if (a != a). This obviously only looks at a, so in which cases might a be the minimum regardless of b?

float numbers differ from int by having special values, for example NAN. And one special property of NAN is that a comparison is always false. So the first condition returns a if every comparison operator returns false on a.

The same condition for b can be found in the last line. If a comparison on b always returns false, the last line always returns b.

On the second condition I can only guess that this is related to "negative zero" and "positive zero", another two special values of float. And of course, a negative zero is smaller than a positive zero.

Mathias
  • 282
  • 1
  • 16