
One of my students is getting a null pointer exception when using a ternary operator which sometimes results in null. I think I understand the issue, but it seems to result from inconsistent type inference. Or, to put it another way, I feel like the semantics here are inconsistent and the error should be avoidable without changing his approach.

This question is similar to, but different from Another question about ternary operators. In that question, the null Integer must be forced to an int because the return value of the function is an int. However, that is not the case in my student's code.

This code runs fine:

Integer x = (5>7) ? 3 : null;

The value of x is null. No NPE. In this case, the compiler can figure out that the result of the ternary operator needs to be Integer, so it boxes the 3 (an int) to an Integer rather than unboxing null to int.

However, running this code:

Integer x = (5>7) ? 3 : (5 > 8) ? 4 : null;

results in an NPE. The only reason that would happen is that the null gets unboxed to an int, but that's not really necessary and seems inconsistent with the first bit of code. That is, if the compiler can deduce for the first snippet that the result of the ternary operator is an Integer, why can't it do that in the second case? The result of the second ternary expression must be an Integer, and since that result is the second result for the first ternary operator, the result of the first ternary operator should also be an Integer.

Another snippet works fine:

Integer three = 3;

Integer x = (5>7) ? three : (5 > 8) ? three+1 : null;

Here, the compiler seems to be able to deduce that the result of both ternary operators is an Integer and so doesn't force the null to be unboxed to int.

David Stigant
  • Null pointer access: This expression of type Integer is null but requires auto-unboxing – Pinkie Swirl Oct 16 '15 at 18:32
  • See also these questions: [Java autoboxing and ternary madness](http://stackoverflow.com/questions/25417438/java-autoboxing-and-ternary-operator-madness) and [NPE through auto-boxing of Java ternary](http://stackoverflow.com/questions/12763983/nullpointerexception-through-auto-boxing-behavior-of-java-ternary-operator). – Andy Thomas Oct 16 '15 at 19:03

3 Answers

13

The key to this is that the conditional operator is right associative. The rules for determining the result type of a conditional expression (JLS §15.25) are hideously complicated, but it boils down to this:

  1. First, the type of (5 > 8) ? 4 : null is determined. The 2nd operand is an int and the 3rd is null; looking these up in the JLS §15.25 tables gives a result type of Integer. (In other words: because one of the operands is null, this is treated as a reference conditional expression.)
  2. Then we have (5>7) ? 3 : <previous result> to evaluate, so we need to look up the result type for a 2nd operand of int and a 3rd operand of Integer: it's int. This means that <previous result> needs to be unboxed, which fails with an NPE.

So why does the first case work then?

There we've got (5>7) ? 3 : null. As we've seen, if the 2nd operand is int and the 3rd is null, the result type is Integer. And since we assign it to a variable of type Integer, no unboxing is required.

However, this only happens with the null literal. The following code will still throw an NPE, because the operand types int and Integer result in a numeric conditional expression:

Integer i = null;
Integer x = (5>7) ? 3 : i;

To sum it up: there is a kind of logic to it but it isn't human logic.

  1. If both operands are of compile-time type Integer, the result is an Integer.
  2. If one operand is of type int and the other is Integer, the result is an int.
  3. If one operand is of null type (of which the only valid value is the null reference), the result is an Integer.
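
The three rules above can be checked with a minimal sketch. The class and method names here are made up for illustration, and the conditions are passed as parameters rather than written as constants like 5>7; each method isolates exactly one combination of operand types:

```java
import java.util.function.Supplier;

// Hypothetical demo class; none of these names come from the question.
final class ConditionalTypeDemo {

    // Rule 1: Integer and Integer -> Integer. Nothing is unboxed.
    static Integer integerAndInteger(boolean cond, Integer a, Integer b) {
        return cond ? a : b;
    }

    // Rule 2: int and Integer -> int. The Integer operand is unboxed
    // when it is the one selected, so a null there throws an NPE.
    static Integer intAndInteger(boolean cond, Integer b) {
        return cond ? 3 : b;
    }

    // Rule 3: int and the null literal -> Integer. The 3 is boxed instead.
    static Integer intAndNullLiteral(boolean cond) {
        return cond ? 3 : null;
    }

    // Helper: reports whether evaluating the expression throws an NPE.
    static boolean throwsNpe(Supplier<Integer> expr) {
        try {
            expr.get();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(integerAndInteger(false, 3, null));
        System.out.println(throwsNpe(() -> intAndInteger(false, null)));
        System.out.println(intAndNullLiteral(false));
    }
}
```

Only the second method can fail: its result type is int, so the selected Integer operand has to be unboxed before the result is boxed again for the return.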
biziclop
  • Nice summary. Any idea why those tables in the JLS specify that `? int : Integer` and `? Integer : int` map to int rather than Integer? – Andy Thomas Oct 16 '15 at 18:43
  • @AndyThomas One can only guess but I think the idea was that most of the time you are going to assign the result to an `int` variable anyway, and creating an extra `Integer` instance for the result just to discard it straight away was seen as too expensive. But this really is no more than a guess, there could be other, more valid reasons. – biziclop Oct 16 '15 at 18:46
  • Ok, why is `<previous result>` an int, but in the first example, which has the same form, the answer is Integer? – David Stigant Oct 16 '15 at 18:46
  • @DavidStigant `<previous result>` is an `Integer` in both cases, that's what causes the problem. – biziclop Oct 16 '15 at 18:49
  • But there's only a problem in the second case, not the first one. Or, only the second case results in a NPE, not the first one. – David Stigant Oct 16 '15 at 18:52
  • @DavidStigant Okay, I'll add some more explanation for the first case. – biziclop Oct 16 '15 at 18:54
  • @biziclop - Thank you for speculating. Seems like this surprising behavior could have been avoided. – Andy Thomas Oct 16 '15 at 19:05
  • @AndyThomas Maybe but this is a complicated construct and it's easy to overlook a crucial detail that would've made the boxing solution impossible or infeasible. – biziclop Oct 16 '15 at 19:17
  • Ok, I think I get what's happening, but I still don't understand why the table/rule is that way... It seems like the table result should be to go to the most general type (in this case, Integer vs int ==> Integer) because that is provably error free while the other option is error prone. If there's an efficiency issue, shouldn't the default be to do the non-error thing by default and let the efficiency nuts bend over backwards to get their performance? – David Stigant Oct 16 '15 at 19:21
  • @DavidStigant I have to admit that I don't understand them either. :) They look quite crazy and in certain edge cases very non-intuitive. It may be the result of having to update them several times, first to accommodate boxing/unboxing, then again for lambdas and poly expressions. (And right from the start there had already been the matter of implicit conversion between primitive types.) – biziclop Oct 16 '15 at 19:33
1
int x = (int) ((5>7) ? 3 : null);

results in an NPE, since null is cast (unboxed) to int.

Same with your code

Integer x = (5>7) ? 3 : (5 > 8) ? 4 : null;

looks like this:

Integer x = (Integer) ((5>7) ? (int) 3 : (5 > 8) ? 4 : null);

As you can see, null is again cast to int, which fails.

To solve this one could do this:

Integer x = (Integer) ((5>7) ? (Integer) 3 : (5 > 8) ? 4 : null);

Now it does not try to cast null to int, but to Integer.
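
As a rough sketch of the same idea without casts (the class and method names are hypothetical, and boolean parameters stand in for the constant conditions), one could box the int operands explicitly with Integer.valueOf, or sidestep the conditional operator altogether:

```java
// Hypothetical helper class illustrating two ways to avoid the unboxing.
final class NestedConditionalFix {

    // Every non-null operand is already an Integer, so both conditionals
    // have type Integer and the null is never unboxed.
    static Integer boxedOperands(boolean first, boolean second) {
        return first ? Integer.valueOf(3) : second ? Integer.valueOf(4) : null;
    }

    // Plain if/else: each return is boxed on its own, with no operand
    // type rules involved.
    static Integer withIfElse(boolean first, boolean second) {
        if (first) {
            return 3;
        }
        if (second) {
            return 4;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(boxedOperands(false, false));
        System.out.println(withIfElse(false, false));
    }
}
```

Both variants return null when neither condition holds, which is what the original assignment was meant to do.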

Pinkie Swirl
1

Before Java 8, in almost all cases, the type of an expression was built bottom-up, depending entirely on the types of its sub-expressions and not on the context. This is nice and simple, and the code is easy to understand; for example, overload resolution depends on the types of the arguments, which are resolved independently of the method invocation context.

(The only exception that I know of is JLS §15.12.2.8.)

Given a conditional expression of the form ?int:Integer, the spec needs to define a fixed type for it without regard to the context. The int type was chosen, which is presumably better in most use cases. Of course, it is also the source of the NPE from unboxing.


In Java 8, contextual type information may be used in type inference. This is convenient in many cases, but it also introduces confusion, since there may be two directions in which to resolve the type of an expression. Luckily, some expressions are still stand-alone; their types are context-independent.

With respect to conditional expressions, we don't want simple ones like false?0:1 to be context-dependent; their types are self-evident. On the other hand, we do want contextual type inference on more complicated conditional expressions, like false?f():g() where f() and g() require type inference.
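
To illustrate the two kinds, here is a small sketch assuming Java 8 or later (the class and method names are invented for this example). The first method relies on the target type being pushed down into both operands; the second is a plain numeric conditional whose type is int no matter where it appears:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; assumes Java 8 or later.
final class ContextualConditionalDemo {

    // Reference conditional: the return type List<String> is applied to
    // both operands, so emptyList() and the diamond are inferred as
    // List<String> and ArrayList<String> respectively.
    static List<String> names(boolean empty) {
        return empty ? Collections.emptyList() : new ArrayList<>();
    }

    // Numeric conditional: stand-alone, its type is int regardless of context.
    static int pick(boolean cond) {
        return cond ? 0 : 1;
    }

    public static void main(String[] args) {
        System.out.println(names(true));
        System.out.println(pick(false));
    }
}
```

Under the bottom-up rules that applied before Java 8, the first method would not compile without explicit type arguments, because each operand would be typed without knowledge of the return type.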

The line was drawn between primitive and reference types. In op1?op2:op3, if both op2 and op3 are "clearly" primitive types (or their boxed versions), the expression is treated as stand-alone. Quoting Dan Smith -

We classify conditional expressions here in order to enhance the typing rules of reference conditionals (15.25.3) while preserving existing behavior of boolean and numeric conditionals. If we tried to treat all conditionals uniformly, there would be a variety of unwanted incompatible changes, including changes in overload resolution and boxing/unboxing behavior.

In your case

Integer x = false ? 3 : false ? 4 : null;

since false?4:null is "clearly"(?) an Integer, the parent expression is of the form ?int:Integer; this is a primitive case, and its behavior is kept compatible with Java 7, hence the NPE.


I put quotes on "clearly" because that's my intuitive understanding; I'm not sure about the formal spec. Let's look at this example

static <T> T f1(){ return null; }
--

Integer x = false ? 3 : false ? f1() : null;

It compiles, and there is no NPE at runtime! I don't know how to follow the spec in this case. I can imagine that the compiler probably does the following steps:

1) sub-expression false?f1():null is not "clearly" a (boxed)primitive type; its type is unknown yet

2) therefore, the parent expression is classified as "reference conditional expression", which appears in an assignment context.

3) the target type Integer is applied to the operands, and eventually to f1(), which is then inferred to return Integer

4) however, we cannot go back now to re-classify the conditional expression as ?int:Integer.


That sounds reasonable. However, what if we explicitly specify the type argument to f1()?

Integer x = false ? 3 : false ? Test.<Integer>f1() : null;

Theory (A) - this should not alter the semantics of the program, because it's the same type argument that would have been inferred. We should not see NPE at runtime.

Theory (B) - there is no type inference; the type of sub-expression is clearly Integer, therefore this should be classified as primitive case, and we should see NPE at runtime.

I believe in (B); however, javac(8u60) does (A). I don't understand why.


Pushing this observation to a hilarious level

    class MyList1 extends ArrayList<Integer>
    {
        //inherit public Integer get(int index)
    }

    class MyList2 extends ArrayList<Integer>
    {
        @Override public Integer get(int index)
        {
            return super.get(0);
        }
    }

    MyList1 myList1 = new MyList1();
    MyList2 myList2 = new MyList2();

    Integer x1 = false ? 3 : false ? myList1.get(0) : null;   // no NPE
    Integer x2 = false ? 3 : false ? myList2.get(0) : null;   //    NPE !!!

That doesn't make any sense; something really funky is going on inside javac.

(see also Java autoboxing and ternary operator madness)

ZhongYu
  • lesson learned - Always use explicit, identical types in `op2` and `op3`. Never mix `int` and `Integer` here; do manual explicit boxing/unboxing if necessary. Never mix `int` and `long` etc here; do manual primitive type conversion if necessary. This part of spec is too unreliable. – ZhongYu Oct 17 '15 at 02:32
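
A minimal sketch of why mixing numeric types in op2 and op3 is risky (the class and method names are illustrative only): with an Integer and a Long operand, the conditional has type long, so even the Integer branch comes back as a Long.

```java
// Hypothetical demo class for the "never mix types" advice.
final class MixedNumericConditional {

    // Integer and Long operands: binary numeric promotion makes the
    // conditional's type long, so the selected Integer is unboxed,
    // widened, and then boxed as a Long.
    static Object mixed(boolean cond) {
        Integer i = 1;
        Long l = 2L;
        return cond ? i : l;
    }

    // Identical operand types: the selected operand is returned as it is.
    static Object identical(boolean cond) {
        Long a = 1L;
        Long b = 2L;
        return cond ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(mixed(true).getClass().getSimpleName());
        System.out.println(identical(true).getClass().getSimpleName());
    }
}
```

In the mixed case a null in either variable would also throw an NPE when that operand is selected, because of the unboxing step; with identical types nothing is unboxed.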