I do not understand how the compiler handles the following code: it outputs test, while I was expecting an error.

List<Integer> b = new ArrayList<Integer>();
List a = b;
a.add("test");
System.out.println(b.get(0));

I was hoping someone could tell me the exact steps that are taken when this code is compiled and then run, so I can understand the output. My current understanding is that:

  1. The compiler checks at compile time whether the List class has an add method that supports the argument type. It finds add(Object e), because a is raw-typed.
  2. However, at runtime it tries to invoke add(Object e) on the actual object, a List<Integer>, which doesn't have this method: the actual object is not raw-typed and instead has the method add(Integer e).

If there is no add(Object e) method in the actual object List<Integer>, how does it still somehow add a String to the List of Integers?

John
  • The list does not perform a typecheck. After all this is just an array of references, so yes, it will work, but it is unsafe! – Willem Van Onsem Oct 06 '18 at 16:32
  • The list `a` is raw, meaning that it stores `Object` references. As such, you can add any Java class to it, since everything implicitly extends `Object`. The error would come into play if you tried adding a `String` to the `b` list. This would fail at compile time, as generics would kick in to prevent this from happening. – Tim Biegeleisen Oct 06 '18 at 16:34
  • Nice question, you're quite close. Your item 1 is spot on, but Java has something called [type erasure](https://www.baeldung.com/java-type-erasure) so your item 2 does not actually apply. – Ray Toal Oct 06 '18 at 16:34
  • *If there is no add(Object e) method in the actual object List<Integer>* That's where you're mistaken. It's a `List<Integer>`, at runtime it's a `List` so there **is** an `add(Object)`. The compiler inserts a type-cast, and that is what will fail. – Elliott Frisch Oct 06 '18 at 16:36
  • Generics are a compile-time check and by doing `List a = b` you are turning off that compile-time check. – Peter Lawrey Oct 06 '18 at 17:23
  • same course as https://stackoverflow.com/q/52672926/85421 ? – user85421 Oct 06 '18 at 17:38
  • See: [What is a raw type and why shouldn't we use it?](https://stackoverflow.com/questions/2770321/what-is-a-raw-type-and-why-shouldnt-we-use-it) – Jesper Oct 06 '18 at 18:32

2 Answers

You are quite close. The compile time checks all pan out:

a is of type List so the call

a.add("test");

pans out. b is of (compile-time) type List<Integer> so

b.get(0)

checks out as well. Note that the checks are made only against the compile-time types of the variables. When the compiler sees a.add("test") it does not know the run-time type of the object referenced by variable a. In general, it really can't (by Rice's theorem, such properties of arbitrary programs are undecidable), though control-flow type analysis can catch many such things. Languages like TypeScript can do amazing things at compile time.

Now you might assume that at run-time such things could be checked. Alas, in Java they cannot. Java erases generic types. Find an article on Java type erasure for the gory details. The TL;DR is that a List<Integer> at compile time becomes a raw List at run time. The JVM did not have a way to "reify" generics (though other languages do!) so when generics were introduced, the decision was made that Java would just erase the generic types. So at run time, there is no type problem in your code.
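To make the erasure point concrete, here is a minimal sketch. The class and method names (ErasureDemo, sameRuntimeClass) are made up for illustration. It only shows that two lists with different type arguments report the same runtime class, which is what you would expect if the type arguments are gone by run time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class ErasureDemo {
    // Returns true when both lists share one runtime class,
    // i.e. the type arguments left no trace on the objects.
    static boolean sameRuntimeClass() {
        List<Integer> ints = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        return ints.getClass() == strings.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
        // The runtime class name carries no type argument.
        System.out.println(new ArrayList<Integer>().getClass().getName()); // prints "java.util.ArrayList"
    }
}
```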

Let's take a look at the compiled code:

   0: new           #2                  // class java/util/ArrayList
   3: dup
   4: invokespecial #3                  // Method java/util/ArrayList."<init>":()V
   7: astore_1
   8: aload_1
   9: astore_2
  10: aload_2
  11: ldc           #4                  // String test
  13: invokeinterface #5,  2            // InterfaceMethod java/util/List.add:(Ljava/lang/Object;)Z
  18: pop
  19: getstatic     #6                  // Field java/lang/System.out:Ljava/io/PrintStream;
  22: aload_1
  23: iconst_0
  24: invokeinterface #7,  2            // InterfaceMethod java/util/List.get:(I)Ljava/lang/Object;
  29: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/Object;)V
  32: return

Here you can see directly that there are no run-time type checks. So, the complete (but seemingly flippant) answer to your question is that Java only checks types at compile time based on the types of the variables (known at compile time), but generic type parameters are erased and the code is run without them.
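If you do want the add to be rejected at run time, one option is to give the list an explicit class token, since the erased ArrayList has nothing to check against. The sketch below uses the standard Collections.checkedList wrapper. The class and helper names (CheckedListDemo, rejectsString) are assumptions made for this example, not something from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class CheckedListDemo {
    // Tries to sneak a String into a List<Integer> through a raw alias.
    // Returns true if the add was rejected at run time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rejectsString(List<Integer> target) {
        List raw = target; // same trick as "List a = b" in the question
        try {
            raw.add("test");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> plain = new ArrayList<>();
        // checkedList is told the element class explicitly, so it can check every add.
        List<Integer> checked = Collections.checkedList(new ArrayList<>(), Integer.class);
        System.out.println(rejectsString(plain));   // prints "false"
        System.out.println(rejectsString(checked)); // prints "true"
    }
}
```

The wrapper pays for the check on every insertion, so it is probably most useful as a debugging aid for tracking down where a wrongly typed element enters a collection.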

Ray Toal
  • Thank you for your comment. I still don't 100% understand it due to the fact that I need to take a look at the way the Java compiler handles objects compile time and run time. Once I understand this fully, I bet your post will make a lot more sense to me. The type erasure is also something that will help a lot. – John Oct 06 '18 at 18:31
  • Agreed that it is non-intuitive. The most important part to understand is that when the compiler sees `a.add("test")` the compiler only asks _what is the declared type of a?_ Because `a` is declared as a `List`, the compiler says this is okay. Even though the _object_ that `a` references is a `List<Integer>`, the compiler does not in general know that. It only typechecks expressions based on the declared type, not on what values will be there at runtime. "Type erasure" explains why the program runs, but as to why it compiles, the reason is `a` is _declared_ with type `List` only. Happy studying! – Ray Toal Oct 06 '18 at 21:24

The surprise here is that b.get(0) does not have a runtime check. We'd expect the code to be interpreted by the compiler to mean something like:

System.out.println((Integer)b.get(0)); // throws CCE

Indeed if we were to try:

Integer str = b.get(0); // throws CCE

we'd get a runtime ClassCastException.
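The contrast between the two target types can be sketched as a small program. The class and method names (PollutedListDemo, polluted, readAsObject, readAsIntegerThrows) are invented for this illustration. The sketch assumes the usual javac behaviour of emitting a cast only where the erased target type requires one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class PollutedListDemo {
    // Builds a List<Integer> that actually holds a String, via a raw alias.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List<Integer> polluted() {
        List<Integer> b = new ArrayList<>();
        List a = b;
        a.add("test");
        return b;
    }

    // The target type is Object, so no cast is needed and the String comes out.
    static String readAsObject() {
        Object o = polluted().get(0);
        return String.valueOf(o);
    }

    // The target type is Integer, so the compiler-inserted cast fails.
    static boolean readAsIntegerThrows() {
        try {
            Integer i = polluted().get(0);
            return i == null; // not reached: the line above throws
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readAsObject());        // prints "test"
        System.out.println(readAsIntegerThrows()); // prints "true"
    }
}
```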

Indeed we'd even get the same error by unboxing the value, so that the println(int) overload is selected in place of println(Object):

System.out.println((int) b.get(0)); // throws CCE

How does that make any sense?

It's a mistake that can't be fixed because of backward compatibility. If the target context allows the checkcast to be removed, then it is elided despite changing the semantics. In this case the call resolves to println(Object), which accepts any reference, so no cast is emitted. Worse than this, there is an overload println(char[]) which has different behaviour!

Anyway, don't use raw or rare types, don't overload to change behaviour (or overload at all if you can manage it) and take real care before committing an optimisation to an irreparable spec.

Tom Hawtin - tackline