
I've been studying generics lately, and today I ran into something that puzzles me.

Let's consider the following dummy class:

    public class Main {

        public static void main(String[] args) {
            Container<Integer> c = new Container<Integer>();

            c.getArray();                      // No exception
            //c.getArray().getClass();         // Exception
            //int a = c.getArray().length;     // Exception
        }
    }


    class Container<T> {

        T[] array;

        @SuppressWarnings("unchecked")
        Container() {
            array = (T[]) new Object[1];
        }

        void put(T item) {
            array[0] = item;
        }

        T get() { return array[0]; }

        T[] getArray() { return array; }
    }

Because of erasure, at runtime the T[] return type of the getArray() method is turned into an Object[], which seems completely reasonable to me.

If we just call that method (c.getArray()), no exception is thrown. But if we call a method on the returned array, for example c.getArray().getClass(), or access a field, for example c.getArray().length, the following exception is thrown:

Exception in thread "main" java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Ljava.lang.Integer;

Why is this exception thrown? Why is it not also thrown for the plain c.getArray() call? Why is it trying to cast to Integer[] if we are simply calling getClass() or accessing length? Aren't getClass() and length also available on Object[]?

Thanks in advance for your many (I hope) and explanatory (I hope this too) answers.

acejazz
  • In order to dereference the value of `c.getArray()`, a reference to it has to be stored on the stack temporarily. I can imagine that JLS says - somewhere, still looking - that this temporary variable has to be checked to see if it is the unerased type (since you know the unerased type there). – Andy Turner Jul 08 '16 at 13:09
  • Indeed, it works if you do the dereferencing in a method like `static void foo(Container c) { c.getArray().getClass(); }` – Andy Turner Jul 08 '16 at 13:11
  • There is an interesting difference in the bytecode if you change `<Integer>` to `<Object>`: with `Integer`, there is a `checkcast` instruction (the reason for the `ClassCastException`); with `Object`, no `checkcast` instruction is added. Understandable, since all `T[]` can be cast to `Object[]`; just faintly surprising that it results in different bytecode. – Andy Turner Jul 08 '16 at 13:18
  • @Matsemann because I don't actually have an answer :) I'm just making some observations. – Andy Turner Jul 08 '16 at 13:35
  • The cast to `Integer[]` at the call site of `getArray()` is no different, linguistically, than the cast to `Integer` at the call site of `get()`. (This, BTW, is why `Collection.toArray()` returns `Object[]` rather than `T[]` -- it doesn't want to lie to you and say it's returning an `Integer[]` when really it's returning an `Object[]`.) – Brian Goetz Jul 08 '16 at 17:03

3 Answers


The reason for the exception is that the compiler expects an Integer[] but receives an Object[]. It added a run-time cast at the call sites of getArray. Those casts exposed the lying, dummy, no-effect cast in your constructor.

For it to be correct, one needs the actual class of T, in order to create instances.

// needs: import java.lang.reflect.Array;
@SuppressWarnings("unchecked")
Container(Class<T> type) {
    array = (T[]) Array.newInstance(type, 10);
}


    Container<Integer> c = new Container<Integer>(Integer.class); 

    c.getArray();
    Class<?> t = c.getArray().getClass();
    System.out.println(t.getName());
    int a = c.getArray().length;

An "unchecked" cast to T[] remains here too, but it is unavoidable: Array.newInstance is a low-level method that returns Object, because it also has to handle n-dimensional arrays, as in:

(double[][][][][][]) Array.newInstance(double.class, 3, 3, 3, 3, 3, 6);

Joop Eggen
  • But I cannot understand why it doesn't complain for c.getArray() but it does for c.getArray().getClass(). What drives me crazy is that c.getArray() is fine while c.getArray().getClass() is not. Why is the compiler complaining only for the second one? In case of c.getArray() the non working cast should happen too. – acejazz Jul 08 '16 at 16:00
  • You do not assign the result of c.getArray(). – Joop Eggen Jul 10 '16 at 07:32

I have not been able to find the exact place in the JLS which says this is the behaviour, but I think the reason is something like this:

The expression:

c.getArray().getClass();

is roughly equivalent to:

Integer[] arr = (Integer[]) c.getArray();
arr.getClass();

where the cast has to be added because of type erasure. This implicit cast adds a checkcast instruction to the bytecode, which fails with a ClassCastException, since the array returned by c.getArray() is actually an Object[] at runtime.

Looking at the bytecode for:

static void implicit() {
  Container<Integer> c = new Container<Integer>();
  c.getArray().getClass(); //Exception
}

static void explicit() {
  Container<Integer> c = new Container<Integer>();
  Integer[] arr = (Integer[]) c.getArray();
  arr.getClass(); //Exception
}

we get:

  static void implicit();
    Code:
       0: new           #2                  // class Container
       3: dup
       4: invokespecial #3                  // Method Container."<init>":()V
       7: astore_0
       8: aload_0
       9: invokevirtual #4                  // Method Container.getArray:()[Ljava/lang/Object;
      12: checkcast     #5                  // class "[Ljava/lang/Integer;"
      15: invokevirtual #6                  // Method java/lang/Object.getClass:()Ljava/lang/Class;
      18: pop
      19: return

  static void explicit();
    Code:
       0: new           #2                  // class Container
       3: dup
       4: invokespecial #3                  // Method Container."<init>":()V
       7: astore_0
       8: aload_0
       9: invokevirtual #4                  // Method Container.getArray:()[Ljava/lang/Object;
      12: checkcast     #5                  // class "[Ljava/lang/Integer;"
      15: checkcast     #5                  // class "[Ljava/lang/Integer;"
      18: astore_1
      19: aload_1
      20: invokevirtual #6                  // Method java/lang/Object.getClass:()Ljava/lang/Class;
      23: pop
      24: return

So the only difference in the explicit version is these three extra instructions:

      15: checkcast     #5                  // class "[Ljava/lang/Integer;"
      18: astore_1
      19: aload_1

Which are there only because of explicitly storing this in a variable, as far as I understand it.
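One way to probe this further is to use the result only where an Object or an Object[] is expected. The sketch below tries three such call sites. Treat it as an experiment rather than a specification: `NoCastDemo`, `describe` and the `via...` methods are made-up names, and the behaviour is what I would expect from javac, not something I have found guaranteed in the JLS.

```java
// Same Container as in the question, trimmed to what the demo needs.
class Container<T> {
    T[] array;

    @SuppressWarnings("unchecked")
    Container() {
        array = (T[]) new Object[1]; // unchecked: the array is really an Object[]
    }

    T[] getArray() { return array; }
}

// Hypothetical demo class; none of these names come from the question.
class NoCastDemo {

    // The parameter type is Object, so the argument does not need to be an Integer[].
    static String describe(Object o) {
        return o.getClass().getName();
    }

    static String viaObjectParameter() {
        Container<Integer> c = new Container<Integer>();
        return describe(c.getArray());
    }

    // Through the raw type, getArray() is typed as Object[] at compile time.
    @SuppressWarnings("rawtypes")
    static Class<?> viaRawType() {
        Container c = new Container<Integer>();
        return c.getArray().getClass();
    }

    // The variable's type is Object[], which the erased return type already satisfies.
    static int viaObjectArrayVariable() {
        Container<Integer> c = new Container<Integer>();
        Object[] arr = c.getArray();
        return arr.length;
    }

    public static void main(String[] args) {
        System.out.println(viaObjectParameter());
        System.out.println(viaRawType());
        System.out.println(viaObjectArrayVariable());
    }
}
```

With javac I would expect all three to finish without a ClassCastException, which fits the idea that the cast is only emitted where the call site needs the value to be an Integer[].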

Andy Turner
  • The wrong cast happens in `Integer[] arr = (Integer[]) c.getArray();`, right? So, if `c.getArray().getClass()` fails, `c.getArray()` should fail similarly because the cast is performed before the `getClass()` invocation. – acejazz Jul 08 '16 at 16:05
  • No, I don't think so - it's the dereferencing of `c.getArray()` which seems to cause the issue. For example, `System.out.println(c.getArray())` works fine; but that's invoking the `System.out.println(Object)` overload, which doesn't require the cast. – Andy Turner Jul 08 '16 at 16:09
  • That's interesting, thanks. If `System.out.println(c.getArray())` works fine because it is using Object, wouldn't the same hold for `c.getArray().getClass()`? I mean, `getClass()` is an Object method, so it should work anyway. Where am I wrong? – acejazz Jul 08 '16 at 16:15

When you do an unsafe unchecked cast, it may or may not cause an exception somewhere. You are not guaranteed to get an exception somewhere.

In this case, whether you get an exception depends on whether the compiler inserted a cast in the erased code to cast the result of the call to Integer[]. In this case, it seems a cast was inserted in the second and third case but not the first case.

In each of the three cases, the compiler is allowed to insert a cast (because it is allowed to assume that the result is Integer[]) or not insert a cast (because in all three the expression is used in a way that only requires Object[]). Whether to insert a cast or not is up to the particular compiler implementation to decide.

Why would this compiler not insert a cast in the first case and insert a cast in the second and third cases? One obvious explanation would be that in the first case, the result is obviously unused, so it is very simple to determine that a cast is unnecessary. In the second and third cases, determining that a cast is unnecessary would require looking at how the expression is used to see that it will also work with Object[], and this is a rather complicated analysis. The compiler authors probably opted for a simple approach where they skip the cast only when the result is unused.

Another compiler might insert casts in all three cases. And another compiler might have no casts in all three cases. You cannot rely on it.
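If you want to check what your own compiler does, a small experiment like the following can report which of the three call sites from the question actually throws. It is only a sketch: `CastSiteDemo`, `throwsCce` and the three case methods are made-up names, and the outcome is a property of the compiler you build with, not of the language.

```java
// Same Container as in the question, trimmed to what the demo needs.
class Container<T> {
    T[] array;

    @SuppressWarnings("unchecked")
    Container() {
        array = (T[]) new Object[1]; // unchecked: the array is really an Object[]
    }

    T[] getArray() { return array; }
}

// Hypothetical demo class; none of these names come from the question.
class CastSiteDemo {

    // Runs the action and reports whether it threw a ClassCastException.
    static boolean throwsCce(Runnable action) {
        try {
            action.run();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Case 1: the result is discarded.
    static boolean bareCall() {
        Container<Integer> c = new Container<Integer>();
        return throwsCce(() -> { c.getArray(); });
    }

    // Case 2: a method is invoked on the result.
    static boolean getClassCall() {
        Container<Integer> c = new Container<Integer>();
        return throwsCce(() -> { c.getArray().getClass(); });
    }

    // Case 3: a field of the result is read.
    static boolean lengthAccess() {
        Container<Integer> c = new Container<Integer>();
        return throwsCce(() -> { int n = c.getArray().length; });
    }

    public static void main(String[] args) {
        System.out.println("c.getArray() throws:            " + bareCall());
        System.out.println("c.getArray().getClass() throws: " + getClassCall());
        System.out.println("c.getArray().length throws:     " + lengthAccess());
    }
}
```

With the javac behaviour described in the question, I would expect the first line to report false and the other two true. A different compiler could legitimately print something else.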

newacct