This question was prompted by this StackOverflow question about unsafe casts: Java Casting method without knowing what to cast to. While answering it, I encountered behaviour I couldn't explain based purely on the specification.

I found the following statement in The Java Tutorials at the Oracle docs:

    "Insert type casts if necessary to preserve type safety."

It is not explained what "if necessary" means exactly, and I've found no mention of these casts in the Java Language Specification at all, so I started to experiment.

Let's look at the following piece of code:

// Java source
public static <T> T identity(T x) {
    return x;
}
public static void main(String args[]) {
    String a = identity("foo");
    System.out.println(a.getClass().getName());
    // Prints 'java.lang.String'

    Object b = identity("foo");
    System.out.println(b.getClass().getName());
    // Prints 'java.lang.String'
}

Compiled with javac and decompiled with the Java Decompiler:

// Decompiled code
public static void main(String[] paramArrayOfString)
{
    // The compiler inserted a cast to String to ensure type safety
    String str = (String)identity("foo");
    System.out.println(str.getClass().getName());

    // The compiler omitted the cast, as it is not needed
    // in terms of runtime type safety, but it actually could
    // do an additional check. Is it some kind of optimization
    // to decrease overhead? Where is this behaviour specified?
    Object localObject1 = identity("foo");
    System.out.println(localObject1.getClass().getName());
}

I can see that a cast ensures type safety in the first case, but it is omitted in the second. That is fine, of course: I store the return value in a variable of type Object, so the cast is not strictly necessary for type safety. However, it leads to interesting behaviour with unsafe casts:

public class Erasure {
    public static <T> T unsafeIdentity(Object x) {
        return (T) x;
    }

    public static void main(String args[]) {
        // I would expect c to be either an Integer after this
        // call, or a ClassCastException to be thrown when the
        // return value is not Integer
        Object c = Erasure.<Integer>unsafeIdentity("foo");
        System.out.println(c.getClass().getName());
        // but prints 'java.lang.String'
    }
}

Compiled and decompiled, I see no type cast to ensure correct return type at runtime:

// The type of the return value of unsafeIdentity is not checked,
// just as in the second example.
Object localObject2 = unsafeIdentity("foo");
System.out.println(localObject2.getClass().getName());

This means that a generic method declared to return a given type is not guaranteed to actually return an object of that type at run time. An application using the above code will fail only at the first point where it tries to cast the return value to an Integer, if it ever does, so I feel this breaks the fail-fast principle.
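To make the delayed failure concrete, here is a minimal sketch that reuses unsafeIdentity from above. The class name DelayedFailure and the helper failsWhenAssignedToInteger are hypothetical names introduced only for illustration, and the no-cast behaviour for the Object assignment is what javac was observed to do in this question, not something the specification promises.

```java
class DelayedFailure {
    @SuppressWarnings("unchecked")
    static <T> T unsafeIdentity(Object x) {
        return (T) x; // unchecked cast: nothing is checked here at run time
    }

    // Hypothetical helper: reports whether assigning the result to an
    // Integer variable triggers the compiler-inserted cast and fails.
    static boolean failsWhenAssignedToInteger() {
        try {
            Integer i = DelayedFailure.<Integer>unsafeIdentity("foo");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // No exception here with javac (as observed in the question):
        // the String simply travels on as an Object.
        Object c = DelayedFailure.<Integer>unsafeIdentity("foo");
        System.out.println(c.getClass().getName());

        // The failure only shows up once an Integer is actually required.
        System.out.println(failsWhenAssignedToInteger());
    }
}
```

The point of failure is therefore the first use site that needs an Integer, which may be far away from the call that produced the wrongly typed value.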

What are the exact rules by which the compiler inserts this type-safety cast during compilation, and where are those rules specified?

EDIT:

I see that the compiler will not dig into the code and try to prove that the generic code really returns what it should, but it could insert an assertion, or at least a type cast (which it already does in specific cases, as seen in the first example), to ensure the correct return type. The latter would throw a ClassCastException:

// It could compile to this, throwing ClassCastException:
Object localObject2 = (Integer)unsafeIdentity("foo");
Tamas Hegedus
  • I'm not an expert on this topic, but I don't think the compiler can do any checking in this case because (1) when it sees the line `return (T) x;`, it has no way to know statically that `x` can't be converted to `T`; and (2) when you actually call `unsafeIdentity`, the compiler can't know that this will fail _because it will not delve into the code of the method and look for statements that will fail_. Basically, I think this means that the cast to `(T)` in the method is useless. – ajb Jan 02 '16 at 04:16
  • Thanks @ajb, of course that cast to (T) is useless, it is really a minimalist example. But it could easily compile the outer function to `Object o = (Integer)unsafeIdentity("foo");`, and that would throw a `ClassCastException` or am I missing something? – Tamas Hegedus Jan 02 '16 at 04:21
  • I don't think that the compiler will / must ever insert an assertion unless you code it so why should it do it here? But the remainder of the question is interesting, +1. – 5gon12eder Jan 02 '16 at 04:31
  • OK, I see--the method is declared as returning a `T`, so I can see how the compiler might be able to add this check without reading the code of the method. But that would add unnecessary overhead in the vast majority of cases, including lots of `Collections` classes where, say, a `get()` method returns a generic type. That's probably an unacceptable tradeoff. – ajb Jan 02 '16 at 04:34
  • Did you check if the compiler creates a bridge method? – user1803551 Jan 02 '16 at 06:39
  • I don't fully understand the question. Why should it insert a cast? It clearly warns about an unchecked cast when compiling the `unsafeIdentity` method, and from that on, there are no guarantees about the type anyhow. However, I think that http://hg.openjdk.java.net/jdk8/jdk8/langtools/file/756ae3791c45/src/share/classes/com/sun/tools/javac/jvm/Gen.java#l2328 might be relevant here, as it clearly says that it simply does not insert the cast when it is not necessary (and in fact, this may even be the answer to your question - but I'm not sure) – Marco13 Jan 02 '16 at 11:48
  • "An application using the above code will fail at the first point where it tries to cast the return value to an `Integer` if it does so at all, so I feel like it breaks the fail-fast principle." When a cast fails at runtime, it *always* fails only when it actually tries the cast. When they intend to detect it early, they do so during compile-time. – Olathe Jan 02 '16 at 12:21
  • @Marco13 I'm not saying it should, I just say it could. The piece of source is promising, I have never dug into the OpenJDK source until now. So you say that this piece of behaviour is not specified and is implementation dependent? – Tamas Hegedus Jan 02 '16 at 20:25
  • @Olathe Exactly! That's why I think the compiler could enforce an implicit cast on generic return values. – Tamas Hegedus Jan 02 '16 at 20:39
  • I added a reference to another stack overflow question, which made me think about this topic originally – Tamas Hegedus Jan 02 '16 at 21:07

3 Answers

If you can't find it in the specification, it's not specified, and it is up to the compiler implementation to decide whether to insert casts, as long as the erased code meets the type safety rules of non-generic code.

In this case, the compiler's erased code looks like this:

public static Object identity(Object x) {
    return x;
}
public static void main(String args[]) {
    String a = (String)identity("foo");
    System.out.println(a.getClass().getName());

    Object b = identity("foo");
    System.out.println(b.getClass().getName());
}

In the first case, the cast is necessary in the erased code, because without it the erased code wouldn't compile. This is because Java guarantees that whatever a reference variable of reifiable type holds at runtime must be an instance of that reifiable type (in the instanceof sense), so a runtime check is necessary here.
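The same reasoning applies when a member of the type argument is used directly on the result. The following is a minimal sketch, not taken from the question: the class name NecessaryCast and the helper throwsOnIntegerMethodCall are hypothetical. Calling intValue() requires the receiver to be an Integer in the erased code, so a cast has to be there.

```java
class NecessaryCast {
    @SuppressWarnings("unchecked")
    static <T> T unsafeIdentity(Object x) {
        return (T) x; // unchecked cast, erased to a plain return
    }

    // Hypothetical helper: reports whether calling an Integer method on
    // the result fails because of the cast the erased code needs.
    static boolean throwsOnIntegerMethodCall() {
        try {
            // Erased form, roughly:
            // ((Integer) unsafeIdentity("foo")).intValue()
            int n = NecessaryCast.<Integer>unsafeIdentity("foo").intValue();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsOnIntegerMethodCall());
    }
}
```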

In the second case, the erased code compiles without a cast. It would also compile with a cast added, so the compiler can decide either way. Here, the compiler decided not to insert one. That is a perfectly valid choice, and you should not rely on the compiler deciding either way.
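If fail-fast behaviour is wanted regardless of what the compiler chooses, one option is to make the check explicit with a Class token and the standard Class.cast method. This is only a sketch of that alternative, not something the question's code does: the class name CheckedIdentity and the methods checkedIdentity and failsFast are hypothetical.

```java
class CheckedIdentity {
    // Class.cast performs a real run-time check and throws
    // ClassCastException if x is non-null and not an instance of type.
    static <T> T checkedIdentity(Class<T> type, Object x) {
        return type.cast(x);
    }

    // Hypothetical helper: reports whether the mismatch is detected inside
    // checkedIdentity itself, even though the target variable is an Object.
    static boolean failsFast() {
        try {
            Object c = checkedIdentity(Integer.class, "foo");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(checkedIdentity(String.class, "foo"));
        System.out.println(failsFast());
    }
}
```

The cost is an extra parameter at every call site, which is the usual trade-off of passing type tokens around erased generics.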

newacct
  • Thank you very much, this clearly answers my question. I hoped somebody could find something relevant in the spec that I couldn't, but it looks like the only mention about this is in the java tutorials article about type erasure, mention in JLS 15.5 (found by @HopefullyHelpful) , and the OpenJDK source (found by @Marco13) – Tamas Hegedus Jan 02 '16 at 20:54

Version 1 is preferable because it fails at compile time.

Typesafe version 1 non-legacy code:

class Erasure {
    public static <T> T unsafeIdentity(T x) {
        // No cast necessary, the type is checked in the parameters at compile time
        return x;
    }

    public static void main(String args[]) {
        // This will fail at compile time and you should use Integer c = ... in real code
        Object c = Erasure.<Integer>unsafeIdentity("foo");
        System.out.println(c.getClass().getName());
    }
}

Typesafe version 2, legacy code (JLS: "A run-time type error [...] In an automatically generated cast introduced to ensure the validity of an operation on a non-reifiable type", and reference type casting):

class Erasure {
    public static <T> T unsafeIdentity(Object x) {
        return (T) x;
        // Compiled version: return (Object) x;
        // Optimised version: return x;
    }

    public static void main(String args[]) {
        // This fails on return: the erased call returns Object but an Integer is
        // expected, so the compiler inserts a cast that throws ClassCastException.
        Integer c = Erasure.<Integer>unsafeIdentity("foo");
        // Compiled version: Integer c = (Integer) Erasure.unsafeIdentity("foo");
        System.out.println(c.getClass().getName());
    }
}

Typesafe version 3, legacy code, for methods where you always know a supertype (JLS: "The erasure of a type variable (§4.4) is the erasure of its leftmost bound."):

class Erasure {
    public static <T extends Integer> T unsafeIdentity(Object x) {
        // This fails at run time with a ClassCastException: T is erased to its
        // bound Integer, and a String is not an Integer.
        return (T) x;
        // Compiled version: return (Integer) x;
    }

    public static void main(String args[]) {
        // You should use Integer c = ...
        Object c = Erasure.<Integer>unsafeIdentity("foo");
        System.out.println(c.getClass().getName());
    }
}
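The bound only protects as far as the bound itself. Here is a minimal sketch with a wider bound, using the hypothetical names BoundedErasure, toBound, rejectsString and acceptsWrongNumberSubtype. The cast inside the method checks against Number, so a String is rejected, but a Double still slips through even when T is given as Integer. The second result relies on javac not adding a cast for an Object target, as observed in the question.

```java
class BoundedErasure {
    @SuppressWarnings("unchecked")
    static <T extends Number> T toBound(Object x) {
        return (T) x; // erased to: return (Number) x;
    }

    // A String is not a Number, so the cast to the bound fails inside toBound.
    static boolean rejectsString() {
        try {
            Object o = BoundedErasure.<Integer>toBound("foo");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // A Double passes the (Number) check although T was given as Integer.
    static boolean acceptsWrongNumberSubtype() {
        Object o = BoundedErasure.<Integer>toBound(1.5);
        return o instanceof Double;
    }

    public static void main(String[] args) {
        System.out.println(rejectsString());
        System.out.println(acceptsWrongNumberSubtype());
    }
}
```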

Object was only used to illustrate that Object is a valid assignment target in versions 1 and 3, but you should use the real type or the generic type if possible.

If you use another version of Java, you should look at the corresponding pages of the specification for that version; I don't expect any changes.

HopefullyHelpful
  • All that you write is in fact true, but it still doesn't answer the question – Tamas Hegedus Jan 02 '16 at 05:42
  • "What are the exact rules of the compiler inserting this cast during compilation that ensures type safety and where are those rules specified?" I expect to see either a reference to the corresponding JLS paragraph if any, or some educated guess deducted from decompilation of multiple (occasionally edge-case) scenarios – Tamas Hegedus Jan 02 '16 at 06:20
  • I added references, I only found an indirect reference to the automatic introduction, but after type erasure the type of the righthand statement in version 2 is Object, due to type erasure. This is also stated in both generics Tutorials. – HopefullyHelpful Jan 02 '16 at 07:03

I can't explain it very well, but comments don't let me format code the way I want, so I'm adding this as an answer. I hope it helps your understanding.

In your code:

public class Erasure {
    public static <T> T unsafeIdentity(Object x) {
        return (T) x;
    }

    public static void main(String args[]) {
        // I would expect it to fail:
        Object c = Erasure.<Integer>unsafeIdentity("foo");
        System.out.println(c.getClass().getName());
        // but prints 'java.lang.String'
    }
}

Generics are erased at compile time, and Erasure.unsafeIdentity compiles without errors. Does the erasure depend on the type argument you give (Integer), so that the function then looks like this?

public static Integer unsafeIdentity(Object x) {
    return x;
}

In fact, covariant returns add bridge methods:

public static Object unsafeIdentity(Object x) {
    return x;
}

If the function looks like the last one, do you think the code in your main method would fail to compile? It has no errors. Generic erasure does not add a cast in this function, and the return type is not part of a Java method's signature.

My explanation is a bit far-fetched, but I hope it helps you understand.

Edit:

After googling the topic, I guess your problem is about covariant return types using bridge methods. BridgeMethods
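For context, here is a minimal sketch of what a bridge method is and how to see one through reflection with Method.isBridge(). The names BridgeDemo, Box, IntBox and countBridgeMethods are hypothetical. The compiler generates a bridge when an instance method overrides a method whose erased signature differs. Static methods such as unsafeIdentity are not overridden, so this mechanism would not seem to apply to them directly.

```java
import java.lang.reflect.Method;

class BridgeDemo {
    static class Box<T> {
        void set(T value) { }        // erased to set(Object)
    }

    static class IntBox extends Box<Integer> {
        @Override
        void set(Integer value) { }  // needs a synthetic set(Object) bridge
    }

    // Counts the compiler-generated bridge methods declared directly in a class.
    static int countBridgeMethods(Class<?> type) {
        int count = 0;
        for (Method m : type.getDeclaredMethods()) {
            if (m.isBridge()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countBridgeMethods(IntBox.class));
        System.out.println(countBridgeMethods(Box.class));
    }
}
```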

xxxzhi