19

I was reading about varargs heap pollution and I don't really get how varargs or non-reifiable types would be responsible for problems that do not already exist without generics. Indeed, I can very easily replace

public static void faultyMethod(List<String>... l) {
    Object[] objectArray = l; // Valid
    objectArray[0] = Arrays.asList(42);
    String s = l[0].get(0); // ClassCastException thrown here
}

with

public static void faultyMethod(String... l) {
    Object[] objectArray = l; // Valid
    objectArray[0] = 42;  // ArrayStoreException thrown here
    String s = l[0];
}

The second one simply uses the covariance of arrays, which is really the problem here. (Even if List<String> were reifiable, I guess it would still be a subclass of Object and I would still be able to assign any object to the array.) Of course I can see there's a small difference between the two, but this code is faulty whether it uses generics or not.

What do they mean by heap pollution (it makes me think of memory usage, but the only problem they talk about is potential type unsafety), and how is it different from any type violation using arrays' covariance?

Peter Lawrey
Dici
  • Good question, allow me to add that the line objectArray[0] = 42; is the one actually throwing the ArrayStoreException. – Victor Aug 29 '15 at 23:15

5 Answers

14

You're right that the common (and fundamental) problem is the covariance of arrays. But of the two examples you gave, the first is more dangerous, because it can modify your data structures and put them into a state that will break much later on.

Consider if your first example hadn't triggered the ClassCastException:

public static void faultyMethod(List<String>... l) {
  Object[] objectArray = l;           // Valid
  objectArray[0] = Arrays.asList(42); // Also valid
}

And here's how somebody uses it:

List<String> firstList = Arrays.asList("hello", "world");
List<String> secondList = Arrays.asList("hello", "dolly");
faultyMethod(firstList, secondList);
return secondList.isEmpty()
  ? firstList
  : secondList;

So now we have a List<String> that actually contains an Integer, and it's floating around, safely. At some point later — possibly much later, and if it's serialized, possibly much later and in a different JVM — someone finally executes String s = theList.get(0). This failure is so far distant from what caused it that it could be very difficult to track down.

Note that the ClassCastException's stack trace doesn't tell us where the error really happened; it just tells us who triggered it. In other words, it doesn't give us much information about how to fix the bug; and that's what makes it a bigger deal than an ArrayStoreException.
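
To make the delayed failure concrete, here is a minimal sketch. It assumes a hypothetical variant of the method, `pollutedFirst`, that returns the polluted array slot so that the bad list actually escapes to the caller (a caller's own local variables are not changed by the method, only the varargs array is). The class and method names are illustrative, not from the question.

```java
import java.util.Arrays;
import java.util.List;

@SuppressWarnings("unchecked")
public class DelayedFailureDemo {

    // Hypothetical variant of faultyMethod: pollutes slot 0 and hands it back.
    static List<String> pollutedFirst(List<String>... lists) {
        Object[] objectArray = lists;          // legal: arrays are covariant
        objectArray[0] = Arrays.asList(42);    // no ArrayStoreException: the array's runtime type is List[]
        return lists[0];                       // statically a List<String>, but it holds an Integer
    }

    // Some unrelated code, possibly far away, that trusts the static type.
    static String firstElement(List<String> list) {
        return list.get(0);                    // the compiler-inserted cast to String fails here
    }

    // Returns the simple name of the exception raised by the read, or "none".
    static String failureOnRead() {
        List<String> polluted = pollutedFirst(Arrays.asList("hello", "world"));
        try {
            firstElement(polluted);
            return "none";
        } catch (ClassCastException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(failureOnRead());   // prints "ClassCastException"
    }
}
```

The stack trace of that exception points at `firstElement`, not at `pollutedFirst`, which is the line that actually needs fixing.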

yshavit
  • Yeah, that is the conclusion I came to reading the two other answers, which don't seem to state it as explicitly as yours but helped me to understand it. Humpf, I'm gonna have a hard time choosing the winner. – Dici Aug 29 '15 at 23:36
  • @yshavit, I have studied your answer and found it very interesting, thanks for sharing. I would like to add that if you call the method `faultyMethod(List... l)` by passing the lists as individual arguments, like `faultyMethod(firstList, secondList)`, or as an anonymous array, like `faultyMethod(new List[] {firstList, secondList} )`, you won't have any problem outside the method, because the object being modified is the array of lists, whether explicit or implicit (varargs creates it for you), that serves as the method's sole argument. – Victor Aug 30 '15 at 21:23
  • I mean you would have problems outside the scope of the function if you do things like this: `List[] array = new List[] {firstList, secondList}; faultyMethod(array); System.out.println(array[0] + " " + array[1]);` – Victor Aug 30 '15 at 21:24
  • @Dici As I understand it, my advice is to avoid mixing arrays and generic types. Varargs add even more confusion to the code. When unchecked warnings come into play, that means you should try to change things. That would be my final advice; I read it some time ago in Joshua Bloch's Effective Java, and so far I haven't found better advice. – Victor Aug 30 '15 at 21:29
8

The difference between an array and a List is that the array checks the type of every reference stored into it at run time, e.g.

Object[] array = new String[1];
array[0] = new Integer(1); // fails at runtime.

however

List list = new ArrayList<String>();
list.add(new Integer(1)); // doesn't fail.
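
The two snippets above can be combined into one runnable sketch. The class and method names (`StoreCheckDemo`, `arrayRejectsInteger`, `listAcceptsInteger`) are made up for illustration, and `Integer.valueOf` is used in place of the deprecated `new Integer(...)` constructor.

```java
import java.util.ArrayList;
import java.util.List;

public class StoreCheckDemo {

    // True if storing an Integer into a String[] (viewed as Object[]) is rejected at run time.
    static boolean arrayRejectsInteger() {
        Object[] array = new String[1];
        try {
            array[0] = Integer.valueOf(1);     // the array knows its element type and checks the store
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // True if adding an Integer through a raw reference to an ArrayList<String> is accepted.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean listAcceptsInteger() {
        List list = new ArrayList<String>();
        list.add(Integer.valueOf(1));          // erasure: the list has no element type to check against
        return list.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println(arrayRejectsInteger()); // prints "true"
        System.out.println(listAcceptsInteger());  // prints "true"
    }
}
```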
Peter Lawrey
  • Yep, I knew this. I guess what you mean is that the array is considered safer because I can't trick it at runtime (`ArrayStoreException`), whereas the `List` won't complain, which is why a warning is generated at compile time? – Dici Aug 29 '15 at 23:28
  • @Dici yes, this is what I meant. +1 – Peter Lawrey Aug 30 '15 at 08:13
5

From the linked document, I believe what Oracle means by "heap pollution" is to have data values that are technically allowed by the JVM specification, but are disallowed by the rules for generics in the Java programming language.

To give you an example, let's say we define a simple List container like this:

class List<E> {
    Object[] values;
    int len = 0;

    List() { values = new Object[10]; }

    void add(E obj) { values[len++] = obj; }
    E get(int i) { return (E)values[i]; }
}

This is an example of code that is generic and safe:

List<String> lst = new List<String>();
lst.add("abc");

This is an example of code that bypasses generics (it reads the backing array directly and casts by hand) but still respects type safety at a semantic level, because the value we added has a compatible type:

String x = (String)lst.values[0];

The twist - here is code that also bypasses generics and does something bad, causing "heap pollution":

lst.values[lst.len++] = new Integer("3");

The code above works because the array is of type Object[], which can store an Integer. Now when we try to retrieve the value, it'll cause a ClassCastException - at retrieval time (which is way after the corruption occurred), instead of at add time:

String y = lst.get(1);  // ClassCastException for Integer(3) -> String

Note that the ClassCastException happens in our current stack frame, not even in List.get(), because the cast in List.get() is a no-op at run time due to Java's type erasure system.

Basically, we inserted an Integer into a List<String> by bypassing generics. Then when we tried to get() an element, the list object failed to uphold its promise that it must return a String (or null).
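
As a rough illustration of where the check really happens, the sketch below uses a hypothetical one-element container, `Box<E>` (named that way to avoid clashing with `java.util.List`), and pollutes it through a raw reference. With javac, reading the value into an `Object` variable needs no cast and therefore succeeds, while reading it into a `String` variable fails at the call site. Other compilers could, in principle, place the casts differently.

```java
public class CastSiteDemo {

    // Hypothetical minimal generic container.
    static class Box<E> {
        private Object value;

        void set(E v) { value = v; }

        @SuppressWarnings("unchecked")
        E get() { return (E) value; }          // erased: nothing is checked here at run time
    }

    // Builds a Box<String> that secretly holds an Integer.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Box<String> pollutedBox() {
        Box<String> box = new Box<>();
        Box raw = box;                         // raw view of the same object
        raw.set(Integer.valueOf(3));           // heap pollution happens here, silently
        return box;
    }

    // Reading into Object: the compiler has no reason to insert a cast to String.
    static boolean readAsObjectSucceeds() {
        Object o = pollutedBox().get();
        return o instanceof Integer;
    }

    // Reading into String: the compiler-inserted cast at this call site throws.
    static boolean readAsStringFails() {
        try {
            String s = pollutedBox().get();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readAsObjectSucceeds());
        System.out.println(readAsStringFails());
    }
}
```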

Nayuki
  • Yes, I'm aware of that, but I think with your answer and Peter Lawrey's I can see why they did it. Arrays will throw an exception right at the time of the insertion, whereas a generic list (for example) will only fail when the value is read, which may never happen or may happen long after the insertion, making debugging harder. Is that what you meant? – Dici Aug 29 '15 at 23:33
3

Prior to generics, there was absolutely no possibility that an object's runtime type was inconsistent with its static type. This is obviously a very desirable property.

We can cast an object to an incorrect runtime type, but the cast would fail immediately, at the exact site of casting; the error stops there.

Object obj = "string";
((Integer)obj).intValue();
// ClassCastException on the line above: obj is a String, so we never get an Integer object

With the introduction of generics, along with type erasure (the root of all evils), it is now possible for a method that is declared to return String at compile time to return an Integer at run time. This is messed up, and we should do everything we can to stop it at the source. That is why the compiler is so vocal about every sign of an unchecked cast.

The worst thing about heap pollution is that the runtime behavior is undefined! Different compilers and runtimes may execute the program in different ways. See case1 and case2.

Matt Fenwick
ZhongYu
  • Thanks. Now that I have my mind clearer on this, I suspect bridge methods to be responsible for weird `ClassCastException` I sometimes get when serializing closures with Spark – Dici Aug 30 '15 at 09:24
1

They are different because ClassCastException and ArrayStoreException are different.

The compile-time type-checking rules for generics should ensure that it's impossible to get a ClassCastException in a place where you didn't write an explicit cast, unless your code (or some code you called, or that called you) did something unsafe at compile time, in which case you (or whatever code did the unsafe thing) should receive a compile-time warning about it.

ArrayStoreException, on the other hand, is a normal part of how arrays work in Java, and pre-dates Generics. It is not possible for compile-time type checking to prevent ArrayStoreException because of the way the type system for arrays is designed in Java.

newacct