3

I am executing below code snippet

System.out.println(List.of(1, 2).getClass());  
System.out.println(List.of(1, 2, 3).getClass());

output of this code is;

class java.util.ImmutableCollections$List2  
class java.util.ImmutableCollections$ListN

I am expecting java.util.ImmutableCollections$List3 as output for the second statement because there is of() method which takes three parameter, Why java creating ImmutableCollections$ListN but not ImmutableCollections$List3?

Edited: It is Java-9 question. There are total 11 overloaded of() methods in List interface each of them takes variable number of parameters from zero to 10 and eleventh one takes varargs to handle N list. So I am expecting List0 to List10 implementation for first 10 overloaded methods, but it is returning ListN with three parameters. Yes, it is implementation detail but just curious to know more information of this.

Pradeep
  • 419
  • 5
  • 14
  • 22
    It's an implementation detail - it's worth asking yourself why you'd *expect* anything in particular. In a future version, the second could return a List3, or the first could return a ListN. – Jon Skeet Mar 21 '18 at 10:21
  • 3
    Do you know *why* the call in the first case returns a `ImmutableCollections$List2`? What do you expect when calling it with 200 arguments? Are you (incorrectly) expecting `ImmutableCollections$List200` to be returned? – luk2302 Mar 21 '18 at 10:22
  • 1
    It seems that the creators of Java decided to provide only the classes `ImmutableCollections.List0`, `ImmutableCollections.List1`, `ImmutableCollections.List2` and `ImmutableCollections.ListN` – Thomas Kläger Mar 21 '18 at 10:30
  • 6
    FWIW, in Java 11 `List0`, `List1` and `List2` will be gone. There'll be just a `List12` and a `ListN`. Reference: https://bugs.openjdk.java.net/browse/JDK-8193128 – Stefan Zobel Mar 22 '18 at 23:05
  • @StefanZobel Right, and `ListN` will be used for an empty list – ZhekaKozlov Mar 28 '18 at 10:45

5 Answers5

14

The main reason to have several different private implementations of List is to save space.

Consider an implementation that stores its elements in an array. (This is essentially what ListN does.) In Hotspot (64-bit with compressed object pointers, each 4 bytes) each object requires a 12-byte header. The ListN object has a single field containing the array, for a total of 16 bytes. An array is a separate object, so that has another 12-byte header plus a 4-byte length. That's another 16 bytes, not counting any actual elements stored. If we're storing two elements, they take 8 bytes. That brings the total to 40 bytes for storing a two-element list. That's quite a bit of overhead!

If we were to store the elements of a small list in fields instead of an array, that object would have a header (12 bytes) plus two fields (8 bytes) for a total of 20 bytes -- half the size. For small lists, there's a considerable savings with storing elements in fields of the List object itself instead of in an array, which is a separate object. This is what the old List2 implementation did. It's recently been superseded by the List12 implementation, which can store lists of one or two elements in fields.

Now, in the API there are 12 overloaded List.of() methods: zero to ten fixed args plus varargs. Shouldn't there be corresponding List0 through List10 and ListN implementations?

There could be, but there doesn't necessarily have to be. An early prototype of these implementations had the optimized small list implementations tied to the APIs. So the zero, one, and two fixed arg of() methods created instances of List0, List1, and List2, and the varargs List.of() method created an instance of ListN. This was fairly straightforward, but it was quite restrictive. We wanted to be able to add, remove, or rearrange implementations at will. It's considerably more difficult to change APIs, since we have to remain compatible. Thus, we decided to decouple things so that the number of arguments in the APIs was largely independent of the implementation instantiated underneath.

In JDK 9 we ended up with the 12 overloads in the API, but only four implementations: field-based implementations holding 0, 1, and 2 elements, and an array-based implementation holding an arbitrary number. Why not add more field-based implementations? Diminishing returns and code bloat. Most lists have few elements, and there's an exponential dropoff in the occurrences of lists as the number of elements gets larger. The space savings get relatively smaller compared to an array-based implementation. Then there's the matter of maintaining all those extra implementations. Either they'd have to be entered directly in the source code (bulky) or we'd switch over to a code generation scheme (complex). Neither seemed justified.

Our startup performance guru Claes Redestad did some measurements and found that there was a speedup in having fewer list implementations. The reason is megamorphic dispatch. Briefly, if the JVM is compiling the code for a virtual call site and it can determine that only one or two different implementations are called, it can optimize this well. But if there are many different implementations that can be called, it has to go through a slower path. (See this article for Black Magic details.)

For the list implementations, it turns out that we can get by with fewer implementations without losing much space. The List1 and List2 implementations can be combined into a two-field List12 implementation, with the second field being null if there's only one element. We only need one zero-length list, since it's immutable! For a zero-length list, we can get rid of List0 just use a ListN with a zero-length array. It's bigger than an old List0 instance, but we don't care, since there's only one of them.

These changes just went into the JDK 11 mainline. Since the API is completely decoupled from the implementations, there is no compatibility issue.

There are additional possibilities for future enhancements. One potential optimization is to fuse an array onto the end of an object, so the object has a fixed part and a variable-length part. This will avoid the need for an array object's header, and it will probably improve locality of reference. Another potential optimization is with value types. With value types, it might be possible to avoid heap allocation entirely, at least for small lists. Of course, this is all highly speculative. But if new features come along in the JVM, we can take advantage of them in the implementations, since they're are entirely hidden behind the API.

Stefan Zobel
  • 3,182
  • 7
  • 28
  • 38
Stuart Marks
  • 127,867
  • 37
  • 205
  • 259
8

ListN is the all-purpose version. List2 is an optimised implementation. There is no such optimised implementation for a list with three elements.

There currently exist* optimised versions for lists and sets with zero, one and two elements. List0, List1, List2, Set0 etc...

There's also an optimised implementation for an empty map, Map0, and for a map containing a single key-value pair, Map1.

Discussion relating to how these implementations are able to provide performance improvements can been seen in JDK-8166365.


*bear in mind this is an implementation detail which may be subject to change, and actually is due to change fairly soon

Michael
  • 41,989
  • 11
  • 82
  • 128
  • 5
    "bear in mind this is an implementation detail which may be subject to change". Yes, it will change. They are going to replace List1 and List2 with a single List12 implementation: http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-March/052183.html – ZhekaKozlov Mar 22 '18 at 05:13
  • @ZhekaKozlov Very interesting. Thanks! I've edited that into my answer – Michael Mar 22 '18 at 08:22
3

Neither ImmutableCollections$List2 nor ImmutableCollections$ListN is generated at runtime. There are four classes already written:

static final class List0<E> extends AbstractImmutableList<E> { ... }
static final class List1<E> extends AbstractImmutableList<E> { ... }
static final class List2<E> extends AbstractImmutableList<E> { ... }
static final class ListN<E> extends AbstractImmutableList<E> { ... }

Starting with of(E e1, E e2, E e3) and up to of(E e1, ..., E e10) an instance of ImmutableCollections.ListN<> is going to be created.

Why java creating ImmutableCollections$ListN but not ImmutableCollections$List3?

The designers have probably decided that 3 and N cases are similar and it's not worth writing a separate class for 3. Apparently, they won't get enough benefits from $List3, $List7, $List10 as they have got from the $List0, $List1, and $List2 versions. They are specifically-optimised.

Currently, 4 classes cover 10 methods. If they decided to add some more methods (e.g. with 22 arguments), there would still be these 4 classes. Imagine you are writing 22 classes for 22 methods. How much unnecessary code duplication would it involve?

Andrew Tobilko
  • 48,120
  • 14
  • 91
  • 142
  • and regarding your last point, you've gotta draw the line somewhere. If they were to add a `List3` then why not a `List4`, 5, 6...? – Michael Mar 21 '18 at 10:42
  • Could you please clarify why this is not a duplicate of the previously marked question? The answer here adds nothing more to the answer made by ZhekaKozlov on the existing thread. – Naman Mar 22 '18 at 07:30
  • @nullpointer, OP already knows that `ListX` types exist and their instances are returned from `List.of` – Andrew Tobilko Mar 22 '18 at 09:59
1

Those are both classes that are being returned. i.e. there is a separate class for ImmutableCollections$List2 and ImmutableCollections$ListN (the $ indicates an inner class)

This is an implementation detail, and (presumably) List2 exists for (possibly) some optimisation reason. I suspect if you look at the source (via your IDE or similar) you'll see two distinct inner classes.

Brian Agnew
  • 268,207
  • 37
  • 334
  • 440
1

As Jon Skeet rightly mentioned, it is an implementation detail. The specification of List.of says that it returns an immutable List, and that's all that matters.

The developers probably decided that they could provide efficient implementations of one-element (List1) and two-element lists (List2), and that all other sizes could be handled by a single type (ListN). This could change at some point in the future - maybe they will introduce a List3 at some point, maybe not.

As per the rules of polymorphism and encapsulation, none of this matters. As long as the returned object is a List, you should not concern yourself with its actual implementation.

Leo Aso
  • 11,898
  • 3
  • 25
  • 46