215

Assuming I have an ArrayList

ArrayList<MyClass> myList;

And I want to call toArray, is there a performance reason to use

MyClass[] arr = myList.toArray(new MyClass[myList.size()]);

over

MyClass[] arr = myList.toArray(new MyClass[0]);

?

I prefer the second style, since it's less verbose, and I assumed that the compiler will make sure the empty array doesn't really get created, but I've been wondering if that's true.

Of course, in 99% of the cases it doesn't make a difference one way or the other, but I'd like to keep a consistent style between my normal code and my optimized inner loops...

itsadok
  • 9
    Looks like the question has now been settled in a new blog post by Aleksey Shipilёv, [Arrays of Wisdom of the Ancients](http://shipilev.net/blog/2016/arrays-wisdom-ancients/)! – glts Jan 19 '16 at 17:47
  • 7
    From the blog post: "Bottom line: toArray(new T[0]) seems faster, safer, and contractually cleaner, and therefore should be the default choice now." – DavidS Feb 24 '16 at 19:17

8 Answers

153

Counterintuitively, the fastest version, on Hotspot 8, is:

MyClass[] arr = myList.toArray(new MyClass[0]);

I ran a microbenchmark using JMH; the results and code are below. They show that the version with an empty array consistently outperforms the version with a presized array. Note that if you can reuse an existing array of the correct size, the result may be different.

Benchmark results (score in microseconds, smaller = better):

Benchmark                      (n)  Mode  Samples    Score   Error  Units
c.a.p.SO29378922.preSize         1  avgt       30    0.025 ± 0.001  us/op
c.a.p.SO29378922.preSize       100  avgt       30    0.155 ± 0.004  us/op
c.a.p.SO29378922.preSize      1000  avgt       30    1.512 ± 0.031  us/op
c.a.p.SO29378922.preSize      5000  avgt       30    6.884 ± 0.130  us/op
c.a.p.SO29378922.preSize     10000  avgt       30   13.147 ± 0.199  us/op
c.a.p.SO29378922.preSize    100000  avgt       30  159.977 ± 5.292  us/op
c.a.p.SO29378922.resize          1  avgt       30    0.019 ± 0.000  us/op
c.a.p.SO29378922.resize        100  avgt       30    0.133 ± 0.003  us/op
c.a.p.SO29378922.resize       1000  avgt       30    1.075 ± 0.022  us/op
c.a.p.SO29378922.resize       5000  avgt       30    5.318 ± 0.121  us/op
c.a.p.SO29378922.resize      10000  avgt       30   10.652 ± 0.227  us/op
c.a.p.SO29378922.resize     100000  avgt       30  139.692 ± 8.957  us/op

For reference, the code:

import java.util.ArrayList;
import java.util.List;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
public class SO29378922 {
  @Param({"1", "100", "1000", "5000", "10000", "100000"}) int n;
  private final List<Integer> list = new ArrayList<>();
  @Setup public void populateList() {
    for (int i = 0; i < n; i++) list.add(0);
  }
  @Benchmark public Integer[] preSize() {
    return list.toArray(new Integer[n]);
  }
  @Benchmark public Integer[] resize() {
    return list.toArray(new Integer[0]);
  }
}

You can find similar results, full analysis, and discussion in the blog post Arrays of Wisdom of the Ancients. To summarize: the JVM and JIT compiler contain several optimizations that enable them to cheaply create and initialize a new correctly sized array, and those optimizations cannot be used if you create the array yourself.

Raedwald
assylias
  • 2
    Very interesting comment. I'm surprised no one has commented on this. I guess it's because it contradicts the other answers here, as far as speed. Also interesting to note, this guy's reputation is almost higher than all the other answerers combined. – Pimp Trizkit Dec 04 '15 at 15:43
  • I digress. I would also like to see benchmarks for `MyClass[] arr = myList.stream().toArray(MyClass[]::new);` .. which I guess would be slower. Also, I would like to see benchmarks for the difference with array declaration. As in the difference between: `MyClass[] arr = new MyClass[myList.size()]; arr = myList.toArray(arr);` and `MyClass[] arr = myList.toArray(new MyClass[myList.size()]);` ... or should there not be any difference? I guess these two are an issue that is outside of the `toArray` functions happenings. But hey! I didn't think I would learn about the other intricate differences. – Pimp Trizkit Dec 04 '15 at 16:02
  • 2
    @PimpTrizkit Just checked: using an extra variable makes no difference, as expected. Using a stream takes between 60% and 100% more time than calling `toArray` directly (the smaller the size, the larger the relative overhead) – assylias Dec 04 '15 at 16:49
  • Wow, that was a fast response! Thanks! Yea, I suspected that. Converting to a stream didn't sound efficient. But you never know! – Pimp Trizkit Dec 04 '15 at 16:56
  • 3
    This same conclusion was found here: http://shipilev.net/blog/2016/arrays-wisdom-ancients/ – user167019 Feb 26 '16 at 23:06
  • what about `list.stream().toArray( Integer[]::new )`? – xenoterracide Apr 04 '16 at 18:22
  • 1
    @xenoterracide as discussed in the comments above, streams are slower. – assylias Apr 04 '16 at 21:10
  • I could not understand why IntelliJ was recommending to replace my pre-sized array with a zero-size array. Thank you... now I know! I can see @АнтонАнтонов mentions it below. – kevinarpe Nov 28 '20 at 18:06
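For readers comparing the two forms discussed in these comments, here is a minimal sketch showing that the direct call and the stream call produce equal arrays. The class and method names (`StreamToArrayDemo`, `direct`, `viaStream`) are made up for illustration; the sketch shows equivalence of results only and says nothing about timing, which needs a proper benchmark such as the JMH one above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of any library.
final class StreamToArrayDemo {
    // Direct conversion with a zero-length array argument.
    static Integer[] direct(List<Integer> list) {
        return list.toArray(new Integer[0]);
    }

    // Conversion through a stream, using an array constructor reference.
    static Integer[] viaStream(List<Integer> list) {
        return list.stream().toArray(Integer[]::new);
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        // Both calls yield an Integer[] with the same elements in the same order.
        System.out.println(Arrays.equals(direct(list), viaStream(list)));
    }
}
```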
122

As of ArrayList in Java 5, the array will be filled already if it has the right size (or is bigger). Consequently

MyClass[] arr = myList.toArray(new MyClass[myList.size()]);

will create one array object, fill it and return it to "arr". On the other hand

MyClass[] arr = myList.toArray(new MyClass[0]);

will create two arrays. The second one is an array of MyClass with length 0. So there is an object creation for an object that will be thrown away immediately. As far as the source code suggests, the compiler / JIT cannot optimize this one so that it is not created. Additionally, using the zero-length array results in casts within the toArray() method.

See the source of ArrayList.toArray():

public <T> T[] toArray(T[] a) {
    if (a.length < size)
        // Make a new array of a's runtime type, but my contents:
        return (T[]) Arrays.copyOf(elementData, size, a.getClass());
    System.arraycopy(elementData, 0, a, 0, size);
    if (a.length > size)
        a[size] = null;
    return a;
}

Use the first method so that only one object is created and the (implicit but nevertheless expensive) casts are avoided.
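The branch in the source above can be observed from the outside. The following is a small sketch, with a hypothetical class name `ToArrayIdentityDemo` and helper `returnsSameArray`, that checks whether `toArray` hands back the array it was given. It illustrates only which array is returned, not which variant is faster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of any library.
final class ToArrayIdentityDemo {
    // Returns true if toArray handed back the very array that was passed in.
    static boolean returnsSameArray(List<String> list, String[] target) {
        return list.toArray(target) == target;
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));

        // Large enough: the argument itself is filled and returned.
        System.out.println(returnsSameArray(list, new String[list.size()])); // true
        // Too small: a new array of the same runtime type is allocated instead.
        System.out.println(returnsSameArray(list, new String[0]));           // false
    }
}
```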

Meredith
Georgi
  • 1
    Two comments, might be of interest to someone: **1)** LinkedList.toArray(T[] a) is even slower (uses reflection: Array.newInstance) and more complex; **2)** On the other hand, in JDK7 release, I was very surprised to find out, that usually painfully-slow Array.newInstance performs nearly *as fast* as usual array creation! – java.is.for.desktop Jul 30 '11 at 10:46
  • 1
    @ktaria size is a private member of ArrayList, specifying ****surprise**** the size. See [ArrayList SourceCode](http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b27/java/util/ArrayList.java#ArrayList) – MyPasswordIsLasercats Feb 14 '14 at 15:22
  • 3
    Guessing performance without benchmarks works only in trivial cases. Actually, `new Myclass[0]` is faster: https://shipilev.net/blog/2016/arrays-wisdom-ancients/ – Karol S Apr 06 '17 at 19:55
  • 1
    This is no longer a valid answer as of JDK6+ – Антон Антонов Jun 07 '18 at 09:40
43

From JetBrains IntelliJ IDEA inspection:

There are two styles to convert a collection to an array:

  • A pre-sized array, for example, c.toArray(new String[c.size()])
  • An empty array, for example, c.toArray(new String[0])

In older Java versions, using a pre-sized array was recommended, as the reflection call necessary to create an array of proper size was quite slow.

However, since late updates of OpenJDK 6, this call was intrinsified, making the performance of the empty array version the same, and sometimes even better, compared to the pre-sized version. Also, passing a pre-sized array is dangerous for a concurrent or synchronized collection as a data race is possible between the size and toArray calls. This may result in extra nulls at the end of the array if the collection was concurrently shrunk during the operation.

Use the inspection options to select the preferred style.
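The trailing-null hazard described in the inspection text can be reproduced deterministically. The sketch below is single-threaded and only simulates the race: an explicit `remove` stands in for a concurrent removal that would happen between the `size()` and `toArray` calls. The names `StaleSizeDemo` and `preSizedAfterShrink` are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of any library.
final class StaleSizeDemo {
    // Simulates the race: the size is read, the list shrinks, then toArray runs.
    // Assumes the list is non-empty.
    static String[] preSizedAfterShrink(List<String> list) {
        String[] target = new String[list.size()]; // size observed here
        list.remove(list.size() - 1);              // stands in for a concurrent removal
        return list.toArray(target);               // target is now one slot too long
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(Arrays.toString(preSizedAfterShrink(list)));   // [a, b, null]
        System.out.println(Arrays.toString(list.toArray(new String[0]))); // [a, b]
    }
}
```

With the empty-array form there is no separate `size()` read on the caller's side, so the returned array is sized by the collection itself.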

Pang
  • 2
    If all of this is copied/quoted text, could we format it accordingly and also provide a link to the source? I actually came here because of the IntelliJ inspection and I'm very interested in the link to look up all of their inspections and the reasoning behind them. – Tim Büthe Aug 30 '18 at 16:20
  • 4
    Here you can check the inspections texts: https://github.com/JetBrains/intellij-community/tree/master/plugins/InspectionGadgets/src/inspectionDescriptions – Антон Антонов Aug 31 '18 at 14:35
18

Modern JVMs optimise reflective array construction in this case, so the performance difference is tiny. Naming the collection twice in such boilerplate code is not a great idea, so I'd avoid the first method. Another advantage of the second is that it works with synchronised and concurrent collections. If you want to optimise, reuse the empty array (empty arrays are immutable and can be shared), or use a profiler(!).
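Reusing the empty array might look like the following sketch. The constant name `EMPTY_STRING_ARRAY` and the class `SharedEmptyArrayDemo` are illustrative choices, not an established convention from this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of any library.
final class SharedEmptyArrayDemo {
    // A zero-length array has no elements to mutate, so one instance can be shared.
    private static final String[] EMPTY_STRING_ARRAY = new String[0];

    // Converts without allocating a fresh zero-length array on every call.
    static String[] toArray(List<String> list) {
        return list.toArray(EMPTY_STRING_ARRAY);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("x", "y"));
        System.out.println(Arrays.toString(toArray(list))); // [x, y]
    }
}
```

One detail worth knowing: for an empty list, `toArray` returns the shared constant itself, which is harmless because a zero-length array cannot be modified.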

Tom Hawtin - tackline
  • 2
    Upvoting 'reuse the empty array', because it's a compromise between readability and potential performance that's worthy of consideration. Passing an argument declared `private static final MyClass[] EMPTY_MY_CLASS_ARRAY = new MyClass[0]` doesn't prevent the returned array from being constructed by reflection, but it _does_ prevent an additional array being constructed each time. – Michael Scheper May 23 '13 at 01:31
  • Michael is right: if you use a *zero-length array* there is no way around (T[])java.lang.reflect.Array.newInstance(a.getClass().getComponentType(), size); which would be superfluous if the size were >= actualSize (JDK7) – Alex May 31 '13 at 14:52
  • If you can give a citation for "modern JVMs optimise reflective array construction in this case", I'll gladly upvote this answer. – Tom Panning Nov 18 '13 at 19:30
  • I'm learning here. If instead I use: `MyClass[] arr = myList.stream().toArray(MyClass[]::new);` Would it help or hurt with synchronized and concurrent collections. And why? Please. – Pimp Trizkit Dec 04 '15 at 16:10
  • 1
    @PimpTrizkit when you invoke `.stream().toArray(MyClass[]::new)` on a synchronized collection, you lose the synchronization and have to synchronize manually. In case of a concurrent collection, it doesn’t matter, as both `toArray` approaches are only weakly consistent. In either case, calling `toArray(new MyClass[0])` on the collection directly is likely to be faster. (And to consider APIs introduced after your question, i.e. JDK 11+, calling `.toArray(MyClass[]::new)` directly on the collection just delegates to `.toArray(new MyClass[0])` because that is already the best method for the task.) – Holger Jan 28 '22 at 17:50
3

toArray checks that the array passed is of the right size (that is, large enough to fit the elements from your list) and if so, uses that. Consequently, if the array provided is smaller than required, a new array will be reflectively created.

In your case, an array of size zero is immutable, so it could safely be elevated to a static final variable, which might make your code a little cleaner and avoids creating the array on each invocation. A new array will be created inside the method anyway, so it's a readability optimisation.

Arguably the faster version is to pass the array of a correct size, but unless you can prove this code is a performance bottleneck, prefer readability to runtime performance until proven otherwise.

Dave Cheney
2

The first case is more efficient.

That is because in the second case:

MyClass[] arr = myList.toArray(new MyClass[0]);

the runtime actually creates an empty array (with zero size) and then inside the toArray method creates another array to fit the actual data. This creation is done using reflection using the following code (taken from jdk1.5.0_10):

public <T> T[] toArray(T[] a) {
    if (a.length < size)
        a = (T[])java.lang.reflect.Array.
            newInstance(a.getClass().getComponentType(), size);
    System.arraycopy(elementData, 0, a, 0, size);
    if (a.length > size)
        a[size] = null;
    return a;
}

By using the first form, you avoid the creation of a second array and also avoid the reflection code.
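The reflective step quoted above can be tried in isolation. This is a minimal sketch with a hypothetical helper `newArrayLike` in a made-up class `ReflectiveArrayDemo`; it shows how `java.lang.reflect.Array.newInstance` builds an array whose component type is known only at runtime, and makes no claim about its cost on any particular JVM.

```java
import java.lang.reflect.Array;

// Hypothetical demo class, not part of any library.
final class ReflectiveArrayDemo {
    // Creates an array with the same component type as 'template',
    // which is only known at runtime because T is erased.
    @SuppressWarnings("unchecked")
    static <T> T[] newArrayLike(T[] template, int length) {
        return (T[]) Array.newInstance(template.getClass().getComponentType(), length);
    }

    public static void main(String[] args) {
        Integer[] fresh = newArrayLike(new Integer[0], 4);
        // The runtime type is Integer[], not Object[].
        System.out.println(fresh.getClass().getSimpleName() + " of length " + fresh.length);
    }
}
```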

Panagiotis Korros
  • toArray() does not use reflection. At least as long as you do not count "casting" to reflection, anyway ;-). – Georgi Oct 06 '08 at 13:04
  • toArray(T[]) does. It needs to create an array of the appropriate type. Modern JVMs optimise that kind of reflection to be about the same speed as the non-reflective version. – Tom Hawtin - tackline Oct 06 '08 at 13:11
  • I think that it does use reflection. The JDK 1.5.0_10 does for sure and reflection is the only way I know to create an array of a type that you don't know at compile time. – Panagiotis Korros Oct 06 '08 at 13:13
  • Then one of the source code examples here (the one above or mine) is out-of-date. Sadly, I didn't find a correct sub-version number for mine, though. – Georgi Oct 06 '08 at 13:21
  • 1
    Georgi, your code is from JDK 1.6 and if you see the implementation of the Arrays.copyOf method you will see that the implementation uses reflection. – Panagiotis Korros Oct 06 '08 at 13:47
  • On modern JVMs (Java 8+ at least) this is no longer the case. The second option is faster, as the most voted answer here asserts. – Clint Eastwood Jan 14 '20 at 20:34
0

The second one is marginally more readable, but there's so little improvement that it's not worth it. The first method is faster, with no disadvantages at runtime, so that's what I use. But I write it the second way, because it's faster to type. Then my IDE flags it as a warning and offers to fix it. With a single keystroke, it converts the code from the second type to the first one.

MiguelMunoz
-3

Using 'toArray' with an array of the correct size will perform better, as the alternative will first create the zero-sized array and then the array of the correct size. However, as you say, the difference is likely to be negligible.

Also, note that the javac compiler does not perform any optimization. These days all optimizations are performed by the JIT/HotSpot compilers at runtime. I am not aware of any optimizations around 'toArray' in any JVMs.

The answer to your question, then, is largely a matter of style but for consistency's sake should form part of any coding standards you adhere to (whether documented or otherwise).

Matthew Murdoch