I was trying to see the performance difference between pre-initializing an ArrayList
to a given capacity vs going with the default capacity and expanding on requirement. Just out of curiosity. I found that the default capacity array code is ~10% FASTER than the code which initializes the array to the required capacity. Here is the code I used:
public class Test {
public static void main(String[] args) {
long t1 = System.currentTimeMillis();
for(int j=0;j<10;++j) {
List<Integer> list = new ArrayList<Integer>();
for(int i=0;i<1000000;++i) {
list.add(i);
}
}
long t2 = System.currentTimeMillis();
System.out.println("Time taken: " + (t2-t1)/10.0);
}
}
I consistently get ~77 ms for this version on my box, whereas I get ~85 ms if I change the List initialization to new ArrayList<Integer>(1000000)
. Why is it so? Shouldn't it be the other way round? In fact, the List without pre-init is consistently a tiny bit (~0.5-1 ms) faster than using plain Integer[]
. Basically, what it says is default capacity arraylist > simple array > pre-capacity-ensured arraylist
in insert performance.
This is very strange to me. My initial guess is that it has something to do with memory allocation, something like giving 1000000 int blocks in one go is maybe slower than slowly getting more space? Is this even reproducible in other machines? I am using jdk 1.6.0, Mac OS X, running through eclipse.
I tried it in two other environments:
--> Tried running java+javac from command line instead of eclipse - Here I get pre-capacity-ensured arraylist > simple array > default capacity arraylist
, consistently.
--> Tried running java+javac on my linux (RHEL) desktop. This box has 24 GB RAM, whereas my laptop had only 8 GB. Here, I get plain array >>> default capacity arraylist > pre-capacity-ensured arraylist
. Plain array is super-fast, about 2-3x faster in this case than the other two.
EDIT: Following @JonSkeet's suggestion in the comments, I used nanoTime()
, and Integer
instead of int
. It still doesn't adress the issue that JIT warmup is not being considered though. After these changes, I consistently see that plain array is the fastest across all tests. But the capacity-ensured list is still slower by about 5-10% compared to default-capacity list for me across all 3 environments above. BUT some users seem to get the correct behaviour, so this might well be a very specific case.
EDIT2: If I use String instead of Integer as the element, the behavior is correct (plain array > pre-capacity-ensured arraylist > default capacity array
). So I think autoboxing is actually the culprit.