I was going through the following article:
Understanding Collections and Thread Safety in Java
The article says:
You know, Vector and Hashtable are the two collections exist early in Java history, and they are designed for thread-safe from the start (if you have chance to look at their source code, you will see their methods are all synchronized!). However, they quickly expose poor performance in multi-threaded programs. As you may know, synchronization requires locks which always take time to monitor, and that reduces the performance.
[I've also done a benchmark using Caliper; please hear me out on this]
A sample code has also been provided:
public class CollectionsThreadSafeTest {
public void testVector() {
long startTime = System.currentTimeMillis();
Vector<Integer> vector = new Vector<>();
for (int i = 0; i < 10_000_000; i++) {
vector.addElement(i);
}
long endTime = System.currentTimeMillis();
long totalTime = endTime - startTime;
System.out.println("Test Vector: " + totalTime + " ms");
}
public void testArrayList() {
long startTime = System.currentTimeMillis();
List<Integer> list = new ArrayList<>();
for (int i = 0; i < 10_000_000; i++) {
list.add(i);
}
long endTime = System.currentTimeMillis();
long totalTime = endTime - startTime;
System.out.println("Test ArrayList: " + totalTime + " ms");
}
public static void main(String[] args) {
CollectionsThreadSafeTest tester = new CollectionsThreadSafeTest();
tester.testVector();
tester.testArrayList();
}
}
The output they have provided for the above code is as follows:
Test Vector: 9266 ms
Test ArrayList: 4588 ms
But when I ran it in my machine, it gave me the following result:
Test Vector: 521 ms
Test ArrayList: 2273 ms
I found this to be quite odd. I thought doing a micro benchmark would be better. So, I wrote a benchmark for the above using caliper. The code is as follows:
public class CollectionsThreadSafeTest extends SimpleBenchmark {
public static final int ELEMENTS = 10_000_000;
public void timeVector(int reps) {
for (int i = 0; i < reps; i++) {
Vector<Integer> vector = new Vector<>();
for (int k = 0; k < ELEMENTS; k++) {
vector.addElement(k);
}
}
}
public void timeArrayList(int reps) {
for (int i = 0; i < reps; i++) {
List<Integer> list = new ArrayList<>();
for (int k = 0; k < ELEMENTS; k++) {
list.add(k);
}
}
}
public static void main(String[] args) {
String[] classesToTest = { CollectionsThreadSafeTest.class.getName() };
Runner.main(classesToTest);
}
}
But I got a similar result:
0% Scenario{vm=java, trial=0, benchmark=ArrayList} 111684174.60 ns; ?=18060504.25 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=Vector} 67701359.18 ns; ?=17924728.23 ns @ 10 trials
benchmark ms linear runtime
ArrayList 111.7 ==============================
Vector 67.7 ==================
vm: java
trial: 0
I'm kinda confused. What is happening here? Am I doing something wrong here (that would be really embarrassing) ?
If this is the expected behavior, then what is the explanation behind this?
Update #1
After reading @Kayaman's answer, I ran the caliper tests by changing the values of the initial capacities of both the Vector
and the ArrayList
. Following are the timings (in ms):
Initial Capacity Vector ArrayList
-------------------------------------
10_000_000 49.2 67.1
10_000_001 48.9 71.2
10_000_010 48.1 61.2
10_000_100 43.9 70.1
10_001_000 45.6 70.6
10_010_000 44.8 68.0
10_100_000 52.8 64.6
11_000_000 52.7 71.7
20_000_000 74.0 51.8
-------------------------------------
Thanks for all the inputs :)