
I have always read that we should use Vector everywhere in Java and that there are no performance issues, which is certainly true. I'm writing a method to calculate the MSE (Mean Squared Error) and noticed that it was very slow; I was basically passing in a Vector of values. When I switched to an array, it was 10 times faster, but I don't understand why.

I have written a simple test:

public static void main(String[] args) throws IOException {

    Vector <Integer> testV = new Vector<Integer>();
    Integer[] testA = new Integer[1000000];
    for(int i=0;i<1000000;i++){
        testV.add(i);
        testA[i]=i;
    }

    Long startTime = System.currentTimeMillis();
    for(int i=0;i<500;i++){
        double testVal = testArray(testA, 0, 1000000);
    }
    System.out.println(String.format("Array total time %s ",System.currentTimeMillis() - startTime));

    startTime = System.currentTimeMillis();
    for(int i=0;i<500;i++){
        double testVal = testVector(testV, 0, 1000000);
    }
    System.out.println(String.format("Vector total time %s ",System.currentTimeMillis() - startTime));

}

Which calls the following methods:

public static double testVector(Vector<Integer> data, int start, int stop){
    double toto = 0.0;
    for(int i=start ; i<stop ; i++){
        toto += data.get(i);
    }

    return toto / data.size();
}

public static double testArray(Integer[] data, int start, int stop){
    double toto = 0.0;
    for(int i=start ; i<stop ; i++){
        toto += data[i];
    }

    return toto / data.length;
}

The array one is indeed 10 times faster. Here is the output:

Array total time 854
Vector total time 9840

Can somebody explain why? I have searched for quite a while but cannot figure it out. The Vector method appears to be making a local copy of the vector, but I always thought that objects were passed by reference in Java.

Pawel Kozela
  • How often did you run the comparison? Run each for-loop in your main method 10000 times and compare the medians. – duffy356 Aug 18 '14 at 09:17
  • Both arrays and Vectors are objects and are passed by reference, so that is not the problem here. – EpicPandaForce Aug 18 '14 at 09:18
  • Are you sure that it said to always use Vector and not to never use Vector? – JonK Aug 18 '14 at 09:19
  • I'm running the method 500 times each (and actually in my real code the MSE calculation is called much more often) and the difference is really big. – Pawel Kozela Aug 18 '14 at 09:20
  • Related reading: [Why is Java Vector class considered obsolete or deprecated](http://stackoverflow.com/questions/1386275/why-is-java-vector-class-considered-obsolete-or-deprecated) – JonK Aug 18 '14 at 09:20
  • Apart from the microbenchmark being flawed: A method signature should hardly ever contain such a concrete type. In the method signature, you could just use `List` (and with minor changes, it could be generalized to `Collection` or even `Iterable<? extends Number>`). In any case: For *real* "brute force number crunching performance", there's nothing faster than a simple, plain `int[]` array (albeit not very flexible or OOP, but *fast*). There are some potential issues with boxing/unboxing as well (beyond the scope of a comment, you'll find info on the web). – Marco13 Aug 18 '14 at 10:37
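
To illustrate the boxing point from the last comment: an `Integer[]` holds references to `Integer` objects, so `toto += data[i]` unboxes on every iteration, while an `int[]` stores the values directly. The following is only a sketch of the primitive variant; the class name `PrimitiveMean` and the method name `mean` are made up for this example, and no timing claim is attached to it.

```java
class PrimitiveMean {

    // Same arithmetic as testArray in the question, but over primitive ints,
    // so there are no Integer objects to dereference or unbox.
    static double mean(int[] data, int start, int stop) {
        double sum = 0.0;
        for (int i = start; i < stop; i++) {
            sum += data[i];
        }
        return sum / data.length;
    }

    public static void main(String[] args) {
        int[] values = new int[1_000_000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i;
        }
        System.out.println(mean(values, 0, values.length));
    }
}
```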

4 Answers

13

I have always read that we should use Vector everywhere in Java and that there are no performance issues - Wrong. Vector is thread-safe: its methods are synchronized, so every access or modification pays for extra locking logic, and that makes it slower. An array needs no such logic. Try ArrayList instead of Vector to increase the speed.
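
A minimal sketch of that swap, assuming the method from the question is retyped against the `List` interface (the names `ArrayListMean` and `mean` are hypothetical). The loop body is unchanged; only the container differs, and `ArrayList.get` is not synchronized.

```java
import java.util.ArrayList;
import java.util.List;

class ArrayListMean {

    // Same loop as testVector, but any List implementation can be passed in.
    static double mean(List<Integer> data, int start, int stop) {
        double sum = 0.0;
        for (int i = start; i < stop; i++) {
            sum += data.get(i); // no monitor is taken here when data is an ArrayList
        }
        return sum / data.size();
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            values.add(i);
        }
        System.out.println(mean(values, 0, values.size()));
    }
}
```

Because the parameter type is `List`, the existing `Vector` can still be passed to the same method for a side-by-side comparison.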

Note (based on your comment): I'm running the method 500 times each

This is not the right way to measure performance in Java. You should at least do a warm-up run first, so that JIT compilation does not distort the timings.
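
A rough sketch of what a warm-up could look like, assuming a hand-rolled timer is acceptable. The class `WarmupBenchmark`, the helper `averageMillis`, and the run counts are all made up for illustration; for results you can rely on, a dedicated harness such as JMH is the usual recommendation.

```java
import java.util.Vector;

class WarmupBenchmark {

    // Written to on every run so the JIT cannot discard the summing loop as dead code.
    static volatile double sink;

    static double sum(Vector<Integer> data) {
        double total = 0.0;
        for (int i = 0; i < data.size(); i++) {
            total += data.get(i);
        }
        return total;
    }

    // Runs the task untimed first, then returns the average milliseconds per timed run.
    static double averageMillis(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1e6 / measuredRuns;
    }

    public static void main(String[] args) {
        Vector<Integer> data = new Vector<>();
        for (int i = 0; i < 1_000_000; i++) {
            data.add(i);
        }
        // 50 warm-up runs and 200 timed runs are arbitrary values for this sketch.
        double ms = averageMillis(() -> sink = sum(data), 50, 200);
        System.out.println("Vector average ms per run: " + ms);
    }
}
```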

TheLostMind
4

Yes, that's the eternal problem of poor microbenchmarking. The Vector itself is not SO slow.

Here is a trick:
add -XX:BiasedLockingStartupDelay=0 and now testVector "magically" runs 5 times faster than before!

Next, wrap testVector into synchronized (data) - and now it is almost as fast as testArray.

You are basically measuring the performance of object monitors in HotSpot, not the data structures.
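
A sketch of the `synchronized (data)` step, assuming it is applied around the loop inside the method (the names `CoarseLockSum` and `testVectorLocked` are hypothetical). Vector's methods lock on the Vector instance itself, so holding that monitor for the whole loop turns each `get` into a reentrant acquisition of a lock the thread already owns. How much this helps depends on the JVM version and flags; biased locking, which the first trick relies on, was deprecated and disabled by default in JDK 15 (JEP 374).

```java
import java.util.Vector;

class CoarseLockSum {

    static double testVectorLocked(Vector<Integer> data, int start, int stop) {
        double toto = 0.0;
        synchronized (data) { // take the Vector's monitor once for the whole loop
            for (int i = start; i < stop; i++) {
                toto += data.get(i); // re-enters the monitor this thread already holds
            }
        }
        return toto / data.size();
    }

    public static void main(String[] args) {
        Vector<Integer> data = new Vector<>();
        for (int i = 0; i < 1_000_000; i++) {
            data.add(i);
        }
        System.out.println(testVectorLocked(data, 0, data.size()));
    }
}
```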

apangin
0

Simple thing. Vector is thread-safe, so it needs synchronization to add and access elements. Use ArrayList, which is also backed by an array but is not thread-safe and is therefore faster.

Note: If you know the number of elements in advance, pass it to the ArrayList constructor as the initial capacity. Without an initial capacity, an ArrayList resizes internally as it grows, and each resize copies the backing array.
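
For example, a pre-sized list could be built like this (a sketch only; `PresizedList` and `fill` are hypothetical names). The constructor argument is a capacity, not a size, so the list still starts empty.

```java
import java.util.ArrayList;
import java.util.List;

class PresizedList {

    // Reserves room for n elements up front, so adding n elements
    // never triggers a resize of the backing array.
    static List<Integer> fill(int n) {
        List<Integer> values = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            values.add(i);
        }
        return values;
    }

    public static void main(String[] args) {
        List<Integer> values = fill(1_000_000);
        System.out.println(values.size());
    }
}
```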

The performance of a plain array and an ArrayList without an initial capacity also differs drastically when the number of elements is large.

Mohan Raj B
0

Poor code: instead of list.get(), use an iterator on the list. The array will still be faster, though.
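
A sketch of the iterator-based version, using a for-each loop over `subList` to keep the `start`/`stop` arguments from the question (the names `IteratorMean` and `mean` are hypothetical). This mainly pays off for lists without constant-time `get`, such as `LinkedList`. As far as I know, Vector's iterator in OpenJDK still synchronizes on each step, so this alone should not be expected to remove the locking cost.

```java
import java.util.LinkedList;
import java.util.List;

class IteratorMean {

    // The for-each loop uses the list's iterator instead of indexed get() calls.
    static double mean(List<Integer> data, int start, int stop) {
        double sum = 0.0;
        for (int value : data.subList(start, stop)) {
            sum += value;
        }
        return sum / data.size();
    }

    public static void main(String[] args) {
        List<Integer> values = new LinkedList<>();
        for (int i = 0; i < 1_000_000; i++) {
            values.add(i);
        }
        System.out.println(mean(values, 0, values.size()));
    }
}
```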

Rudi Strydom