Sorting Test in Java -

Question

During the last months I created in Java some classes implementing data structures, more specifically Lists, Binary Search Trees and Binary Heaps. I decided to make a stress test by creating an Integer array of n values between 0 and 10*n, then sorting in various ways and measuring the times.

Initially it was just a curiosity. Obviously I expected my classes to costs far more than the normal Arrays.sort() method. However when I did the tests and compared my classes against each other I found some unexpected surprises.

This is the list of the tests, with details and comments.

1. A copy of the array is created, then the copy is sorted using the Arrays.sort() method, native of Java.
The time evaluated is that of the Arrays.sort() method, i.e. the creation of the copy of the array doesn't count.
It is the fastest way, as expected.

2. A list is created from the array, then the list is sorted using the Insertion Sort algorithm.
The time evaluated is that of the sorting method, i.e. the creation of the list from the array doesn't count.
Since the performance of Insertion Sort is not exactly great, this method costs around 50 times the array method.

3. A Binary Search Trees (BST from now on) is created from the array repeating the add() method.
The BST is not a balanced tree like the AVL or the Red-Black, merely a normal BST like found on Wikipedia: every node has links to three other nodes (parent, leftChild and rightChild), incapsulates a value, etc...
This method costs arount 500 times the list method, i.e. 25 000 times the array method.

4-5. Two Binary Heap (BH_1 and BH_2 from now on) are created from the array repeating the add() method, then they are converted into two (sorted) array repeating the extractMin() method. Both BH belongs to the same class and stores the value in a Vector<Integer>. Both BH costs around 2 times the BST and 50 000 times the array method. However there is a twist.
BH_2 creates the array using the method convertHeapToArray() of the Heap<Integer> interface. convertHeapToArray() calls the method extractMin() for n times, while extractMin() in turn calls the method heapify() one time.
This doesn't happen in BH_1, which uses the method convertHeapToArray_1(). Instead of calling extractMin(), my "new" method directly executes the code of extractMin() - and when extractMin() calls the method heapify(), BH_1 instead executed its code. In short, a copy-paste that avoids a few calls.
In theory BH_1 should always costs less than BH_2: same input data, same code, less method calling. However this is true only 73% of the times!

My questions are the following:

1. Why the Binary Search Trees sorting method (computational complexity n log(n), expected to be balanced if created by add() only) costs 500 times an Insertion Sort (computational complexity n², is preferable if n is less than 23)

2. Why the Binary Heaps (same computational complexity of Binary Search Trees, specifically designed to sort quickly) costs 2 times a Binary Search Trees?

3. And, more baffling than ever, why making less calls is more costly than making more calls in one case out of 4?!

Code for convertHeapToArray():

    public default T[] convertHeapToArray(){
        T[] output = AArrays.createTarray(length(), min());

        for(int i=0 ; i<length() ; i++)
            output[i] = this.extractMin();
        return output;
    }
public T extractMin() {
    T min = storer.get(indexOfMinimum);
    AVector.swap(storer, indexOfMinimum, length);
    AVector.swap(storer, indexOfMinimum, length);
    length--;
    heapify(indexOfMinimum);
    return min;
}

Report (5000 tests, 100 random arrays each):

The array use a Comparator<Integer>.
 A Comparator<Integer> executes a confront in 66083 nanoseconds.
The list use a Comparator<NodeList<Integer>>.
 A Comparator<NodeList<Integer>> executes a confront in 85973 nanoseconds.
The BST, BH_1 and BH_2 use a Relationship<Integer>.
 A Relationship<Integer> executes a confront in 107145 nanoseconds.

The total time for the array sorting is 239 717 392 nanoseconds.
The total time for the list sorting is 984 872 184 nanoseconds.
The total time for the BST sorting is 533 338 808 811 nanoseconds.
The total time for the BH_1 sorting is 1 055 836 440 689 nanoseconds.
The total time for the BH_2 sorting is 1 198 365 741 676 nanoseconds.

The medium time for the array sorting is 47 943 nanoseconds.
The medium time for the list sorting is 196 974 nanoseconds.
The medium time for the BST  sorting is 106 667 761 nanoseconds.
The medium time for the BH_1 sorting is 211 167 288 nanoseconds.
The medium time for the BH_2 sorting is 239 673 148 nanoseconds.
The first method for the Binary Heap has been faster than the second for 3 634 times out of 5 000.

EDIT:

Re-reading what I wrote, I realized I wasn't exactly clear in my initial question. Allow me to rectify my mistake.

I am aware that there is a difference between the actual time executed by the program and the computational complexity. I never had any doubt on the computational complexity of the methods used: the data structures are simple, their code is pratically taken from Wikipedia . I was certain that the written code was not well performing. It wasn't written with performance to begin with.

My test is on the actual time of execution. The Arrays.sort() method was included as a reference parameter. Since I didn't wrote the code with performance in mind, I suspected before the test that the results would be more costly than expected. However my prediction on what exactly was costing more than it should were thrown into the garbage bin by the results.

For example I believed that less method calls would results in a lesser cost, but I was wrong. The whole difference between the two Binary Heaps was that the first executed the same code of the second, just with less method calls (minimal code for the BinaryHeap included below). I expected that the first Binary Heap would always have a lesser cost: this was proven wrong, and I don't know why.

Another thing that baffled me was the fact that the Binary Heap costed more than the Binary Search Tree. The array used is created randomly. While with this starting condition the height of a Binary Search Tree is expected to be around log(n) (chapter 12, paragraph 4 of Introduction to Algorithm, by Cormen, Leiserson, Rivest, Stein), I never heard of an algorithm using Binary Search Trees to sort arrays: I included it in my test only as a curiosity. And yet for the small number of elements (initially 100) the Binary Search Trees was found consistently quicker than the Binary Heap.

Why this happens? When a Binary Heap starts being more convenient? Was including the conversion Heap»Array a mistake?

A third unexpected result is the performance of the double-linked List. As with the Binary Search Trees, I included in the test as a curiosity. The sorting algorithm is a very basic Insertion Sort, which is faster only for a number of element far less than what I used, and the code of the entire class is most definitively not created for quickness. I supposed it would have revealed as the most slow of my classes, instead it was the quickest! And I still don't know why.

This is why I asked advice. I didn't included the code because between all the classes it is around the ~3000 lines of code, most of which is unused by the test. I would not have read 3k lines of code, nor I expected the random stackoverflow-er to do so! However I included the code for the BinaryHeap<T>, which is ~300 lines.

Code for BinaryHeap<T> :

public class BinaryHeap<T> implements Heap<T> {
    private static final int indexOfMinimum = 1;
    private Vector<T> storer = new Vector<T>(indexOfMinimum + 10);
    private Relationship<T> relationship = null;
    // The class Relationship<T> has a single method whose signature is 
    // public boolean confront(T a, T b);
    // The reason I included it instead of a Comparator is that, in my code, the value null represents -∞
    // 
    // The following code works.
    // 
    // public class Relationship<T>{
    //  Comparator<T> c;
    //  public Relationship<T>(Comparator<T> c){    this.c = c; }
    //  public boolean confront(T a, T b){  return a==null ? true : c.compare(a,b); }
    // }

    // TODO | Constructors

    public BinaryHeap(Relationship<T> relationship){
        storer.add(null);
        this.relationship = relationship;
    }

    // TODO | Methods of the interface Heap

    public void add(T newData) {
        storer.add(null);
        updateValue(storer.size() - 1, newData);
    }
    public T extractMin() {
        T min = storer.get(indexOfMinimum);
        heapify(indexOfMinimum);
        return min;
    }
    public void updateValue(int indexOfToBeUpgraded, T newValue) {
        int i = indexOfToBeUpgraded;
        T support;
        if( i >= indexOfMinimum )
            {
            storer.set(i, newValue);
            while( i > indexOfMinimum && ! relationship.confront(storer.get(i/2), storer.get(i)) )
                {
                support = storer.get(i);
                storer.set(i, storer.get(i/2));
                storer.set(i, support);
                i = i/2;
                }
            }
    }
    private void heapify(int i){
        int j = i;
        int maximumIndexOfArray = storer.size();
        while( j <= maximumIndexOfArray/2 )         // i.e. if H[j] isn't a leaf of the Tree
            {
            int indexOfLocalMinimum =   relationship.confront(storer.get(j),    storer.get(2*j))                        
                        ?   (   relationship.confront(storer.get(  j), storer.get(2*j+1))   ?     j : 2*j+1 )
                        :   (   relationship.confront(storer.get(2*j), storer.get(2*j+1))   ?   2*j : 2*j+1 )   ;
            if( j != indexOfLocalMinimum )
                {
                AVector.swap(storer, j, indexOfLocalMinimum);
                j = indexOfLocalMinimum;
                }
            else j = maximumIndexOfArray;
            }
    }
    public default T[] convertHeapToArray(){
    T[] output = (T[]) Array.newInstance(min().getClass(), length());
        for(int i=0 ; i<length() ; i++)
            output[i] = this.extractMin();
        return output;
    }

    // TODO | Second version of the method convertHeapToArray, found out not to be improved

    public T[] convertHeapToArray_1(){
        int length = length(), j;
    T[] output = (T[]) Array.newInstance(min().getClass(), length());
        for(int i=0 ; i<length ; i++)
            {
    //      output[i] = this.extractMin();
                output[i] = storer.get(indexOfMinimum);
        //      heapify(indexOfMinimum);
                    j = indexOfMinimum;
                    int maximumIndexOfArray = storer.size();
                    while( j <= maximumIndexOfArray/2 )         // i.e. if H[j] isn't a leaf of the Tree
                        {
                        int indexOfLocalMinimum =   relationship.confront(storer.get(j),    storer.get(2*j))                        
                                    ?   (   relationship.confront(storer.get(  j), storer.get(2*j+1))   ?     j : 2*j+1 )
                                    :   (   relationship.confront(storer.get(2*j), storer.get(2*j+1))   ?   2*j : 2*j+1 )   ;
                        if( j != indexOfLocalMinimum )
                            {
                            AVector.swap(storer, j, indexOfLocalMinimum);
                            j = indexOfLocalMinimum;
                            }
                        else j = maximumIndexOfArray;
                        }
                }
        return output;
    }

This is quite unclear - you haven't shown us any code, so we don't really know what you're measuring/comparing. (Note also http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java). — Oliver Charlesworth, Apr 15 '17 at 18:44
(1) You'll need to show us a complete (and short) program. (2) An unbalanced BST is inefficient, that's why you balance it. No, it is not "expected to be balanced if created by add() only". The sort will take O(n^2). (3) The binary heap taking long could very well be related to an inefficiency in your implementation. (4) Do try to stick to one question. (5) Keep it short and to the point, including all the backstory will make people skip or downvote the question. The first way isn't even included in any question so there's no reason to include it and have it take up our reading time. — Bernhard Barker, Apr 15 '17 at 19:02
If you don't post your code, we don't even know whether you have faithfully reproduced the algorithms that you are allegedly comparing. — Mike Nakis, Apr 15 '17 at 19:36
Also, it's going to depend on the size of the input and how you're generating it. Saying it took 50 times longer doesn't really mean much without context. — Bernhard Barker, Apr 15 '17 at 19:41
For what value/s of `n` have you tested? That is, how long an array? — Ole V.V., Apr 15 '17 at 20:03
I apologise for what I initially wrote: re-reading it I realized it wasn't clear, at all.
I have edited my first post including: an answer to all previous comments, a reply form @MikeNakis, a _far_ better explanation of my question. Again, my apologise. — Asghabard, Apr 16 '17 at 19:47

Mike Nakis · Answer 1 · 2017-04-21T13:05:48.847

Computational complexity does not measure nanoseconds, or milliseconds, or anything of that sort. It measures how the running time of an algorithm varies as a function of the size of its input, and it says nothing about the efficiency of the code that me or you might write to implement the algorithm.

Now, when you write an actual implementation of an algorithm you are introducing overhead which depends on a number of factors that computational complexity theory just does not care about.

The performance of your code depends on your language of choice, your execution environment, programming choices that you make, how experienced you are at writing performing code and avoiding common performance pitfalls, etc.

Furthermore, when you test the performance of your code, a lot depends on how well you know how to do that in your execution environment. With bytecode-translated, just-in-time-compiled, garbage-collected languages like java, on systems with hundreds of simultaneously running processes, it is quite quirky.

So, the answer to your question is that your comparisons are unequal because a) you have not written well performing code, b) certain things cost a lot more than you might expect in java, and c) the system you are trying to benchmark is more chaotic and less closed than you think.

In order to do a test which is true to computational complexity theory you would have to measure the performance of your code theoretically. This means that you would need to count not nanoseconds, but theoretical operations. Those would be the number of node accesses and the number of node creations (in trees.)

However, computational complexity theory still holds, so even if you insist on counting time, if you performance-test your algorithms for several orders of magnitude larger Ns, (which might mean allowing them to run for years, not hundreds of nanoseconds,) you should eventually begin to see the differences predicted by theory, because on the long run, log N beats N, which in turn beats N log N, which in turn beats N², despite the inequalities introduced by badly performing code.

Of course there is always a chance that the algorithms and data structures that you are performance-testing have completely different computational complexity characteristics than what what we think they have, due to implementation tricks. For example, the linked list that you are performance-testing could internally employ a hash-map as an aid to boost its performance, who knows. But we would only be able to judge that if you posted your code so that we can see exactly what is going on.

Sorting Test in Java -

1 Answers1