8

I was trying to do an implementation of QuickSort (with median of 3 partitioning-element and insertion sort for small arrays) and compare it to an implementation of MergeSort, but even when QuickSort average time should be better than that of MergeSort, every time I execute it, it seems to be taking more time to sort an array (even one in random order). Any idea why can this be happening?

QuickSort:

public class Quick {

    private static final int M = 10;
    private static Random random = new Random();

    public void sort(int[] a) {
        sort(a, 0, a.length - 1);
        insertionSort(a, 0, a.length - 1);
    }

    private void sort(int[] a, int lo, int hi) {
        if (hi - lo <= 10) return;
        swap(a, lo, pivot(a, lo, hi));
        int lt = lo, gt = hi;
        int v = a[lo];
        int i = lo;
        while (i <= gt) {
            if      (a[i] < v) swap(a, lt++, i++);
            else if (a[i] > v) swap(a, i, gt--);
            else              i++;
        }

        // a[lo..lt-1] < v = a[lt..gt] < a[gt+1..hi].
        sort(a, lo, lt-1);
        sort(a, gt+1, hi);
    }

    private int pivot(int[] a, int lo, int hi) {
        int     r1 = random.nextInt(hi - lo) + lo,
                r2 = random.nextInt(hi - lo) + lo,
                r3 = random.nextInt(hi - lo) + lo;
        return median3(a, r1, r2, r3);
    }

    private void swap(int[] a, int i, int j) {
        if (i == j) return;
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    private static int median3(int[] a, int i, int j, int k) {
        return (a[i] < a[j] ?
                (a[j] < a[k] ? j : a[i] < a[k] ? k : i) :
                (a[k] < a[j] ? j : a[k] < a[i] ? k : i));
    }

    private void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo; i <= hi; i++)
            for (int j = i; j > lo && a[j] < a[j-1]; j--)
                swap(a, j, j-1);
    }
}

MergeSort:

public class Merge {

    public void sort(int[] a) {
        int[] aux = new int[a.length];
        sort(a, aux, 0, a.length - 1);
    }

    private void sort(int[] a, int[] aux, int lo, int hi) {
        if (hi <= lo) return;
        int mid = lo + (hi - lo) / 2;
        sort(a, aux, lo, mid);
        sort(a, aux, mid + 1, hi);
        merge(a, aux, lo, mid, hi);
    }

    private void merge(int[] a, int[] aux, int lo, int mid, int hi) {
        System.arraycopy(a, lo, aux, lo, hi + 1 - lo);
        // Merge
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)           a[k] = aux[j++];
            else if (j > hi)            a[k] = aux[i++];
            else if (aux[j] < aux[i])   a[k] = aux[j++];
            else                        a[k] = aux[i++];
        }
    }
}

For example, when I ran this algorithms for 10 random arrays of length 10^8, MergeSort took an average of 13385.8 ms to execute, while QuickSort took an average of 14096.7 ms.

To clarify, this is how I measured execution times

public static void main(String[] args) {
    int pow = (int) Math.pow(10, 8);
    int[] a, b;
    double[]    m1 = new double[10],
                m2 = m1.clone(),
    double st, et;
    Merge merge = new Merge();
    Quick quick = new Quick();
    for (int i = 0; i < 10; i++) {
        a = randomArray(pow);
        b = a.clone();
        st = currentTimeMillis();
        merge.sort(a);
        et = currentTimeMillis();
        m1[i] = et - st;
        st = currentTimeMillis();
        quick.sort(b);
        et = currentTimeMillis();
        m2[i] = et - st;
    }
    out.println("MergeSort: " + mean(m1));
    out.println("QuickSort: " + mean(m2));
}
private static int[] randomArray(int n) {
    r = new Random();
    int[] a = new int[n];
    for (int i = 0; i < a.length; i++) a[i] = r.nextInt();
    return a;
}
Roäc
  • 149
  • 19
  • 2
    why are you calling `insertionSort` after `sort` in the first `sort` ? and how did you measure performance ? – niceman Jul 06 '16 at 17:10
  • `when I ran this algorithms for 10 random arrays of length 10^8` Please post the code used to profile. – copeg Jul 06 '16 at 17:11
  • @copeg Thanks for pointing that, I forgot to iclude the measuring part. – Roäc Jul 06 '16 at 17:24
  • 3
    Without looking at the code: asymptotic worst-case time complexity is not the same as measured wall clock run time. And anyway, merge sort is O(n log n) expected time, just like quick sort in non-pathological cases. – Thomas Jul 06 '16 at 17:25
  • @niceman As far as I know, InsertionSort works better than Quicksort for small arrays, so for arrays that are smaller than M = 10 I stopped the recursion and called InsertionSort over the complete array, that now is "almost sorted", so it should be faster. – Roäc Jul 06 '16 at 20:24
  • 2
    I don't see `if a.length<10 insertionSort(a)` , what I see is you're sorting the array twice :) – niceman Jul 07 '16 at 10:01
  • @niceman if I did it that way, I would call insertionsort many times, instead of that, when there's an array small enough, I stop the recursión and then call insertionsort just one time. Calling insertionsort does improve the performance of quicksort, and it's a known improvement for the algorithm. – Roäc Jul 07 '16 at 17:41
  • I don't think the optimization of insertion sort works like that, see [this](http://stackoverflow.com/questions/12454866/how-to-optimize-quicksort), the answer says to "remember indexes" and "call insertionSort as batch", you're calling `insertionSort` and not saving any indexes – niceman Jul 07 '16 at 19:31
  • @niceman I may be wrong, but I think that saving indexes is for a more complicated (and maybe more efficient) implementation of `quickSort` that takes advantage of the tail recursion. Now, maybe I should mention that when I tested the program with and without `insertionSort`, the hybrid sorting method was actually faster, but for what I know, even the basic version of quicksort should be faster than mergesort (at least for a random array), and that doesn't show up in my results. Also, I took advantage of the fact that `insertionSort` is efficient for almost sorted arrays (close to _O(n)_) – Roäc Jul 07 '16 at 21:18
  • check [this](https://github.com/Pahan-Madusha/useful/tree/master/Merge%20Sort%20) and [this](https://github.com/Pahan-Madusha/useful/tree/master/Quick%20Sort) out some performance measuring codes for merge sort and quick sort – pahan Jul 17 '16 at 09:01
  • When you do `a = randomArray(pow);` and `b = a.clone();`, only an array of same size of `a` will be created for `b`. the elements of `a` won't be copied to `b`. Which means the array `b` will have it's default value which is 0. This means merge sort and quick sort are not getting the same arrays to sort. Use `copyOf` method instead of `clone` and try again. – pahan Jul 17 '16 at 09:12
  • If pow = (int) Math.pow(10, 6); I get following values: MergeSort: 286.9 QuickSort: 232.9 If pow = (int) Math.pow(10, 7); I get following values: MergeSort: 3622.2 QuickSort: 2414.7 Clearly mergesort is taking more time? " private static double mean(double[] m1) { double total = 0; for(double val: m1) { total += val; } return total/m1.length; }" is my mean function. – Madhu V Rao Jul 19 '16 at 00:27
  • @MadhuVRao Well, thanks, you actually gave me an idea, I tested the program in another computer (a slower one, btw), and it gave me totally different results, because quicksort seems to be more efficient than mergesort in this machine. I'm still intrigued in why that could happen though. – Roäc Jul 20 '16 at 16:21
  • Check [“How do I write a correct micro-benchmark in Java?”](http://stackoverflow.com/q/504103/2711488) – Holger Sep 26 '16 at 17:02

1 Answers1

1

After trying lots of things to find out where the issue was, I changed the function that created the random array, and it turns out QuickSort actually works faster now. I don't really know why, but it seems that the way I created the array adversely affected the performance of QuickSort. Now, what I did was that instead of generating an array of random integers, I generated an array form 1 to n and then shuffled it, as follows:

public static int[] permutation(int n) {
    int[] a = new int[n];

    for (int i = 0; i < n; ++i)  a[i] = i + 1;

    for (int i = 0; i < n; i++) {
        int r = i + rnd.nextInt(n - i),
            tmp = a[i];

        a[i] = a[r];
        a[r] = tmp;
    }

    return a;
}
Roäc
  • 149
  • 19