
In my code, the method calls itself (recursion). We know that a deep chain of method calls can lead to a stack overflow. So, will a stack overflow happen when the array is very large?

My QuickSort code (an inner class of a sort class):

private static class QuickSortor extends Sortor {

    protected <E> void sort(E[] a, boolean isAsc) {
        quickSort(a, 0, a.length - 1, isAsc);
    }

    private <E> void quickSort(E[] a, int left, int right, boolean isAsc) {
        if (left >= right)
            return;
        int middle = left;
        Comparable<E> cmp = (Comparable<E>) a[left];
        for (int i = left + 1; i <= right; i++) {
            int result = cmp.compareTo(a[i]);
            if (!isAsc)
                result = -result;
            if (result >= 0)
                swap(a, ++middle, i);
        }
        swap(a, left, middle);
        quickSort(a, left, middle - 1, isAsc);
        quickSort(a, middle + 1, right, isAsc);
    }
}
  • Have you tried analyzing the stack depth complexity? You will find out the answer by doing that. – nhahtdh Jun 20 '12 at 03:45
  • It's almost certainly infinite recursion. To cause a stack overflow in a correct function would require a _massive_ array, so that sounds much less likely. Check for off-by-one errors, as this is probably the cause. – Oleksi Jun 20 '12 at 03:48
  • Consider the case where the pivot is (unluckily) always chosen to be the smallest or largest element. Then you'll have a depth of O(n) calls... – Mysticial Jun 20 '12 at 03:52

2 Answers


How deep you recurse, and how long quicksort takes, depends on the relative order of the data points as well as on how many there are. If the pivot point always ends up being the largest or smallest member of the data you are recursing on, the code above will recurse as deep as the data size.

With quicksort, you can ensure that the recursion depth is small, even if it still ends up taking a long time. This is discussed very briefly in http://en.wikipedia.org/wiki/Quicksort as follows:

To make sure at most O(log N) space is used, recurse first into the smaller half of the array, and use a tail call to recurse into the other.

Very briefly, this means you first rewrite

f(...)
{
  ...
  f(...)
  f(...)
}

as

f(...)
{
  while (...)
  {
    ...
    f(...)
    ...
  }
}

(using the while loop to modify the arguments to f so that the second recursive call corresponds to the second pass round the while loop).

Then you look at how you have split up the array for the two recursive calls and you make sure that the second recursive call - which gets turned into the while loop - is for the larger half of the array. This means that the remaining recursive call is always for no more than half the array, which means that you never go deeper than log base two of the array size.
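As a concrete sketch (my own illustration, not the original poster's code, with `BoundedDepthQuickSort` as a made-up class name): the same Lomuto-style partition as the question, but recursing only into the smaller half and looping on the larger one, so the stack depth stays O(log n) even on already-sorted input.

```java
public class BoundedDepthQuickSort {
    // Recurse into the smaller partition; handle the larger one by updating
    // left/right and going around the while loop again. The recursive call is
    // always on at most half the range, bounding the depth at O(log n).
    static <E extends Comparable<E>> void quickSort(E[] a, int left, int right) {
        while (left < right) {
            int middle = left;                 // leftmost element as pivot, as in the question
            E pivot = a[left];
            for (int i = left + 1; i <= right; i++)
                if (pivot.compareTo(a[i]) >= 0)
                    swap(a, ++middle, i);
            swap(a, left, middle);
            if (middle - left < right - middle) {
                quickSort(a, left, middle - 1); // smaller half: recurse
                left = middle + 1;              // larger half: loop
            } else {
                quickSort(a, middle + 1, right);
                right = middle - 1;
            }
        }
    }

    static <E> void swap(E[] a, int i, int j) {
        E t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        // Reverse-sorted input: worst case for a leftmost pivot,
        // yet the recursion depth stays tiny here.
        Integer[] data = new Integer[15000];
        for (int i = 0; i < data.length; i++) data[i] = data.length - i;
        quickSort(data, 0, data.length - 1);
        System.out.println(data[0] + " " + data[data.length - 1]); // prints "1 15000"
    }
}
```

Note that this still takes O(n^2) time on such input; the loop only caps the stack usage, not the running time.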

mcdowella

  • Good optimization advice! This would be even easier if Java supported tail call optimization, in which case you would just have to put your two recursive calls in the right order (larger last), but it looks like it [may not happen any time soon](http://stackoverflow.com/questions/105834/does-the-jvm-prevent-tail-call-optimizations). – Emil Vikström Jun 20 '12 at 05:18

Your algorithm as written always picks the leftmost element as the pivot. If executed on an already-sorted array, this will exhibit the pathological O(n^2) behavior of QuickSort, which will cause a stack overflow with quite a small array.

You can try it out by invoking your sort on an array of Integers containing the numbers from 0 to something like 15000 in order. I get a StackOverflowError when I try it. Executing it on an array of 15000 random Integers does not cause a stack overflow.
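To see why the sorted case blows the stack, here is a sketch (my own instrumentation, with `PivotDepthDemo` as a made-up class name) that counts the recursion depth of a leftmost-pivot quicksort, mirroring the question's partition scheme, on sorted versus random input:

```java
import java.util.Random;

public class PivotDepthDemo {
    static int maxDepth;

    // Same leftmost-pivot, Lomuto-style partition as the question, on int[]
    // for brevity, with a depth counter threaded through the recursion.
    static void quickSort(int[] a, int left, int right, int depth) {
        maxDepth = Math.max(maxDepth, depth);
        if (left >= right) return;
        int middle = left;                    // leftmost element as pivot
        for (int i = left + 1; i <= right; i++)
            if (a[left] >= a[i]) swap(a, ++middle, i);
        swap(a, left, middle);
        quickSort(a, left, middle - 1, depth + 1);
        quickSort(a, middle + 1, right, depth + 1);
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    public static void main(String[] args) {
        int n = 5000;                         // small enough to measure without overflowing
        int[] sorted = new int[n];
        for (int i = 0; i < n; i++) sorted[i] = i;
        quickSort(sorted, 0, n - 1, 0);
        System.out.println("sorted input, max depth: " + maxDepth); // prints 4999, i.e. n - 1

        maxDepth = 0;
        int[] random = new int[n];
        Random rng = new Random(42);
        for (int i = 0; i < n; i++) random[i] = rng.nextInt();
        quickSort(random, 0, n - 1, 0);
        System.out.println("random input, max depth: " + maxDepth); // a small multiple of log2(n)
    }
}
```

On sorted input every partition is maximally unbalanced, so the depth grows linearly with n; at around 15000 elements, 15000 stack frames exceed a typical default JVM thread stack.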

jimr
  • A common way to decrease the risk of this happening is to read the first, middle and last elements of the array and pick the best pivot out of the three. – Emil Vikström Jun 20 '12 at 05:06
  • You are wrong, O(n²) is the time complexity (which we don't care about here) and not the stack space complexity (which is O(n) in the worst case). – Thomash Jun 20 '12 at 09:16
  • I didn't say that it was the stack space complexity, just that the pathological case can make this implementation have a stack overflow with a small-ish array. – jimr Jun 20 '12 at 19:52
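The median-of-three mitigation mentioned in the comments can be sketched as follows (my own illustration; `medianOfThreeIndex` is a hypothetical helper, not part of the original code). It picks the median of the first, middle, and last elements as the pivot, which avoids the worst case on already-sorted input, though adversarially crafted inputs can still reach O(n) depth:

```java
public class MedianOfThree {
    // Return the index of the median of a[left], a[mid], and a[right],
    // where mid is the middle of the range. For sorted or reverse-sorted
    // input this yields the true median of the range, giving balanced splits.
    static int medianOfThreeIndex(int[] a, int left, int right) {
        int mid = left + (right - left) / 2;
        if (a[left] > a[mid]) {
            if (a[mid] > a[right]) return mid;              // left > mid > right
            return (a[left] > a[right]) ? right : left;     // mid is smallest
        } else {
            if (a[left] > a[right]) return left;            // right < left <= mid
            return (a[mid] > a[right]) ? right : mid;       // left is smallest
        }
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7};
        // For sorted input the median of a[0]=1, a[3]=4, a[6]=7 is the middle element.
        System.out.println(medianOfThreeIndex(sorted, 0, 6)); // prints 3
    }
}
```

The chosen element would then be swapped to the `left` position before the partition pass, so the rest of the question's code stays unchanged.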