5

So I was assigned to implement a quick sort algorithm and compare the running times for arrays with size 500, 3500, and 80000. The arrays are populated with random numbers:

(int)Math.random()*1000000

My quicksort algorithm works for the arrays with size 500 and 3500, but I always get a stackoverflow error when I try to sort the third array with size 80000. My other sorting algorithms handle these arrays just fine.

My quick sort method:

public static void quickSort(int[] a, int p, int r)
{
    if(p<r)
    {
        int q=partition(a,p,r);
        quickSort(a,p,q);
        quickSort(a,q+1,r);
    }
}

My partition method:

private static int partition(int[] a, int p, int r) {

    int x = a[p];
    int i = p;
    int j = r;

    while (true) {
        do {
            i++;
        } while (i < r && a[i] < x);
        do {
            j--;
        } while (j > p && a[j] > x);

        if (i < j) {
            int tmp = a[i];
            a[i++] = a[j];
            a[j--] = tmp;
        } else {
            return j;
        }
    }
}

I read that I could simply change my stack size in the VM options (not sure how to do that) but that is just ignoring the problem in my algorithm. What's causing the error? Thanks!

My driver class:

public class Driver {

    public static void main(String[] args) {

        int[] array1 = new int[500];
        int[] array2 = new int[3500];
        int[] array3 = new int[80000];

        for(int i=0; i<array1.length; i++) {
            array1[i]=(int)(Math.random()*100000);
        }

        for(int i=0; i<array2.length; i++) {
            array2[i]=(int)(Math.random()*100000);
        }

        for(int i=0; i<array3.length; i++) {
            array3[i]=(int)(Math.random()*100000);
        }

        //~~~~~~~~~~~INSERTION~~~~~~~~~~~~~~~//

        System.out.println("INSERTION SORT:\n_______________");
        System.out.println("500 Elements: "+SortTimes.runTime(SortTimes.INSERTION,array1)+" ms");
        System.out.println("3500 Elements: "+SortTimes.runTime(SortTimes.INSERTION,array2)+" ms");
        System.out.println("80000 Elements: "+SortTimes.runTime(SortTimes.INSERTION,array3)+" ms");

        //~~~~~~~~~~~BUBBLE~~~~~~~~~~~~~~~//

        System.out.println("\n\nBUBBLE SORT:\n_______________");
        System.out.println("500 Elements: "+SortTimes.runTime(SortTimes.BUBBLE,array1)+" ms");
        System.out.println("3500 Elements: "+SortTimes.runTime(SortTimes.BUBBLE,array2)+" ms");
        System.out.println("80000 Elements: "+SortTimes.runTime(SortTimes.BUBBLE,array3)+" ms");

        //~~~~~~~~~~~MERGE~~~~~~~~~~~~~~~//

        System.out.println("\n\nMERGE SORT:\n_______________");
        System.out.println("500 Elements: "+SortTimes.runTime(SortTimes.MERGE,array1)+" ms");
        System.out.println("3500 Elements: "+SortTimes.runTime(SortTimes.MERGE,array2)+" ms");
        System.out.println("80000 Elements: "+SortTimes.runTime(SortTimes.MERGE,array3)+" ms");

        //~~~~~~~~~~~QUICK~~~~~~~~~~~~~~~//

        System.out.println("\n\nQUICK SORT:\n_______________");
        System.out.println("500 Elements: "+SortTimes.runTime(SortTimes.QUICK,array1)+" ms");
        System.out.println("3500 Elements: "+SortTimes.runTime(SortTimes.QUICK,array2)+" ms");
        System.out.println("80000 Elements: "+SortTimes.runTime(SortTimes.QUICK,array3)+" ms");
    }
}

Here is my SortTimes class:

public class SortTimes {

    public final static int MERGE = 1;
    public final static int QUICK = 2;
    public final static int BUBBLE = 3;
    public final static int INSERTION = 4;

    public static double runTime(int sortMethod, int[] array) {

        double startTime;
        double endTime;

        switch(sortMethod) {
            case MERGE:
                startTime = System.currentTimeMillis();
                lab12.mergeSort(array);
                endTime = System.currentTimeMillis();
                break;

            case QUICK:
                startTime = System.currentTimeMillis();
                lab12.quickSort(array, 0, array.length-1);
                endTime = System.currentTimeMillis();
                break;

            case BUBBLE:
                startTime = System.currentTimeMillis();
                lab12.bubbleSort(array);
                endTime = System.currentTimeMillis();
                break;

            case INSERTION:
                startTime = System.currentTimeMillis();
                lab12.insertionSort(array);
                endTime = System.currentTimeMillis();
                break;

            default:
                startTime = -1;
                endTime = 0;
                break;
        }

        return endTime-startTime;
    }
}
greybeard
  • 2,249
  • 8
  • 30
  • 66
Sam Nayerman
  • 135
  • 1
  • 1
  • 8
  • 4
    The recursion depth on large arrays is causing the stack overflow. – Elliott Frisch Nov 24 '15 at 01:51
  • are you running it from eclipse ? http://stackoverflow.com/questions/2127217/java-stack-overflow-error-how-to-increase-the-stack-size-in-eclipse – Quantico Nov 24 '15 at 01:52
  • @Quantico - I'm using IntelliJ. – Sam Nayerman Nov 24 '15 at 01:53
  • @ElliottFrisch I figured as much, but how do I fix this? – Sam Nayerman Nov 24 '15 at 01:53
  • http://stackoverflow.com/questions/13578062/how-to-increase-ide-memory-limit-in-intellij-idea-on-mac – Quantico Nov 24 '15 at 01:54
  • @SamNayerman One common solution is to bail out of recursion when the number of elements is small, and switch to a different sort (like [*insertion sort*](https://en.wikipedia.org/wiki/Insertion_sort), the linked Wikipedia says in part, *A useful optimization in practice for these algorithms is to use insertion sort for sorting small sublists, where insertion sort outperforms these more complex algorithms.*). – Elliott Frisch Nov 24 '15 at 01:59
  • @ElliottFrisch I wouldn't use recursive sorting methods for small arrays unless I had to, but unfortunately that's the way my assignment is designed. – Sam Nayerman Nov 24 '15 at 02:03
  • @SamNayerman : This may not be an issue of the implementation of the algorithm. Just simply increase the amount of memory allocated for your program by the IDE and check. – tharindu_DG Nov 24 '15 at 02:13
  • @tharindu_DG I would but I don't know how! The link that was posted above is for OSX and I don't really know how to solve this issue – Sam Nayerman Nov 24 '15 at 02:15
  • @SamNayerman : check the answer for this question : http://stackoverflow.com/questions/18631038/setting-heap-size-in-intellij-idea-correctly -Xmx 1024m means 1 GB is allocated for the program. – tharindu_DG Nov 24 '15 at 02:19
  • It's possible to drastically reduce the recursion depth of quicksort, from O(n) to O(log n). To do so, recurse on the smaller half, and tail-recurse on the larger half. – Nayuki Nov 24 '15 at 02:22
  • @NayukiMinase - Java doesn't support tail recursion. – DaoWen Nov 24 '15 at 02:41
  • You'd implement it manually in a loop. – Nayuki Nov 24 '15 at 04:17
  • @SamNayerman - I added an answer that might explain the problem. – rcgldr Nov 24 '15 at 06:48

6 Answers6

6

Here is your quicksort:

public static void quickSort(int[] a, int p, int r)
{
    if(p<r)
    {
        int q=partition(a,p,r);
        quickSort(a,p,q);
        quickSort(a,q+1,r);
    }
}

It works, but in the worst case it uses O(r-p) stack space. That is too much for real implementations. The fix is easy, though -- you recurse on the smaller partition, and then loop for the larger partition. Recursing on the smaller partition means you use only O(log(r-p)) stack space no matter what:

public static void quickSort(int[] a, int p, int r)
{
    while(p<r)
    {
        int q=partition(a,p,r);
        if (q-p <= r-(q+1))
        {
            quickSort(a,p,q);
            p=q+1;
        }
        else
        {
            quickSort(a,q+1,r);
            r=q;
        }
    }
}

EDIT: so, this is the way that real quicksort implementations ensure that there are no stack overflows in the worst case...

BUT the worst case never happens when you initialize an array with random numbers.

You said that you initialized the array with (int)Math.random()*1000000. Check the precedence tables! The cast happens before the multiply, so this is always 0, which is why you are getting worst case behaviour. You want (int)(Math.random()*1000000).

EDIT: Your partition function is also broken. It will always leave a[p] at position p, even if it is the largest element in the array

Matt Timmermans
  • 53,709
  • 3
  • 46
  • 87
  • This just causes an infinite loop? – Sam Nayerman Nov 24 '15 at 03:04
  • No, every iteration increases p or reduces r, so it terminates. Notice: p and r on the next iteration are exactly the values you would have passed to the other recursive call. – Matt Timmermans Nov 24 '15 at 03:16
  • 1
    Strange, it's getting stuck in a infinite loop for me. Added parenthesis to the math.random()'s and still same result :/. I appreciate your help so far though! – Sam Nayerman Nov 24 '15 at 03:18
  • I think your partition() is broken. I don't have time to check it now, but I'll come back to it later. – Matt Timmermans Nov 24 '15 at 04:27
  • This is a wonderful response; really well done, both on the optimization and spotting why they're getting the worst case every time. – Dean J Nov 24 '15 at 05:50
3

You're reporting a stack overflow with an 80 thousand element array. Your code sorts an 80 million element array on my laptop with no problems in about 10 seconds. I don't see any stack overflow errors...

If you have a random input, you should expect the maximum recursion depth to be somewhere in the ballpark of log2(n), which is less than 30 for n=80,000,000. The quicksort wikipedia article has a more detailed analysis. Basically, unless you hit a really pathological case (where all your pivots are terrible), you shouldn't expect to see so much recursion that the stack overflows.

However, I did have to fix a couple of logic errors in the code in order to actually get a valid sort (I wasn't getting a fully-sorted result).


Fix your random number generation

(int)Math.random()*1000000 will always return zero. You need to add another set of parentheses to truncate after the multiplication: (int)(Math.random()*1000000).

Fix your partition logic

Your partition method almost matches the logic for the Hoare partitioning scheme. However, you seem to have a few off-by-one errors. If you compare your code with the code from wikipedia, you will see a few differences.

  1. You set i=p and j=r, but it should be i=p-1 and j=r+1.
  2. You should remove the increment and decrement in your swapping logic, so a[i++] should just be a[i], and a[j--] should just be a[j].

Here's the code I used to test this:

public class QSort {

    private static int partition(int[] a, int p, int r) {

        int x = a[p];
        int i = p-1;
        int j = r+1;

        while (true) {
            while (++i < r && a[i] < x);
            while (--j > p && a[j] > x);

            if (i < j) {
                int tmp = a[i];
                a[i] = a[j];
                a[j] = tmp;
            } else {
                return j;
            }
        }
    }

    public static void quickSort(int[] a, int p, int r) {
        if(p<r) {
            int q=partition(a,p,r);
            quickSort(a,p,q);
            quickSort(a,q+1,r);
        }
    }

    public static void main(String args[]) {
        int n = Integer.valueOf(args[0]);
        int[] xs = new int[n];
        for (int i=0; i<n; i++) {
            xs[i] = (int)(Math.random()*1000000);
        }
        quickSort(xs, 0, xs.length-1);
        for (int i=0; i<n-1; i++) {
            if (xs[i] > xs[i+1]) {
                System.out.println("ERROR");
                System.exit(-1);
            }
        }
        System.out.println("SORTED");
    }

}
DaoWen
  • 32,589
  • 6
  • 74
  • 101
  • If you use | x = a[(p+r)/2]; | then the two whiles can be: | while (a[++i] < x); | while(a[--j] >x); | . – rcgldr Nov 24 '15 at 03:43
  • So I changed what you said, and the method is working fine on its own, but it keeps bugging out and throwing stackoverflow errors on me when I test it with my driver which works by sorting the array through a timer class I created which reports the run time in milliseconds. I'll edit in the timer class and driver class into my main post. Thanks a lot for your time, regardless!! – Sam Nayerman Nov 24 '15 at 03:48
  • @SamNayerman - I ran the code you posted, and it still works fine for me (even with the driver code). However, I commented out all the other sort types (since I only have the quicksort code). Are you sure the error is coming from quicksort? Even manually setting the JVM stack size to 160k (that's the smallest it will accept on my machine), I can still sort a 80-million-element array with no error. – DaoWen Nov 24 '15 at 05:14
  • @DaoWen it's really strange because it works for me as well when I comment out all of the other sort types, and only throws errors when its run with the other sorts. Im truly stumped – Sam Nayerman Nov 24 '15 at 06:11
2

If quickSort() is still using pivot x = a[p], and not x = a[(p+r)/2], then if the data is already sorted, quickSort() will probably get stack overflow. Any chance that you're running quickSort() on data that's already been sorted by a prior sort?

rcgldr
  • 27,407
  • 3
  • 36
  • 61
  • I'm ashamed to admit that that is the case... I neglected to realize that my driver never randomizes the arrays after their initial sort. Thanks a lot! – Sam Nayerman Nov 30 '15 at 06:57
0

Since this program is recursive and java is not memory efficient in recursions that goes a bit deep, try increasing the amount of memory allocated for your program by the IDE.

Since you're using Intellij, try to increase memory as follows.

enter image description here

More Info : https://www.jetbrains.com/idea/help/run-debug-configuration-application.html

tharindu_DG
  • 8,900
  • 6
  • 52
  • 64
  • Increasing heap size won't do anything to prevent a stack overflow -- it's a different part of memory. Also, I don't think it's good advice to suggest increasing your heap limits to avoid learning how to write a proper quicksort. – Matt Timmermans Nov 24 '15 at 02:57
  • @MattTimmermans: I agree with you, but every optimized algorithm has a minimum memory requirement. We should fulfill that memory requirement to run that algorithm with out failure. In my experience when testing algorithms with large data sets, default IDE memory parameters are not the correct values. – tharindu_DG Nov 24 '15 at 03:06
0

The recursive QuickSort algorithm on large arrays causes StackOverflow error.

Try the non-recursive one from this link. Following Java code is converted from original c code, hope it helps.

static final int MAX_LEVELS = 1000;
public static boolean quickSort(int[] arr, int elements) {
    int i=0,L,R,pivot;
    int[] beg = new int[MAX_LEVELS], end = new int[MAX_LEVELS];
    beg[0]=0;
    end[0]=elements;
    while(i>=0) {
        L=beg[i];
        R=end[i]-1;
        if(L<R) {
            pivot=arr[L]; if(i==MAX_LEVELS-1) return false;
            while(L<R) {
                while(arr[R]>=pivot&&L<R) R--; if(L<R) arr[L++]=arr[R];
                while(arr[L]<=pivot&&L<R) L++; if(L<R) arr[R--]=arr[L];
            }
            arr[L]=pivot;
            beg[i+1]=L+1;
            end[i+1]=end[i];
            end[i++]=L;
        } else {
            i--;
        }
    }
    return true;
}
// an example
public static void main(String[] args) {
    // construct the integer array
    int[] arr = new int[80000];
    for(int i=0;i<arr.length;i++) {
        arr[i]=(int)Math.random()*100000;
    }

    // sort the array
    quickSort(arr, arr.length);
}

It's time-saving and stackoverflow-immune.

ddzhj
  • 82
  • 8
-2

Is your arrays inputed sorted? Did you pass a sorted array to quick sort?

public class quickSortTest {
public static void main(String[] args) {
    int max = 800000;
    int[] array = new int[max];
    for (int i = 0; i < max; ++i) {
        array[i] = (int) Math.random() * 1000000;
    }
    long start = System.currentTimeMillis();
    quickSort(array, 0, max - 1);
    System.out.println("time:"+(System.currentTimeMillis()-start));
    System.out.println(testSortResult(array));
}

public static boolean testSortResult(int[] array){
    for(int i=1;i<array.length;++i){
        if(array[i]<array[i-1]){
            return false;
        }
    }
    return true;
}

public static void quickSort(int[] a, int p, int r) {
    if (p < r) {
        int q = partition(a, p, r);
        quickSort(a, p, q);
        quickSort(a, q + 1, r);
    }
}

private static int partition(int[] a, int p, int r) {

    int x = a[p];
    int i = p;
    int j = r;

    while (true) {
        do {
            i++;
        } while (i < r && a[i] < x);
        do {
            j--;
        } while (j > p && a[j] > x);

        if (i < j) {
            int tmp = a[i];
            a[i++] = a[j];
            a[j--] = tmp;
        } else {
            return j;
        }
    }
}
}

I test your code , it's ok even when array lenght is 800000.

bluecliff
  • 67
  • 3
  • They are unsorted. The method works fine as long as the number of elements isn't too large. – Sam Nayerman Nov 24 '15 at 02:06
  • I test your code , it's ok, you can paste your complete code, too. – bluecliff Nov 24 '15 at 02:20
  • 1
    It's not OK. The problem with the original code only happens in the worst case, which is very unlikely to occur with a random array. But if you ship code like this, then it *will* happen on some customer's computer someday using real data. – Matt Timmermans Nov 24 '15 at 02:59