0

In my class our teacher wanted us to write a quicksort algorithm in C which will work on an array of 10000 int. My friends and I have written the code as seen on pseudocode.

It works okay when sorting a random valued array but when sorting sorted array code crashes as seen.

Unhandled exception at 0x00FF2509 in ConsoleApplication1.exe: 0xC0000005: Access violation writing location 0x00330F58.

Now I've searched for it a bit and found out that linker gives 1MB of stack for recursive functions (I might have gotten it wrong though). So with 4-byte integer variables (will be 4x10k of them when sorted array is being sorted again) and a 10k int referenced array, it all should take about 200KB of the stack.

So I couldn't find out why I get this error. Teacher told us that it was (of course) possible to write that code as the pseudocode. So either I'm doing something wrong on code, else I have no idea.

Can anyone please help explain what's wrong?

void quicksort_last(int *A,int p,int r){

if(p<r){
    int q=partition(A,p,r);
    quicksort_last(A,p,q-1);
    quicksort_last(A,q+1,r);
}
}

int partition(int *A,int p, int r){
const int x=A[r];
int i=p-1;
int j=p;
int temp;

while(j<r){
    if(A[j]<=x){
        i++;
        temp=A[i];
        A[i]=A[j];
        A[j]=temp;
    }
    j++;
}
temp=A[i+1];
A[i+1]=A[r];
A[r]=temp;
return i+1;
}

Edit: This is the error debug gives. Debug stops about 4000th recursion at the beginning of partition function. Remainder of the code is here.

Unhandled exception at 0x00BF2509 in ConsoleApplication1.exe: 0xC00000FD: Stack overflow (parameters: 0x00000001, 0x002B2FA8).

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
umurcan
  • 1
  • 3
  • https://dl.dropboxusercontent.com/u/64847206/lab.c don't mind the .c extension – umurcan Oct 26 '13 at 13:56
  • how about 10k sized arrays :) – umurcan Oct 26 '13 at 14:21
  • please indent your code properly – Jens Gustedt Oct 26 '13 at 15:10
  • 1
    I usually do but for some reason it's not in same indentation i originally made. If you look above you can see the original code. By the way it's not very readable i know but these codes are written in limited time – umurcan Oct 26 '13 at 15:24
  • When time's limited, it is even more important to write code cleanly the first time around so that you can see what you're debugging and don't have to waste time cleaning up. – Jonathan Leffler Oct 26 '13 at 15:53
  • 1
    @umurcan, that's a lame excuse. Most people here also have limited time. Be friendly to them (and yourself) so they might be friendly to you. – Jens Gustedt Oct 26 '13 at 15:53
  • If your input data is already sorted, choosing the first element of the sub-array as the pivot value leads to quadratic sorting and very deep recursion. It is better to choose a pivot at random, or use [Median of Three](http://stackoverflow.com/questions/164163/quicksort-choosing-the-pivot/), or related techniques. – Jonathan Leffler Oct 26 '13 at 15:56
  • What operating system are you using ? The error might come from user-space limitations – damienfrancois Oct 26 '13 at 19:22
  • @JensGustedt i am aware that it's not a valid excuse but not much i can do about it since i am graded over working codes but not indentation. But i am looking up to learn new things so if you can point out the wrong things on full source that would be great :) As i looked original code was indented properly. Only comment lines make some mess though – umurcan Oct 27 '13 at 18:24
  • @JonathanLeffler we have discussed picking pivot random and it was already prooved it would work in good time and we were all aware that the sorted groups of numbers was the worst-case for quicksort but as you can see on full code that was an experiment for seeing it's running time :) – umurcan Oct 27 '13 at 18:27
  • @damienfrancois i am using win7, x64 but when i looked for what you said i found out that linker stuff i explained above – umurcan Oct 27 '13 at 18:29

3 Answers3

0

Why do you think of stack limitation if your exception states access violation? There surely is a bug in your code which results in that you write to forbidden memory, which doesn't belong to your application. Did you run in a debugger? It should tell you at least in which iteration the error occurs.

Thorsten Schöning
  • 3,501
  • 2
  • 25
  • 46
0

Synopsis

Your transcription and implementation of the algorithm seems to be OK; it works for me without problems.

You need to scrutinize you driver code to make sure you aren't overstepping the bounds of any of your arrays. The chances are that you will find a problem in that driver code.

Analysis

Having worked a bit with the sort and partition code you show, I don't think there's a major problem with it. I used the following test case, but was testing on Mac OS X 10.9 with GCC 4.8.2. It was compiled with -std=c11 (amongst other options stringent options), and uses two features — the 'declare variables in for loops feature and the 'declare variables when needed' feature — that were added to C99. You can fix those straight-forwardly if you don't use a C99-capable compiler.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void dump_partition(char const *tag, int *array, int lo, int hi)
{
    if (lo < hi)
    {
        int i;
        printf("%s: %d..%d\n", tag, lo, hi);
        for (i = lo; i <= hi; i++)
        {
            printf(" %4d", array[i]);
            if ((i - lo) % 10 == 9)
                putchar('\n');
        }
        if ((i - lo) % 10 != 0)
            putchar('\n');
    }
}

static int partition(int *A, int p, int r)
{
    const int x=A[r];
    int i=p-1;
    int j=p;
    int temp;

    while (j < r)
    {
        if (A[j] <= x)
        {
            i++;
            temp=A[i];
            A[i]=A[j];
            A[j]=temp;
        }
        j++;
    }
    temp=A[i+1];
    A[i+1]=A[r];
    A[r]=temp;
    return i+1;
}

static void quicksort_last(int *A, int p, int r)
{
    if (p < r)
    {
        int q=partition(A, p, r);
        printf("quicksort: %p (%d..%d)\n", (void *)&q, p, r);
        //dump_partition("L-part", A, p, q-1);
        //dump_partition("R-part", A, q+1, r);
        quicksort_last(A, p, q-1);
        quicksort_last(A, q+1, r);
    }
}

int main(void)
{
    int data[] =
    {
        31, 14, 53, 45, 88,  0, 79, 59, 84,  5,
        83, 42, 61, 38, 24, 47, 86, 69,  8, 36,
    };
    enum { N_DATA = sizeof(data) / sizeof(data[0]) };

    dump_partition("Random", data, 0, N_DATA-1);
    quicksort_last(data, 0, N_DATA-1);
    dump_partition("Sorted", data, 0, N_DATA-1);

    enum { BIG_SIZE = 10000 };
    int data2[BIG_SIZE];
    srand(time(0));
    for (int i = 0; i < BIG_SIZE; i++)
        data2[i] = rand() % BIG_SIZE;

    dump_partition("Random", data2, 0, BIG_SIZE-1);
    quicksort_last(data2, 0, BIG_SIZE-1);
    dump_partition("Sorted", data2, 0, BIG_SIZE-1);

    return 0;
}

The dump_partition() function allows me to monitor what's in the partitions. When the commented out ones in quicksort_last() were active, it allowed me to see that the partitioning was working correctly. The printf() that prints the address of q gives a measure of stack depth. On my machine, the output from running:

qs | grep quicksort: | sort -u -k2,2

was:

quicksort: 0x7fff555f66f0 (3962..3963)
quicksort: 0x7fff555f6730 (3961..3963)
quicksort: 0x7fff555f6770 (3961..3965)
quicksort: 0x7fff555f67b0 (1214..1215)
quicksort: 0x7fff555f67f0 (1197..1198)
quicksort: 0x7fff555f6830 (1151..1152)
quicksort: 0x7fff555f6870 (1150..1152)
quicksort: 0x7fff555f68b0 (865..867)
quicksort: 0x7fff555f68f0 (435..436)
quicksort: 0x7fff555f6930 (433..436)
quicksort: 0x7fff555f6970 (20..21)
quicksort: 0x7fff555f69b0 (20..22)
quicksort: 0x7fff555f69f0 (20..23)
quicksort: 0x7fff555f6a30 (20..28)
quicksort: 0x7fff555f6a70 (20..29)
quicksort: 0x7fff555f6ab0 (20..30)
quicksort: 0x7fff555f6af0 (1..2)
quicksort: 0x7fff555f6b30 (1..4)
quicksort: 0x7fff555f6b70 (0..4)
quicksort: 0x7fff555f6bb0 (0..6)
quicksort: 0x7fff555f6bf0 (0..11)
quicksort: 0x7fff555f6c30 (0..18)
quicksort: 0x7fff555f6c70 (0..93)
quicksort: 0x7fff555f6cb0 (0..138)
quicksort: 0x7fff555f6cf0 (8..9)
quicksort: 0x7fff555f6d30 (7..9)
quicksort: 0x7fff555f6d70 (3..4)
quicksort: 0x7fff555f6db0 (0..1)
quicksort: 0x7fff555f6df0 (0..5)
quicksort: 0x7fff555f6e30 (0..19)

The maximum stack depth was:

  0x7fff555f6e30
- 0x7fff555f66f0
  --------------
  0x000000000740

which is less than 2 KiB. There are 28 levels on the stack. That's with the random data (the code shown). When I revised the code to sort the already sorted data, then the stack used was a lot greater — as I noted in a comment, that leads to very bad behaviour in the partitioning.

If your input data is already sorted, choosing the first element of the sub-array as the pivot value leads to quadratic sorting and very deep recursion. It is better to choose a pivot at random, or use Median of Three, or related techniques.

There were 9999 levels on the stack, and difference in the stack positions was:

  0x7fff5d0bde30
- 0x7fff5d021ab0
  --------------
  0x000000013074

However, that's still less than 80 KiB of stack space used (in the sorting code; the space for the arrays is extra, but that's a known quantity, and roughly 40 KiB). None of these sizes should stress an ordinary machine.

Consequently, I have to diagnose that the problem is not the code you showed in the question. This surprisingly often turns out to be the case.

You can verify this by taking my driver code (main() and maybe the dump_partition() function) and running that with your quicksort_last(). You should find similar results to what I got. If that's the case, you can start working on your driver code. I took a look at it and decided I didn't want to scrutinize it — indeed, the code originally posted does not compile at all for me. It also seems to have large swathes of repeated code, which is always a bad sign. When I did enough work to get it to compile warning-free (which meant printing out the carefully calculated times, in large part, but there were other problems too), then the output I got was:

 0.000787
 0.156366
 0.000001
 0.031464
 0.000001
 0.040427
 0.001198
 0.597619
 0.000001
 0.121826
 0.000000
 0.159727
 0.001914
 1.335059
 0.000001
 0.275667
 0.000002
 0.358740
 0.002504
 2.381662
 0.000000
 0.487816
 0.000000
 0.645867

This was printing the time using %13.6f as the format string. Again, it did not crash on Mac OS X 10.9. I did not measure stack usage in this code.

Community
  • 1
  • 1
Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
  • Can you change the for loop in your main function to; `for (int i = 0; i < BIG_SIZE; i++) data2[i] = i;` and tell me what happens? I know that quicksort will work better when numbers are random but i must make it work on worst-case :) By the way i want to thank you for your answer. It must have took some time and sure have gained me a new different perspective :) – umurcan Oct 27 '13 at 18:42
  • By the way i forgot to tell that when i did what i asked you, program crashed again with same error – umurcan Oct 27 '13 at 18:49
  • I effectively did that when I re-sorted the already sorted data. However, I also did as requested, and the code worked fine for me, but I got a different measure on the amount of stack space used. The raw figures were `0x7fff51ba3e30 - 0x7fff51b07ab0 = 0x9C380 = 639872` (about 625 KiB), which is considerably larger than I determined yesterady. I'd have to rerun the original code to check whether I goofed somehow yesterday. – Jonathan Leffler Oct 27 '13 at 20:39
0

Well, the last thing i did was the first thing i should've done(as usual)

void plusplus(int a){
    printf("%d\n",a);
    plusplus(a+1);
}

int main(){
    plusplus(0);
    return 0;
}

When i run this code, it cracks around 4700th recursion. So i am getting that, either the 1 mb stack is filled in 4700th depth or there is a limitation(which i doubt).

Thanks for your time and help everyone ;)

umurcan
  • 1
  • 3