2

I have a problem with the Quick Sort algorithm that I'm trying to implement.

I take a course of Fundamental Algorithms and we're provided for the laboratory assignments with pseudocode for various argorithms to implement. These algorithms are taken from Cormen and assimilated to C++ language and we're supposed to verify efficiency and generate charts for the number of assignments and comparisons within.

Now the question:

The following code is supposed to make a Quick Sort on an array of 10000 numbers and work with it in the Best Case scenario (taking the pivot of the array always at the middle):

int partition(int *a, int p, int r) {
    int x = a[r];
    countOpQS++;
    int index = p - 1;
    for (int count = p; count <= (r - 1); count++) {
        if (a[count] <= x) {
            index += 1;
            swap(a[index], a[count]);
            countOpQS += 3;
        }
        countOpQS++;
    }
    swap(a[index + 1], a[r]);
    countOpQS += 3;
    return (index + 1);
}

int select(int *a, int p, int r, int index) {
    if (p == r) {
        return a[p];
    }
    int q;
    q = partition(a, p, r);
    //countOpQS++;
    int k = q - p + 1;
    if (index <= k) {
        return select(a, p, q - 1, index);
    } else {
        return select(a, q + 1, r, index - k);
    }
}


void bestQuickSort(int *a, int p, int r) {
    if (p < r) {
        select(a, p, r, (r - p + 1) / 2);
        bestQuickSort(a, p, (r - p + 1) / 2);
        bestQuickSort(a, ((r - p + 1) / 2) + 1, r);
    }
}

The call in the main function is done by:

for (index = 100; index <= 10000; index += 100) {
        countOpQS = 0;
        for (int k = 0; k < index; k++) {
            a[k] = rand();
        }
        bestQuickSort(a, 1, index);
        out3 << index << ", " << countOpQS << "\n";
    }

It should be doable with these methods, but it jumps into stack overflow pretty quickly while running. I even raised the reserved stack in Visual Studio, due to it being a necessity while going into the worst case possible (already ordered array, random pivot).

Do you guys have any idea of why it doesn't work?

  • You don't show where array (or vector) `a` is defined, but since you initialize it with indexes in the range 0..index-1 that indicates it contains `index` elements... notice that you pass `index` to bestQuickSort, which passes it as `r` to `select` and then to `partition` which then references a[r], which is a[index] which is one past the end of your array/vector. I'd start by fixing that. – amdn Mar 22 '14 at 18:29
  • @amdn Array a is declared in the main function. The for loop just generates the numbers and outputs them in a .csv file. Now is it necessary? Think of it this way: I'm taking an array of 100 elements (from 0 to 99). The numbers at the call will just be the index of the first and the last element in the array to be sorted... – alin.florian Mar 22 '14 at 18:42
  • `Best Case scenario (taking the pivot of the array always at the middle)` Note that this is only the best case if the input is already sorted (or reverse sorted order) – AliciaBytes Mar 22 '14 at 18:45
  • Yes, it was a mistake on my part. The code above was saved separately. The inside for statement looks like ' a[k] = k ', making it being already sorted. So it's still the same. – alin.florian Mar 22 '14 at 18:52
  • 1
    It shouldn't go more than 14 levels deep, and each level you go down should cut the array about in half. Step through it in a debugger. – Mike Dunlavey Mar 23 '14 at 00:54
  • (1) Arrays are indexed from 0. Period, full stop, end of story. If you resist the fact, you will *always* have bugs. (2) Your index calculations are broken. `((r - p + 1) / 2)` is not a midpoint between r and p. – n. m. could be an AI Mar 23 '14 at 06:23

1 Answers1

0

Firstly, you should know that your function select() rearranges the elements in the range [p, r], in such a way that the element at the index-th(note that index is one-based!) position is the element that would be in that position in a sorted sequence, just as std::nth_element does.
So when you chose the median element of the subarray by select(a, p, r, (r - p + 1) / 2);, the index of median is based on p.
For example: when p = 3, r = 5, so (r - p + 1) / 2 is 1, the median would be placed in a[4], it means you should call the function like this: select(a, 3, 5, 2). And after that, you just call the bestQuickSort() like this:

    bestQuickSort(a, p, (r - p + 1) / 2); // (r - p + 1) / 2 is 1 now!
    bestQuickSort(a, ((r - p + 1) / 2) + 1, r);

of course it doesn't work! The whole code for this is:

int select(int *a, int p, int r, int index) {
    if (p == r) {
        return a[p];
    }
    int q;
    q = partition(a, p, r);
    //countOpQS++;
    int k = q - p + 1;
    if(k== index)
        return a[q];
    else if (index <= k) {
        return select(a, p, q - 1, index);
    } else {
        return select(a, q + 1, r, index - k);
    }
}

void bestQuickSort(int *a, int p, int r) {
    if (p < r) {
        select(a, p, r, (r - p + 1) / 2 + 1); // the index passed to select is one-based!

        int midpoint = p + (r - p + 1) / 2; 
        bestQuickSort(a, p, midpoint - 1);
        bestQuickSort(a, midpoint + 1, r);
    }
}

BTW, your version of quicksort didn't always run in best case, though every time you choose the exact median of the (sub)array, but the time complexity of select is not always O(n) since you simply choose the a[r] as the pivot, the worst-case performance of select is quadratic: O(n*n).

jfly
  • 7,715
  • 3
  • 35
  • 65
  • Now I'm not saying it's somehow logic for me what you're claiming above, but your piece of code makes me hit a case worse than Worst Case... That counter you see there is to count the assignments and the operations done within the functions. And it's giving me values like 4-5 times higher than worst case for an already sorted array. – alin.florian Mar 23 '14 at 19:38
  • How do you count the assignments in worst case? I didn't modify the logic of your program, just fixed some errors. And you should pass 0-based index to `bestQuickSort()` in the main function, so it's `bestQuickSort(a, 0, index-1)`. – jfly Mar 24 '14 at 03:35
  • some people like to shuffle their arrays (randomly) before passing them into QuickSort to avoid the "already sorted" or "almost sorted" case. Also, standard implementations of qsort use different median picking heuristics depending on array size, and for small sizes they even switch to bubble sort. Now about stack overflow, a great compiler will remove actual stacking of recursions by transforming it into an iteration. eg: http://stackoverflow.com/questions/310974/what-is-tail-call-optimization – v.oddou Mar 24 '14 at 03:38
  • @v.oddou what you say is not related with the OP's question, and the stack overflow is due to the wrong code which leads to infinite recursion. – jfly Mar 24 '14 at 03:45
  • @jfly: that's quite right. like Mike Dunlavey said it, the recursion depth should float around log(N) so very little anyway. – v.oddou Mar 24 '14 at 03:47
  • @alin.florian I'd suggest you copy the elements in the array to a `std::vector`, then call `std::sort` to sort the vector, so you can check whether the array is sorted. – jfly Mar 24 '14 at 13:59