1

So when I test the algorithm with that huge array, it works fine; sorts everything perfectly. But when I test it with the same array already sorted, I get stackOverFlow. I read similar questions on this, but I couldn't really get a grasp of people's explanations in those questions. How can I fix it?

This is the quick sort implementation:

private static void quickSort(String[] arr, int start, int end) {
    
    if (end <= start) return;

    int pivot = partition(arr, start, end);
    quickSort(arr, start, pivot - 1);
    quickSort(arr, pivot + 1, end);

}

private static int partition(String[] arr, int start, int end) {
    
    String pivot = arr[end];
    int i = start - 1;

    for (int j = start; j < end; ++j)
    {
        if (arr[j].compareTo(pivot) < 0)
        {
            ++i;
            String temp = arr[i];
            arr[i] = arr[j];
            arr[j] = temp;
        }
    }
    ++i;
    String temp = arr[i];
    arr[i] = pivot;
    arr[end] = temp;

    return i;
}

Part where code is tested:

List<String> arr = Files.readAllLines(Paths.get("dictionary.txt"), StandardCharsets.UTF_8);

String[] arr2 = arr.toArray(new String[arr.size()]);
String[] arr3 = arr2.clone();

long startTime41 = System.nanoTime();
Sorting.quickSort(arr3);
long endTime41 = System.nanoTime();
long timeLapsed41 = endTime41 - startTime41;

Sorting.quickSort(arr3); // StackOverFlow when sorting here
Damien
  • 8,889
  • 3
  • 32
  • 40
  • Just from your description, you might be running into this error because of too many recursive calls on the ```quickSort``` method due to a 'bad' partitioning of your dataset during the sorting runs. Check out [this post](https://stackoverflow.com/questions/18368406/java-lang-stackoverflowerror-due-to-recursion) for more information. – Alphaharrius Mar 14 '23 at 19:07
  • thanks, with that logic I found a way to avoid so many recursive calls during execution. I fixed it! – Sheep_Walker Mar 14 '23 at 20:06
  • Can you tell which line is concerned by the stackoverflow ? – Elikill58 Mar 17 '23 at 13:41
  • this one: `int pivot = partition(arr, start, end);`. I think the program gave me that as the problem for stack-overflow because after quick sort called itself a last time, there was no more space available in the stack, and since calling the partition method puts one more method call in the stack, and we don't have more available space, it just collapses – Sheep_Walker Mar 17 '23 at 14:52

2 Answers2

2

So having elements already sorted and choosing always as a pivot the last element would mean that every time you recurse and partition the elements, all of the elements of the array are absolutely going to be less than the pivot and no element would be greater than it. This would cause unbalanced partitions, and so the line of code of the low partition quickSort(arr, start, pivot - 1); would have to make a lot more recursive calls than the normal causing a stack-overflow. This is the worst case scenario in Quick Sort (O(n^2)).

To fix it, what I did was making the algorithm choose random pivots instead of choosing always the last element as the pivot in the following fashion:

private static int partition(String[] arr, int start, int end) {
        int randomIndex = new Random().nextInt(end - start + 1) + start; // random pivot to avoid worst case scenario more often during recursion (when array is already sorted)
        String pivot = arr[randomIndex];
        int i = start - 1;

        String temp = arr[randomIndex];
        arr[randomIndex] = arr[end];
        arr[end] = temp;

        for (int j = start; j < end; ++j)
        {
            if (arr[j].compareTo(pivot) < 0)
            {
                ++i;
                temp = arr[i];
                arr[i] = arr[j];
                arr[j] = temp;
            }
        }
        ++i;
        temp = arr[i];
        arr[i] = pivot;
        arr[end] = temp;

        return i;
    }

More explanation:

  • Why nextInt(end - start + 1) + start;?

We need to choose the element at a random index within that specific partition of the array. This parameter will generate a number between 0 to the last index end - start of the partition (not inclusive). To include the last index as a possible pivot. We must add one. We then add to the result the smallest index because we do not want indexes of the low partition being included. When the method is doing the high partition, the lowest index is not going to be 0 but a higher number. When we add the lowest index to the result, we exclude all the numbers lower than that number to being possible pivots.

  • Why do we swap before entering the for loop?

Because we are still following the same logic as before; that of incrementing i every time we find a greater element than the pivot without moving the pivot from the last index and changing it accidentally. Then, at the very end, we swap the last element where the pivot remained safely from any swaps with the subsequent index of the last smallest element than the pivot.

To perform the previous, we must make sure than our pivot element is indeed in the last index before any swaps. That is why before entering the for loop, we swap.

1

The stack overflow can be prevented by only using recursion on the smaller partition, and looping back for the larger partition, with stack space complexity O(log2(n)), but the worst case time complexity remains the same at O(n^2), and 600000 elements may take too long:

private static void quickSort(String[] arr, int start, int end) {
    while(end > start){
        int pivot = partition(arr, start, end);
        if((pivot - start) <= (end - pivot)){
            quickSort(arr, start, pivot - 1);
            start = pivot+1;
        } else {
            quickSort(arr, pivot + 1, end);
            end = pivot-1;
        }
    }
}

A better choice for pivot would be to use the middle value, median of three, or random pivot as shown in Sheep_Walker's answer.

rcgldr
  • 27,407
  • 3
  • 36
  • 61
  • Thanks, that's what I tried to do after reading other stackoverflow posts, but I couldn't really understand; especially the part where you add the lowest/largest index the pivot +/- 1.... Now seeing that solution in my code with my variables and all, everything clicks better haha. – Sheep_Walker Mar 16 '23 at 13:25