Both quicksort and heapsort do in-place sorting. Which is better? What are the applications and cases in which either is preferred?
12 Answers
Heapsort is O(N log N) guaranteed, which is much better than the worst case in Quicksort. Heapsort doesn't need more memory for another array to put the ordered data in, as Mergesort does. So why do commercial applications stick with Quicksort? What does Quicksort have that is so special over other implementations?
I've tested the algorithms myself and I've seen that Quicksort has something special indeed. It runs fast, much faster than Heapsort and Mergesort.
The secret of Quicksort is that it almost never does unnecessary element swaps. Swapping is time consuming.
With Heapsort, even if all of your data is already ordered, you are going to swap 100% of elements to order the array.
With Mergesort, it's even worse. You are going to write 100% of the elements into another array and write them back into the original one, even if the data is already ordered.
With Quicksort you don't swap what is already ordered. If your data is completely ordered, you swap almost nothing! Although there is a lot of fussing about the worst case, a little improvement in the choice of pivot, anything other than taking the first or last element of the array, can avoid it. Taking the pivot as the intermediate value among the first, last and middle elements is sufficient to avoid the worst case.
What is superior in Quicksort is not the worst case, but the best case! In the best case you do the same number of comparisons, OK, but you swap almost nothing. In the average case you swap part of the elements, but not all of them, as in Heapsort and Mergesort. That is what gives Quicksort the best time. Fewer swaps, more speed.
The implementation below in C# on my computer, running in release mode, beats Array.Sort by 3 seconds with the middle pivot and by 2 seconds with the improved pivot (yes, there is an overhead to getting a good pivot).
static void Main(string[] args)
{
    // fill a large array with random values
    int[] arrToSort = new int[100000000];
    var r = new Random();
    for (int i = 0; i < arrToSort.Length; i++) arrToSort[i] = r.Next(1, arrToSort.Length);

    Console.WriteLine("Press q to quick sort, s to Array.Sort");
    while (true)
    {
        var k = Console.ReadKey(true);
        if (k.KeyChar == 'q')
        {
            // quick sort
            Console.WriteLine("Beg quick sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            QuickSort(arrToSort, 0, arrToSort.Length - 1);
            Console.WriteLine("End quick sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            for (int i = 0; i < arrToSort.Length; i++) arrToSort[i] = r.Next(1, arrToSort.Length);
        }
        else if (k.KeyChar == 's')
        {
            Console.WriteLine("Beg Array.Sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            Array.Sort(arrToSort);
            Console.WriteLine("End Array.Sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            for (int i = 0; i < arrToSort.Length; i++) arrToSort[i] = r.Next(1, arrToSort.Length);
        }
    }
}

static public void QuickSort(int[] arr, int left, int right)
{
    int begin = left, end = right;

    // middle element pivot (simple option):
    // int pivot = arr[(left + right) / 2];

    // improved pivot: median of the first, middle and last elements
    int middle = (left + right) / 2;
    int pivot;
    int LM = arr[left].CompareTo(arr[middle]),
        MR = arr[middle].CompareTo(arr[right]),
        LR = arr[left].CompareTo(arr[right]);
    if (-1 * LM == LR)
        pivot = arr[left];        // left is the median of the three
    else if (MR == -1 * LR)
        pivot = arr[right];       // right is the median of the three
    else
        pivot = arr[middle];      // middle is the median of the three

    // Hoare-style partition: scan from both ends and swap out-of-place pairs
    do
    {
        while (arr[left] < pivot) left++;
        while (arr[right] > pivot) right--;
        if (left <= right)
        {
            int temp = arr[right];
            arr[right] = arr[left];
            arr[left] = temp;
            left++;
            right--;
        }
    } while (left <= right);

    // recurse into both partitions
    if (left < end) QuickSort(arr, left, end);
    if (begin < right) QuickSort(arr, begin, right);
}

- +1 for the considerations on the number of swap and read/write operations required for different sorting algorithms – Higgs Jun 14 '15 at 09:04
- For any deterministic, constant-time pivot selection strategy, you can find an array that produces the O(n^2) worst case. It's not enough to eliminate just the minimum. You have to reliably choose pivots that are within a certain percentile band. – Antimony Jul 24 '16 at 01:04
- I'm curious if this is the exact code you ran for your simulations between your hand-coded quick sort and C# built-in Array.Sort? I tested this code and in all my tests, at best the hand-coded quick sort was the same as Array.Sort. One thing I controlled for in my testing of this was to make two identical copies of the random array. After all, a given randomization could potentially be more favorable (lean towards best case) than another randomization. So I ran the identical sets through each one. Array.Sort tied or beat every time (release build btw). – Chris Nov 12 '17 at 19:35
- OK, after almost 3 years we can expect the framework has improved, and a simple implementation like mine can only be equal. Also, nowadays we have a more improved quicksort, and maybe the .NET framework has adopted it, as the Java framework did: the dual-pivot Quicksort. But the differences are really small, and quicksort can be considered, for its simplicity, beauty and efficiency, one of the best sorting algorithms. – Marquinho Peli Nov 14 '17 at 02:13
- Merge sort doesn't have to copy 100% of the elements, unless it's some very naive implementation from a textbook. It's simple to implement it so that you only need to copy 50% of them (the left side of the two merged arrays). It's also trivial to postpone copying until you actually have to "swap" two elements, so with already sorted data you won't have any memory overhead. So even the 50% is actually the worst case, and you can have anything between that and 0%. – ddekany Dec 10 '17 at 10:48
- Hi @ddekany, please point out some examples of the naive vs expert implementation of merge sort so people can confirm your assertions. From what I saw in college books, merge sort was doing this back-and-forth copy, and these swap/IO operations slow down the algorithm. Also from what I see here: stackoverflow.com/questions/7576521/… a lot of copying between one array and another is going on. – Marquinho Peli Mar 15 '18 at 18:27
- @MarquinhoPeli I meant to say that you only need 50% more available memory compared to the size of the sorted list, not 100%, which seems to be a common misconception. So I was talking about the peak memory usage. I can't give a link, but it's easy to see if you try to merge the two already sorted halves of an array in place (only the left half has the problem where you overwrite elements that you haven't yet consumed). How much memory copying you have to do during the whole sorting process is another question, but obviously the worst case can't be below 100% for any sorting algorithm (see the sketch after this comment thread). – ddekany Mar 16 '18 at 06:43
- @Antimony In practice, nobody cares about the worst case, as even randomized pivot selection will take care of it. When your list is large (which is when it matters) the selection of a bad pivot is unlikely, though admittedly not exponentially unlikely. – Nimrod Oct 12 '18 at 22:02
- @Nimrod the selection of a bad pivot is 100% likely if someone is trying to DoS your server. – Antimony Oct 13 '18 at 23:05
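To make ddekany's point concrete, here is a minimal sketch (not his code; the MergeHalfCopy name is made up for illustration) of a merge step that copies only the left half into a temporary buffer and merges back into the original array, so the peak extra memory is roughly 50% of the input:

// Sketch: merge arr[lo..mid) and arr[mid..hi), both already sorted,
// using a temporary copy of the LEFT half only (about 50% extra memory).
static void MergeHalfCopy(int[] arr, int lo, int mid, int hi)
{
    int leftLen = mid - lo;
    int[] left = new int[leftLen];
    Array.Copy(arr, lo, left, 0, leftLen);

    int i = 0;        // index into the copied left half
    int j = mid;      // index into the right half (still in place)
    int k = lo;       // write position in the original array

    // The right half is never copied: we only write into slots whose
    // original values have already been consumed, so nothing is lost.
    while (i < leftLen && j < hi)
        arr[k++] = left[i] <= arr[j] ? left[i++] : arr[j++];
    while (i < leftLen)
        arr[k++] = left[i++];
    // any remaining right-half elements are already in their final place
}

With already-sorted input this just writes the left half back unchanged; a real implementation can detect that case (arr[mid - 1] <= arr[mid]) and skip the copy entirely, which is the 0% end of the range ddekany mentions.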
This paper has some analysis.
Also, from Wikipedia:
The most direct competitor of quicksort is heapsort. Heapsort is typically somewhat slower than quicksort, but the worst-case running time is always Θ(n log n). Quicksort is usually faster, though there remains the chance of worst-case performance except in the introsort variant, which switches to heapsort when a bad case is detected. If it is known in advance that heapsort is going to be necessary, using it directly will be faster than waiting for introsort to switch to it.
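As a rough illustration of that introsort idea, here is a hedged sketch (the IntroSort, HeapSortRange and SiftDown names are made up for this example; real introsort implementations also switch to insertion sort for tiny ranges): quicksort runs normally, but once the recursion depth exceeds about 2·log2(n) the current sub-range is handed to heapsort, keeping the worst case at O(n log n).

// Sketch of the introsort idea: quicksort until the recursion depth exceeds
// ~2*log2(n), then switch to heapsort for that sub-range.
static void IntroSort(int[] arr)
{
    int depthLimit = 2 * (int)Math.Log(arr.Length + 1, 2);
    IntroSort(arr, 0, arr.Length - 1, depthLimit);
}

static void IntroSort(int[] arr, int left, int right, int depthLimit)
{
    if (left >= right) return;
    if (depthLimit == 0)
    {
        HeapSortRange(arr, left, right);   // bad case detected: fall back
        return;
    }

    // plain Hoare partition around the middle element
    int pivot = arr[(left + right) / 2];
    int i = left, j = right;
    while (i <= j)
    {
        while (arr[i] < pivot) i++;
        while (arr[j] > pivot) j--;
        if (i <= j)
        {
            int tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
            i++; j--;
        }
    }
    IntroSort(arr, left, j, depthLimit - 1);
    IntroSort(arr, i, right, depthLimit - 1);
}

// In-place heapsort of arr[left..right]: build a max-heap, then repeatedly
// move the maximum to the end of the range.
static void HeapSortRange(int[] arr, int left, int right)
{
    int n = right - left + 1;
    for (int i = n / 2 - 1; i >= 0; i--) SiftDown(arr, left, i, n);
    for (int end = n - 1; end > 0; end--)
    {
        int tmp = arr[left]; arr[left] = arr[left + end]; arr[left + end] = tmp;
        SiftDown(arr, left, 0, end);
    }
}

static void SiftDown(int[] arr, int offset, int root, int size)
{
    while (true)
    {
        int child = 2 * root + 1;
        if (child >= size) return;
        if (child + 1 < size && arr[offset + child + 1] > arr[offset + child]) child++;
        if (arr[offset + root] >= arr[offset + child]) return;
        int tmp = arr[offset + root];
        arr[offset + root] = arr[offset + child];
        arr[offset + child] = tmp;
        root = child;
    }
}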

- It might be important to note that in typical implementations, neither quicksort nor heapsort are stable sorts. – Femi Mar 30 '14 at 14:46
- @DVK, According to your link https://www.cs.auckland.ac.nz/~jmor159/PLDS210/qsort3.html, the heap sort takes 2,842 comparisons for n=100, but it takes 53,113 comparisons for n=500. That implies the ratio between n=500 and n=100 is 18 times, which does NOT match a heap sort algorithm with O(N log N) complexity. I guess it's pretty likely that their heap sort implementation has some kind of bug inside. – DU Jiaen Oct 18 '17 at 07:29
- @DUJiaen - remember that O() is about asymptotic behavior at large N and has a possible multiplier – DVK Oct 18 '17 at 11:26
- This is NOT related to the multiplier. If an algorithm has a complexity of O(N log N), it should follow a trend of Time(N) = C1 * N * log(N). If you take Time(500)/Time(100), it's obvious that C1 will disappear and the result should be close to (500 log 500) / (100 log 100) = 6.7. But from your link it is 18, which is too far out of scale. – DU Jiaen Oct 19 '17 at 08:26
- @DUJiaen n=500 is not large enough to conclude that the statistics don't agree with the predicted asymptotic behavior – Nic Szerman Feb 19 '20 at 16:33
For most situations, having quick vs. a little quicker is irrelevant... you simply never want it to occasionally get waayyy slow. Although you can tweak QuickSort to avoid the way slow situations, you lose the elegance of the basic QuickSort. So, for most things, I actually prefer HeapSort... you can implement it in its full simple elegance, and never get a slow sort.
For situations where you DO want max speed in most cases, QuickSort may be preferred over HeapSort, but neither may be the right answer. For speed-critical situations, it is worth examining closely the details of the situation. For example, in some of my speed-critical code, it is very common that the data is already sorted or near-sorted (it is indexing multiple related fields that often either move up and down together OR move up and down opposite each other, so once you sort by one, the others are either sorted or reverse-sorted or close... either of which can kill QuickSort). For that case, I implemented neither... instead, I implemented Dijkstra's SmoothSort... a HeapSort variant that is O(N) when already sorted or near-sorted... it is not so elegant, not too easy to understand, but fast... read http://www.cs.utexas.edu/users/EWD/ewd07xx/EWD796a.PDF if you want something a bit more challenging to code.

Quicksort-Heapsort in-place hybrids are really interesting, too, since most of them only need n log n comparisons in the worst case (they are optimal with respect to the first term of the asymptotics, so they avoid the worst-case scenarios of Quicksort), O(log n) extra space, and they preserve at least "a half" of the good behaviour of Quicksort with respect to already-ordered data sets. An extremely interesting algorithm is presented by Diekert and Weiss in http://arxiv.org/pdf/1209.4214v1.pdf:
- Select a pivot p as the median of a random sample of sqrt(n) elements (this can be done in at most 24 sqrt(n) comparisons through the algorithm of Tarjan & co., or 5 sqrt(n) comparisons through the much more convoluted spider-factory algorithm of Schönhage);
- Partition your array into two parts as in the first step of Quicksort;
- Heapify the smaller part and use O(log n) extra bits to encode a heap in which every left child has a value greater than its sibling;
- Recursively extract the root of the heap, sift down the gap left by the root until it reaches a leaf of the heap, then fill the gap with an appropriate element taken from the other part of the array;
- Recurse over the remaining unordered part of the array (if p is chosen as the exact median, there is no recursion at all).

Comparison between quicksort and merge sort: since both are types of in-place sorting, the difference is in the worst-case running time. The worst-case running time for quicksort is O(n^2), while for heapsort it is still O(n log n).
For an average amount of data, quicksort will be more useful. Since it is a randomized algorithm, the probability of getting the answer in less time depends on the position of the pivot element you choose. So, with s the size of the current sub-array and L and G the elements less than and greater than the pivot:
Good call: the sizes of L and G are each less than 3s/4.
Bad call: one of L and G has size greater than 3s/4.
For a small amount of data we can go for insertion sort, and for a very large amount of data go for heapsort.
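For illustration, a minimal sketch of the randomized pivot choice described above (the RandomizedQuickSort name and the shared Random field are assumptions made for this example). With a uniformly random pivot, each call is a "good call" in the sense above with probability at least 1/2, which is what keeps the expected recursion depth at O(log n):

static readonly Random Rng = new Random();

// Quicksort with a uniformly random pivot.
static void RandomizedQuickSort(int[] arr, int left, int right)
{
    if (left >= right) return;

    // pick a random index in [left, right] and use its value as the pivot
    int pivotIndex = Rng.Next(left, right + 1);
    int pivot = arr[pivotIndex];

    // Hoare-style partition, same scheme as the quicksort in the top answer
    int i = left, j = right;
    while (i <= j)
    {
        while (arr[i] < pivot) i++;
        while (arr[j] > pivot) j--;
        if (i <= j)
        {
            int tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
            i++; j--;
        }
    }
    RandomizedQuickSort(arr, left, j);
    RandomizedQuickSort(arr, i, right);
}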

- Although merge sort can be implemented with in-place sorting, the implementation is complex. AFAIK, most merge sort implementations are not in-place, but they are stable. – Femi Mar 30 '14 at 14:41
Heapsort has the benefit of a worst-case running time of O(n log n), so in cases where quicksort is likely to perform poorly (generally, mostly sorted data sets), heapsort is much preferred.
- Quicksort only performs poorly on a mostly sorted data set if a poor pivot-choosing method is used. Namely, the bad pivot-choosing method would be to always choose the first or last element as the pivot. If a random pivot is chosen each time and a good method of handling repeated elements is used, the chance of a worst-case quicksort is very small. – Justin Peel Mar 18 '10 at 19:52
- @Justin: True, but the chance of a major slowdown is always there, however slight. For some applications, I might want to ensure O(n log n) behavior, even if it's slower. – David Thornley Mar 24 '10 at 18:50
- @JustinPeel On a very large amount of random numbers in a small range (8- or 16-bit unsigned integers, for example) quicksort will always hit its worst-case performance regardless of your choice of pivot – GDI512 Aug 28 '21 at 18:32
To me there is a very fundamental difference between heapsort and quicksort: the latter uses recursion. In recursive algorithms the memory used by the call stack grows with the recursion depth. This does not matter if n is small, but right now I am sorting two matrices with n = 10^9!! The program takes almost 10 GB of RAM, and any extra memory will make my computer start swapping to virtual disk memory. My disk is a RAM disk, but swapping to it still makes a huge difference in speed. So in a statistics package coded in C++ that includes adjustable-dimension matrices, with size unknown in advance to the programmer, and nonparametric statistical sorting, I prefer heapsort to avoid delays for users with very big data matrices.

- You only need O(log n) memory on average. The recursion overhead is trivial, assuming you don't get unlucky with the pivots, in which case you have bigger problems to worry about. – Antimony Jul 24 '16 at 01:09
- Quicksort is not necessarily recursive. In fact, it's always possible to transform a recursive algorithm into a non-recursive one. Sure, the classical presentations of QS all involve recursion, but it ain't necessarily so in practice. – gniourf_gniourf Nov 24 '20 at 12:32
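Both comments can be made concrete with a small sketch (not the answerer's code; QuickSortBoundedStack is a made-up name): recurse only into the smaller partition and loop over the larger one, so the stack depth stays O(log n) even with unlucky pivots, on the order of a few kilobytes of stack for n = 10^9 rather than anything that forces swapping.

// Sketch: tail-call-eliminated quicksort. Recursing into the smaller side
// and iterating over the larger side bounds the stack depth to O(log n).
static void QuickSortBoundedStack(int[] arr, int left, int right)
{
    while (left < right)
    {
        int pivot = arr[(left + right) / 2];
        int i = left, j = right;
        while (i <= j)
        {
            while (arr[i] < pivot) i++;
            while (arr[j] > pivot) j--;
            if (i <= j)
            {
                int tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
                i++; j--;
            }
        }
        // recurse into the smaller partition, loop on the larger one
        if (j - left < right - i)
        {
            QuickSortBoundedStack(arr, left, j);
            left = i;               // continue with the right part
        }
        else
        {
            QuickSortBoundedStack(arr, i, right);
            right = j;              // continue with the left part
        }
    }
}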
Well, if you go down to the architecture level... the cache pulls in contiguous chunks of memory, so whatever is already in the cache gets sorted quickly. In quicksort we have no issue dividing the array into any length, but in heapsort (using an array) it may happen that the parent is not present in the sub-array available in the cache, and then it has to be brought into cache memory, which is time consuming. That's why quicksort is best!!

Heapsort builds a heap and then repeatedly extracts the maximum item. Its worst case is O(n log n).
But if you look at the worst case of quicksort, which is O(n^2), you would realize that quicksort would be a not-so-good choice for large data.
So this makes sorting an interesting thing; I believe the reason so many sorting algorithms exist today is that each of them is 'best' in its own place. For instance, bubble sort can outperform quicksort if the data is already sorted. Or if we know something about the items to be sorted, then we can probably do better.
This may not answer your question directly, but I thought I'd add my two cents.

- Never use bubble sort. If you reasonably think that your data will be sorted, then you can use insertion sort, or even test the data to see if it is sorted. Don't use bubble sort. – vy32 Feb 28 '14 at 01:32
- If you have a very large RANDOM data set, your best bet is quicksort. If partially ordered, then not, but if you start working with huge datasets you should know at least this much about them. – Kobor42 Mar 03 '14 at 13:28
Heapsort is a safe bet when dealing with very large inputs. Asymptotic analysis reveals that the order of growth of heapsort in the worst case is O(n log n), which is better than quicksort's O(n^2) worst case. However, heapsort is somewhat slower in practice on most machines than a well-implemented quicksort. Heapsort is also not a stable sorting algorithm.
The reason heapsort is slower in practice than quicksort is the better locality of reference (https://en.wikipedia.org/wiki/Locality_of_reference) in quicksort, where data elements are within relatively close storage locations. Systems that exhibit strong locality of reference are great candidates for performance optimization. Heapsort, however, deals with larger leaps. This makes quicksort more favorable for smaller inputs.

In simple terms: HeapSort has a guaranteed worst-case running time of O(n log n), as opposed to QuickSort's average running time of O(n log n). QuickSort is usually used in practice because it is typically faster, but HeapSort is used for external sorting, when you need to sort huge files that don't fit into your computer's memory.

To answer the original question and address some of the other comments here:
I just compared implementations of selection, quick, merge, and heap sort to see how they'd stack up against each other. The answer is that they all have their downsides.
TL;DR: Quick is the best general-purpose sort (reasonably fast, stable, and mostly in-place). Personally, I prefer heap sort, though, unless I need a stable sort.
Selection - N^2 - It's really only good for less than 20 elements or so, then it's outperformed. Unless your data is already sorted, or very, very nearly so. N^2 gets really slow really fast.
Quick, in my experience, is not actually that quick all the time. Bonuses for using quick sort as a general sort, though, are that it's reasonably fast and it's stable. It's also an in-place algorithm, but as it's generally implemented recursively, it will take up additional stack space. It also falls somewhere between O(n log n) and O(n^2). Timing on some sorts seems to confirm this, especially when the values fall within a tight range. It's way faster than selection sort on 10,000,000 items, but slower than merge or heap.
Merge sort is guaranteed O(n log n) since its sort is not data dependent. It just does what it does, regardless of what values you've given it. It's also stable, but very large sorts can blow out your stack if you're not careful about implementation. There are some complex in-place merge sort implementations, but generally you need another array at each level to merge your values into. If those arrays live on the stack you can run into issues.
Heap sort is max O(n log n), but in many cases is quicker, depending on how far you have to move your values up the log n deep heap. The heap can easily be implemented in-place in the original array, so it needs no additional memory, and it's iterative, so no worries about stack overflow while recursing. The huge downside to heap sort is that it is not a stable sort, which means it's right out if you need that.
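As a side note on the merge sort stack concern above, here is a hedged sketch (assuming using System; the BottomUpMergeSort name is made up for this example) of a bottom-up merge sort: it is stable, uses no recursion at all, and allocates a single auxiliary buffer once, so there is nothing to blow the stack with.

// Sketch: bottom-up (iterative) merge sort. Stable, no recursion, and a
// single auxiliary buffer allocated once.
static void BottomUpMergeSort(int[] arr)
{
    int n = arr.Length;
    int[] buffer = new int[n];

    // merge runs of width 1, 2, 4, ... until the whole array is one run
    for (int width = 1; width < n; width *= 2)
    {
        for (int lo = 0; lo < n; lo += 2 * width)
        {
            int mid = Math.Min(lo + width, n);
            int hi = Math.Min(lo + 2 * width, n);

            // merge arr[lo..mid) and arr[mid..hi) into buffer[lo..hi)
            int i = lo, j = mid, k = lo;
            while (i < mid && j < hi)
                buffer[k++] = arr[i] <= arr[j] ? arr[i++] : arr[j++];
            while (i < mid) buffer[k++] = arr[i++];
            while (j < hi) buffer[k++] = arr[j++];
        }
        // copy the merged runs back for the next pass
        Array.Copy(buffer, arr, n);
    }
}

The trade-off is the full-size auxiliary buffer; the half-copy trick discussed in the comments on the first answer can reduce that peak to about 50%.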

- Quick Sort is not a stable sort. Beyond that, questions of this nature encourage opinion-based responses and could lead to edit wars and arguments. Questions calling for opinion-based responses are explicitly discouraged by the SO guidelines. Answerers should avoid the temptation to answer them even if they have significant experience and wisdom in the area. Either flag them for closing or wait for someone with enough reputation to flag and close them. This comment is not a reflection on your knowledge or the validity of your answer. – MikeC Apr 09 '16 at 22:04