
Heap Sort seems to have a worst-case complexity of O(n log n), and it uses O(1) extra space for the sorting operation.

This seems better than most sorting algorithms. Why, then, wouldn't one always use Heap Sort (and why do folks use sorting mechanisms like Merge Sort or Quick Sort instead)?

Also, I have seen people use the term 'instability' with Heap sort. What does that imply?

Saket
  • See [Quicksort superiority over Heap Sort](http://stackoverflow.com/questions/1853208) and [When is each sorting algorithm used?](http://stackoverflow.com/questions/1933759) – Nick Dandoulakis Nov 29 '11 at 13:10
  • I agree it is similar to the above question, but not that it is a duplicate: that one deals with comparing just heap sort and quick sort, not with a generic analysis of heap sort (its pros and cons, if any). – Saket Nov 29 '11 at 13:10
  • Let's not forget about stable-sort requirements in some cases (heap sort and quicksort are normally unstable; merge sort is normally stable). – hugomg Nov 29 '11 at 13:11
  • I still firmly believe that this information can be easily found using pretty much any search engine (hint: use google). – ScarletAmaranth Nov 29 '11 at 13:11
  • I did find some comments/answers on the web, but I haven't got a convincing and clear answer yet. Hence, SO! – Saket Nov 29 '11 at 13:12
  • Yes but explaining *why* quick sort is usually preferred to heap sort does answer your question. what else do you want? – Jean Logeart Nov 29 '11 at 13:13
  • @Vakimshaar, at the most basic level, I am not dealing just with QS but also with MS (and all the others). – Saket Nov 29 '11 at 15:16

5 Answers


A stable sort maintains the relative order of items that have the same key. For example, imagine your data set contains records with an employee id and a name. The initial order is:

1, Jim
2, George
3, Jim
4, Sally
5, George

You want to sort by name. A stable sort will arrange the items in this order:

2, George
5, George
1, Jim
3, Jim
4, Sally

Note that the duplicate records for "George" are in the same relative order as they were in the initial list. Same with the two "Jim" records.

An unstable sort might arrange the items like this:

5, George
2, George
1, Jim
3, Jim
4, Sally

Heapsort is not stable because operations on the heap can change the relative order of equal items. Not all Quicksort implementations are stable. It depends on how you implement the partitioning.
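To make the stability difference concrete, here is a small illustrative Python sketch: Python's built-in `sorted()` (Timsort) is stable, while a textbook sift-down heapsort that compares only the name can reorder records with equal names. The `heapsort` function is a minimal demonstration, not production code.

```python
# The employee records from the example above.
records = [(1, "Jim"), (2, "George"), (3, "Jim"), (4, "Sally"), (5, "George")]

def heapsort(a, key):
    """In-place textbook heapsort comparing only key(x), so ties can flip."""
    n = len(a)

    def sift_down(root, end):
        while 2 * root + 1 <= end:
            child = 2 * root + 1
            # Pick the larger of the two children.
            if child + 1 <= end and key(a[child]) < key(a[child + 1]):
                child += 1
            if key(a[root]) < key(a[child]):
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                return

    for start in range(n // 2 - 1, -1, -1):  # build a max-heap
        sift_down(start, n - 1)
    for end in range(n - 1, 0, -1):          # move the max to the back
        a[0], a[end] = a[end], a[0]
        sift_down(0, end - 1)

stable = sorted(records, key=lambda r: r[1])  # Timsort: stable
unstable = list(records)
heapsort(unstable, key=lambda r: r[1])

print(stable)    # equal names keep their original id order
print(unstable)  # names are sorted, but equal names may be reordered
```

Running this, the stable result matches the listing above exactly, while the heapsort result has the duplicate names in a different relative order, which is the whole point.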

Although Heapsort has a worst-case complexity of O(n log(n)), that doesn't tell the whole story. In real-world implementations, there are constant factors that the theoretical analysis doesn't take into account. In the case of Heapsort vs. Quicksort, it turns out that there are ways (median-of-5 pivot selection, for example) to make Quicksort's worst cases very rare indeed. Also, maintaining a heap is not free.

Given an array with a normal distribution, Quicksort and Heapsort will both run in O(n log(n)). But Quicksort will execute faster because its constant factors are smaller than the constant factors for Heapsort. To put it simply, partitioning is faster than maintaining the heap.
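A rough way to see the constant-factor gap for yourself is the sketch below. It uses Python's `heapq` for the heapsort side and the built-in `sorted()` (Timsort) as a stand-in for a tuned comparison sort; the absolute numbers depend entirely on your machine, but both calls do O(n log n) comparisons.

```python
import heapq
import random
import timeit

def heap_sort(values):
    """Heapsort via heapq: O(n) heapify, then n O(log n) pops."""
    heap = list(values)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]

data = [random.random() for _ in range(50_000)]

t_heap = timeit.timeit(lambda: heap_sort(data), number=5)
t_builtin = timeit.timeit(lambda: sorted(data), number=5)
print(f"heapsort: {t_heap:.3f}s, built-in sort: {t_builtin:.3f}s")
```

Both timings grow at the same O(n log n) rate as `data` grows; the gap between them is the constant factor (per-element work in sift-down, cache behavior) that the big-O notation hides.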

Jim Mischel
  • Add to that the gains quick sort gets from making better use of the cache (locality of reference), which heap sort does not always benefit from. – cobie Feb 03 '13 at 20:36
  • Quicksort and mergesort are both easier to parallelize than heapsort. – Chinasaur Sep 01 '13 at 06:52
  • I don't find stability of any use unless I need the same order in my solution. Stability doesn't in any way make one solution better than another in terms of complexity. – Sanjeev Kumar Dangi Sep 22 '13 at 07:57
  • Now why would somebody downvote this answer? The OP asked what stability means in terms of heapsort. Did this question fail to answer that? Or did I make a mistake somewhere in my explanation? – Jim Mischel Feb 25 '20 at 18:30

Heap Sort has a worst-case complexity of O(n log(n)). Yet empirical studies show that Quick Sort (and other sorting algorithms) is generally considerably faster than Heap Sort, even though Quick Sort's worst-case complexity is O(n²): http://www.cs.auckland.ac.nz/~jmor159/PLDS210/qsort3.html

Also, from the quick sort article on Wikipedia:

The most direct competitor of quicksort is heapsort. Heapsort's worst-case running time is always O(n log n). But, heapsort is assumed to be on average somewhat slower than standard in-place quicksort. This is still debated and in research, with some publications indicating the opposite.[13][14] Introsort is a variant of quicksort that switches to heapsort when a bad case is detected to avoid quicksort's worst-case running time. If it is known in advance that heapsort is going to be necessary, using it directly will be faster than waiting for introsort to switch to it.

However, quick sort should never be used in applications which require a guarantee of response time!
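The introsort idea mentioned in the quoted paragraph can be sketched in a few lines. This is a simplified illustration, not how a real introsort is written (production versions partition in place and also switch to insertion sort for tiny ranges): quicksort recursion with a depth budget of roughly 2·log2(n), falling back to heapsort when the budget runs out, so the worst case stays O(n log n).

```python
import heapq
import math

def introsort(a, depth=None):
    """Simplified introsort: quicksort with a depth limit, heapsort fallback."""
    if depth is None:
        depth = 2 * max(1, int(math.log2(max(len(a), 1))))
    if len(a) <= 1:
        return list(a)
    if depth == 0:
        # Depth budget exhausted: a bad pivot sequence was detected,
        # finish this range with heapsort instead.
        heap = list(a)
        heapq.heapify(heap)
        return [heapq.heappop(heap) for _ in range(len(heap))]
    pivot = a[len(a) // 2]
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return introsort(less, depth - 1) + equal + introsort(greater, depth - 1)
```

On typical inputs the quicksort path does all the work; only adversarial pivot sequences ever reach the heapsort branch, which is exactly the "switch when a bad case is detected" behavior described above.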

Source on Stackoverflow: Quicksort vs heapsort

Jean Logeart
  • I did find similar comments/answers on the web, but I haven't got a convincing and clear answer yet. – Saket Nov 29 '11 at 13:08
  • Quicksort has better caching behaviour; that is why it is empirically faster. – Thomas Jungblut Nov 29 '11 at 13:39
  • When you compare these 2 sorts, I assume heap sort's O(1) space has been taken into consideration? Could someone explain the space side as well? – Allan C May 30 '22 at 19:47

There is no silver bullet...

Just to mention another argument I haven't seen here yet:

If your dataset is really huge and doesn't fit into memory, then merge sort works like a charm. It's frequently used in clusters where the dataset can span hundreds of machines.
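The external merge sort idea can be sketched in memory like this (illustrative only: a real implementation writes each sorted run to disk, or to another machine, and streams the final merge rather than building a list):

```python
import heapq

def external_merge_sort(stream, chunk_size):
    """Sort a stream too large for RAM: sort fixed-size chunks
    individually (each chunk fits in memory), then k-way merge the runs."""
    runs = []
    chunk = []
    for item in stream:
        chunk.append(item)
        if len(chunk) == chunk_size:
            runs.append(sorted(chunk))   # in real use: write this run to disk
            chunk = []
    if chunk:
        runs.append(sorted(chunk))
    # heapq.merge holds only one element per run at a time: O(k) memory.
    return list(heapq.merge(*runs))
```

Only `chunk_size` items plus one element per run ever need to be in memory at once, which is why this pattern scales to datasets spanning many disks or machines.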

Karoly Horvath

Stable sorting algorithms maintain the relative order of records with equal keys.

Some applications like having that kind of stability; most don't care. For examples, Google is your friend.

As for your assertion that "folks use sorting mechanisms like Merge sort or Quick sort", I would bet that most folks use whatever is built into their language and don't think about the sorting algorithm all that much. Those that roll their own have probably not heard of heap sort (the last is personal experience).

The last and biggest reason is that not everyone is going to want a sorted heap. Some people want the sorted list. If average Joe Programmer's boss says "sort this list", and Joe says "Here's this heap data structure you've never heard of, boss!", Joe's next performance review is not going to be so great.

Kane
  • When it comes to sorting algorithms, stability means that you don't change the order of elements that compare the same. It's a wholly different thing. – hugomg Nov 29 '11 at 13:43
  • Well, I was totally wrong on that one. I've edited my answer to correct that. – Kane Nov 29 '11 at 13:59
  • Heapsort, like Quicksort, is an in-place sort. The array is rearranged into a heap, and then rearranged again in sorted order. Using Heapsort will not result in a "heap data structure you've never heard of." See http://en.wikipedia.org/wiki/Heapsort – Jim Mischel Nov 29 '11 at 14:12
  • Although if you start with a linked list and want to end up with a linked list, using HeapSort ain't the best way around things, I imagine. – Vatine Nov 29 '11 at 14:21
  • @Vatine: But you wouldn't want to use Quicksort on a linked list, either. Mergesort to the rescue. – Jim Mischel Nov 29 '11 at 15:57
  • @JimMischel: True, that. – Vatine Nov 30 '11 at 10:30

When I worked for a short time on Tandem NonStop computers in the mid-80s, I noticed that the system's in-core sort routine was Heapsort, precisely because it gave guaranteed N log N performance. I don't know of anybody who had a reason to use it, though, so I don't know how it worked in practice. I like heapsort, but as well as the drawbacks noted above, I have heard it said that it makes poor use of modern memory hierarchies: its memory accesses jump all over the place, whereas quicksort and even small radix sorts end up intermixing a relatively small number of streams of sequential reads and writes, so caches are more effective.

mcdowella