6

According to my calculation:

  • Quicksort cost = n + (n/2 + n/2) + (n/4 + n/4 + n/4 + n/4) + ... = n * log(n) = log(n^n)
  • Heapsort cost = sum [log(i)] for i = n, n-1, n-2, ..., 1 = log(n!)

Why is it said that quicksort has a better constant factor than heapsort, and is therefore better than heapsort on average? Isn't log(n^n) > log(n!)?

phs
user389955
    "Cost" can mean many things. Are you talking about the worst-case number of comparisons? –  Aug 26 '13 at 18:47
    Why the mathematica tag? – arshajii Aug 26 '13 at 18:48
  • See [*Introduction to Algorithms*](http://poincare.matf.bg.ac.rs/~jelenagr//AIDA/Introduction_to_algorithms_3rd_edition.pdf) for why heapsort is O(n log n). – Oliver Charlesworth Aug 26 '13 at 18:51
  • @delnan: means time complexity – user389955 Aug 26 '13 at 22:56
  • Because heapsort has a bad rep. See http://www.azillionmonkeys.com/qed/sort.html – harold Aug 26 '13 at 23:12
  • @Oli: it is mentioned in that book: "The HEAPSORT procedure takes time O(n lg n), since the call to BUILD-MAX-HEAP takes time O(n) and each of the n − 1 calls to MAX-HEAPIFY takes time O(lg n)." I have a different opinion of "each of the n − 1 calls to MAX-HEAPIFY takes time O(lg n)". IMO, the first time MAX-HEAPIFY() is called, the heap has size n, so it takes O(lg n) to get the max from the heap, but the next time MAX-HEAPIFY() is called, the heap has size n − 1, so it takes O(lg (n−1))... so the running time for all n calls is lg(n) + lg(n−1) + ... + lg(1) = lg(n!) – user389955 Aug 26 '13 at 23:16
  • Related: http://stackoverflow.com/questions/1853208/quicksort-superiority-over-heap-sort – templatetypedef Aug 27 '13 at 00:08

2 Answers

12

I think the issue here is that your analysis of quicksort and heapsort is not precise enough to show why the constant factors would be different.

You can indeed show that on average, quicksort will do more comparisons than heapsort (roughly 1.44 n log2 n for quicksort versus n log2 n for heapsort). However, comparisons are not the only factor determining the runtime of heapsort and quicksort.
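The comparison counts are easy to check empirically. Below is a rough sketch (my own, not from this answer) that counts comparisons in a simple randomized quicksort and in a textbook heapsort. One caveat: the standard sift-down used here spends up to two comparisons per level, so this heapsort lands nearer 2 n log2 n than the n log2 n figure above, which presumably refers to the best achievable comparison count (a point also raised in the comments below).

```python
import math
import random

def quicksort(a, counter):
    # Recursive quicksort with a random pivot; counter[0] is bumped once
    # per element per partitioning pass (the classic way of counting).
    if len(a) <= 1:
        return a
    pivot = a[random.randrange(len(a))]
    less, equal, greater = [], [], []
    for x in a:
        counter[0] += 1
        if x < pivot:
            less.append(x)
        elif x > pivot:
            greater.append(x)
        else:
            equal.append(x)
    return quicksort(less, counter) + equal + quicksort(greater, counter)

def heapsort(a, counter):
    # Textbook in-place max-heapsort; the sift-down does up to two
    # comparisons per level, so the total lands near 2 n log2 n.
    a = list(a)

    def sift_down(root, end):
        while 2 * root + 1 <= end:
            child = 2 * root + 1
            if child < end:                   # pick the larger child
                counter[0] += 1
                if a[child] < a[child + 1]:
                    child += 1
            counter[0] += 1
            if a[root] < a[child]:
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                return

    n = len(a)
    for start in range(n // 2 - 1, -1, -1):   # build the max-heap
        sift_down(start, n - 1)
    for end in range(n - 1, 0, -1):           # repeatedly extract the max
        a[0], a[end] = a[end], a[0]
        sift_down(0, end - 1)
    return a

random.seed(1)
n = 20000
data = [random.random() for _ in range(n)]
qc, hc = [0], [0]
assert quicksort(data, qc) == sorted(data)
assert heapsort(data, hc) == sorted(data)
# Print each count as a multiple of n log2 n.
print(qc[0] / (n * math.log2(n)), hc[0] / (n * math.log2(n)))
```

With this particular heapsort, quicksort actually wins on comparisons too, which is why the cache argument below, rather than raw comparison counts, carries most of the explanation.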

The main reason quicksort is faster is locality of reference. Due to the way that memory caches work, array accesses in locations that are adjacent to one another tend to be much, much faster than array accesses scattered throughout an array. In quicksort, the partitioning step typically does all its reads and writes at the ends of the arrays, so the array accesses are closely packed toward one another. Heapsort, on the other hand, jumps around the array as it moves up and down the heap. Therefore, the array accesses in quicksort, on average, take much less time than the array accesses in heapsort. The difference is large enough that the constant factor in front of the n log n term in quicksort is lower than the constant factor in front of the n log n term in heapsort, which is one reason why quicksort is much faster than heapsort.

In short - if all we care about are comparisons, heapsort is a better choice than quicksort. But since memory systems use caches and cache misses are expensive, quicksort is usually a much better option.

Also, note that log(n^n) = n log n and log(n!) = n log n - n + O(log n) via Stirling's approximation. This means that log(n!) is not much smaller than n log n, even as n gets very large. There definitely is a difference, but it's not large enough to make a huge dent on its own.
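As a quick numerical sanity check (my own, not part of the answer), the Stirling claim can be verified with Python's `math.lgamma`, since `lgamma(n + 1)` equals ln(n!):

```python
import math

n = 1_000_000
ln_n_factorial = math.lgamma(n + 1)     # ln(n!) via the log-gamma function
leading_terms = n * math.log(n) - n     # n ln n - n, the leading Stirling terms

# The remainder is O(log n): Stirling gives ln(n!) = n ln n - n + 0.5*ln(2*pi*n) + ...
remainder = ln_n_factorial - leading_terms
print(remainder, 0.5 * math.log(2 * math.pi * n))
```

Even at n = 10^6, ln(n!) is within about 8% of n ln n, so the gap never dominates.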

Hope this helps!

templatetypedef
  • summary for my information: 1) the constant factor is determined not only by the theoretical comparison count but also by implementation details. Qsort has a small constant factor because it is cache friendly. For Qsort: the next element to be accessed is usually close in memory to the element you just looked at. For Hsort, access requires jumping up and down the heap... – user389955 Aug 27 '13 at 06:25
  • @templatetypedef: Thank you for the very detailed and useful explanation. Stack Overflow is becoming more and more popular because of people like you. – user389955 Aug 27 '13 at 06:28
  • I believe that you got mixed up between merge sort and heap sort here. When sifting down through the heap you have to compare the two children with each other to find the larger, then compare that to the element sifting down to decide whether to continue. That is 2 comparisons for each of log_2(n) levels, making it 2 n log_2(n) comparisons, which is worse than quicksort. By contrast, with merge sort every comparison moves something up through one of the log_2(n) levels, leading to n log_2(n), which is better than quicksort. – btilly Mar 31 '18 at 19:33
6

Here are some paragraphs from Steven S. Skiena's *The Algorithm Design Manual* that talk about the relative speed of the three O(n log n) sorting algorithms:

But how can we compare two Θ(n log n) algorithms to decide which is faster? How can we prove that quicksort is really quick? Unfortunately, the RAM model and Big Oh analysis provide too coarse a set of tools to make that type of distinction. When faced with algorithms of the same asymptotic complexity, implementation details and system quirks such as cache performance and memory size may well prove to be the decisive factor.

What we can say is that experiments show that a properly implemented quicksort is typically 2-3 times faster than mergesort or heapsort. The primary reason is that the operations in the innermost loop are simpler. But I can’t argue with you if you don’t believe me when I say quicksort is faster. It is a question whose solution lies outside the analytical tools we are using. The best way to tell is to implement both algorithms and experiment.

– 4.6.3 "Is Quicksort Really Quick?", *The Algorithm Design Manual*
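Taking up Skiena's suggestion to implement both and experiment, here is a minimal sketch (my own illustration, not from the book). These naive pure-Python versions mostly measure interpreter overhead rather than cache behavior, so the 2-3x gap he reports really needs compiled implementations to show up; this only demonstrates the shape of such an experiment:

```python
import heapq
import random
import time

def quicksort(a):
    # Simple out-of-place quicksort with a random pivot.
    if len(a) <= 1:
        return a
    pivot = a[random.randrange(len(a))]
    return (quicksort([x for x in a if x < pivot])
            + [x for x in a if x == pivot]
            + quicksort([x for x in a if x > pivot]))

def heapsort(a):
    # Heapsort built on the standard-library binary heap.
    heap = list(a)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]

data = [random.random() for _ in range(100_000)]

for name, sort in [("quicksort", quicksort), ("heapsort", heapsort)]:
    t0 = time.perf_counter()
    out = sort(data)
    elapsed = time.perf_counter() - t0
    assert out == sorted(data)        # both must actually sort correctly
    print(f"{name}: {elapsed:.3f}s")
```

Absolute timings depend entirely on your machine and interpreter; the point is only that, as Skiena says, the question is settled by experiment rather than by asymptotic analysis.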

vvy