
From the Soft Heap Wikipedia page, it seems that min extraction takes only constant time, so using a soft heap to perform heapsort should give an amortized O(n) sort. Even if the constant factor is large, this algorithm should be very useful for very large n. But I've never heard people mention this. Is there a reason people do not use it?

Thanks!

xuhdev

2 Answers


The Soft Heap suffers from "corruption" (read the page you link to), which makes it inapplicable as a component of a general-purpose sorting routine. You will simply get the wrong answer most of the time.

If you have some application that requires a sort but could deal with the "corrupted" results you would get from a Soft Heap as part of the implementation, then this would give you a potential speedup.

The Fibonacci Heap does not suffer from "corruption"; however, it has an O(log n) delete-min time. You can use a Fibonacci heap as part of a general-purpose sorting routine, but the overall performance of your sort will be O(n log n).
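As a minimal sketch of what heap-based sorting looks like (using Python's standard-library heapq binary heap here rather than a Fibonacci heap; the asymptotics are the same O(n log n)):

    import heapq

    def heap_sort(items):
        # n pushes at O(log n) each, then n pops at O(log n) each,
        # so the whole sort is O(n log n).
        heap = []
        for x in items:
            heapq.heappush(heap, x)
        return [heapq.heappop(heap) for _ in range(len(heap))]

    print(heap_sort([5, 3, 8, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]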

Rob

To follow up on @Rob's point:

There is a theoretical limit on the efficiency of comparison-based sorting algorithms, of which heapsort is one. No comparison-based sort can have an average-case runtime better than Ω(n log n). Since heapsort is Θ(n log n), this means that it's asymptotically optimal and there can't be an O(n) average-time variant (at least, not a comparison-based one). The proof of this argument comes from information theory: without doing at least Ω(n log n) comparisons, there is no way to reliably distinguish the input permutation from any of the other input permutations.
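Here is a sketch of the standard counting argument behind that barrier:

    A comparison-based sort must distinguish between all $n!$ input permutations,
    and each comparison has two outcomes, so a decision tree of height $h$ can
    distinguish at most $2^h$ of them. Hence
    $$2^h \ge n! \;\Longrightarrow\; h \ge \log_2 n! = \Theta(n \log n),$$
    with the last step following from Stirling's approximation. The same counting
    argument applied to average leaf depth (a binary tree with at least $n!$
    leaves has average leaf depth at least $\log_2 n!$) gives the average-case bound.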

The soft heap was invented by starting with a binomial heap and corrupting some fraction of the keys, such that inserting and dequeuing n elements from a soft heap does not necessarily sort them. (The original paper on soft heaps mentions in its abstract that the ingenuity of the structure lies in artificially decreasing the "entropy" of the values stored in order to beat the Ω(n log n) barrier.) This is the reason why the soft heap can support O(1)-time operations: unlike a normal heap structure, it doesn't always sort, and therefore is not bound by the runtime barrier given above. Consequently, the very fact that n objects can be enqueued and dequeued from a soft heap in O(n) time immediately indicates that you cannot use a soft heap to speed up heapsort.
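As a rough sketch of the trade-off (parameters as described in Chazelle's soft heap paper, quoted here from memory, so treat the exact figures as assumptions): with error rate ε, at most εn keys in the heap are corrupted at any time, insert is O(log(1/ε)) amortized, and the other operations are O(1) amortized, so

    $$n \text{ inserts} + n \text{ delete-mins} \;=\; O\!\bigl(n \log \tfrac{1}{\varepsilon}\bigr) \;=\; O(n) \text{ for any fixed } \varepsilon,$$
    which does not contradict the $\Omega(n \log n)$ barrier, because up to
    $\varepsilon n$ of the extracted keys may come out in the wrong order.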

More generally, there is no way to use any comparison-based data structure to build a sorting algorithm unless you do at least Ω(n log n) work on average when using that data structure. For example, this earlier question explains why you can't convert a binary heap to a BST in O(n) time, since doing so would let you sort in O(n) time purely using comparisons (build the heap in O(n) time, then convert to a BST in O(n) time, then do an inorder traversal in O(n) time to recover the sorted sequence).

Hope this helps!

templatetypedef
  • But should it be Ω(n lg n) in your first paragraph? I mean the barrier. – xuhdev Jun 10 '13 at 23:04
  • @xuhdev- I believe so. Ω(n log n) means "asymptotically at least n log n," and the barrier is that every comparison-based sorting algorithm must make at least Ω(n log n) comparisons in the average case. It wouldn't be correct to say "at least O(n log n) comparisons," since big-O notation gives an *upper bound*, while Ω gives a *lower bound*. Does that make sense? – templatetypedef Jun 10 '13 at 23:08
  • Yeah it makes sense, I just got confused at first. Thanks – xuhdev Jun 11 '13 at 03:51
  • I wonder if the result of running heapsort with a soft heap is 'nearly sorted' or just random. If the former, softheap sort + an O(N) best case sort like smoothsort might be an interesting hybrid. – AShelly Jun 04 '14 at 02:29
  • @AShelly: According to the Wikipedia article: "They can also be used to easily build an optimal selection algorithm, *as well as near-sorting algorithms,*" – Jim Mischel Jun 04 '14 at 13:03