4

I have a minimum heap with n elements and want to find k smallest numbers from this heap. What is the worst-case complexity?

Here is my approach: somewhere on the stackoverflow I read that complexity of finding i-th smallest number in a min heap is O(i). So if we would like to find n-1 smallest numbers (n is pointless, since it would be the entire heap), the total complexity would look something like this:

O(n-1)+O(n-2)+O(n-3)+…+O(2)+O(1)=O((1+n-1)*(n/2))=O(n^2)

Is this correct?

SantaXL
  • 618
  • 8
  • 19

3 Answers3

7

No, the time is much better than that. O(k log(n)) very easily, and O(k) if you're smart.

Finding and removing the smallest element from the heap is O(log(n)). This leads to O(k log(n)) time very easily.

BUT the result that you are thinking about is https://ac.els-cdn.com/S0890540183710308/1-s2.0-S0890540183710308-main.pdf?_tid=382a1cac-e4f7-11e7-8ac9-00000aab0f02&acdnat=1513713791_08f4df78a8821855e8ec788da063ea2f that shows how to find the size of the kth smallest number in time O(k). Now you use the fact that a heap is a binary tree and start from the root and do a recursive search for every number that you find which is smaller than that largest. Then fill out the rest of your list with copies of the k'th smallest number.

In that search you will wind up looking at up to k-1 that are at most that size, and for some of them you will look at up to 2 children that are too large to bother with, for a maximum of 3k-3 elements. This makes the whole algorithm O(k).


That link died due to bitrot. Hopefully https://www.sciencedirect.com/science/article/pii/S0890540183710308 lasts longer.

btilly
  • 43,296
  • 3
  • 59
  • 88
  • 3
    Thought admittedly, that O(k)-time algorithm for finding the kth smallest element in a min-heap is a really tricky one to actually implement! – templatetypedef Dec 19 '17 at 20:12
4

I am doubtful that it is possible to identify the kth smallest element in time O(k). The best I have seen before is an O(k log k) algorithm, which also conveniently solves your problem by identifying the k smallest elements. You can read the details in another answer on StackOverflow or on Quora.

The basic idea is to manipulate a secondary heap. Initially, this secondary heap contains only the root of the original heap. At each step, the algorithm deletes the min of the secondary heap and inserts its two original children (that is, its children from the original heap) into the secondary heap.

This algorithm has a nice property that on step i, the element it deletes from the secondary heap is the ith smallest element overall. So after k steps, the set of items which have been deleted from the secondary heap are exactly the k smallest elements. This algorithm is O(k log k) because there are O(k) deletions/insertions into a secondary heap which is upper bounded in size by O(k).


EDIT: I stand corrected! btilly's answer provides a solution in O(k) using a result from this paper.

Sam Westrick
  • 1,248
  • 1
  • 7
  • 9
  • 2
    The O(k log k) algorithm you linked to is new to me and looks very practical -- simple to implement, and probably much faster than Frederickson's optimal O(k)-time algorithm. – j_random_hacker Dec 19 '17 at 22:25
0

There is a recent (2019) algorithm that finds the k smallest elements of a binary min-heap in time O(k) that uses the soft heap data structure. This is a dramatically simpler algorithm than Frederickson’s original O(k)-time heap selection algorithm. See “ Selection from Heaps, Row-Sorted Matrices, and X+Y Using Soft Heaps” by Kaplan et al.

templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065