6

Given that heapq in python is min heap as specified in python doc, assume that I have a heapq with m elements, what is the time complexity of calling nlargest? I don't think the complexity is O(n*lg(m)) because simply popping the root and heapify again in a min heap only get you nsmallest?

How does heapq.nlargest work?

Community
  • 1
  • 1
Gary
  • 1,828
  • 1
  • 20
  • 27

2 Answers2

9

You can see the code here. Suppose you do a heapq.nlargest(n, it), where it is an iterable with m elements. It first constructs a min heap with the first n elements of it. Then, for the rest m-n elements, if they are bigger than the root, it takes the root out, puts the new element, and shifts it down. At the end, the complexity is O(log(n) * m).

matiasg
  • 1,927
  • 2
  • 24
  • 37
0

For heapq.nlargest(k, n), the complexity would be n + k*logn.

First heapify the iterable in n time, then pop the topmost element (given this is a max heap), and sift up the the heap until position 0. This will take log n time. We will repeat this k times. So the complexity here becomes k * logn.

If k is large, the complexity is decided by the latter part klogn, however, for smaller values of k, the complexity is decided by the former O(n). E.g. O(n + 3logn) where n is large becomes O(n).