13

I use heapq module in python, I find I only can the min heap, even if I use reverse=True

I still get the min top heap

from heapq import *

h=[]
merge(h,key=lambda e:e[0],reverse=True)
heappush(h, (200, 1))
heappush(h, (300,2))
heappush(h, (400,3))
print(heappop(h))

I still get the result:

(200, 1)

I want to get result:

(400,3)

how to do it?

which is the smallest element. I want pop the largest emelment?

ps: this is part of question find the max and then divide into several elements and then put it back to the heap.

Stephen Rauch
  • 47,830
  • 31
  • 106
  • 135
user504909
  • 9,119
  • 12
  • 60
  • 109

2 Answers2

17

The documentation says,

Our pop method returns the smallest item, not the largest (called a “min heap” in textbooks; a “max heap” is more common in texts because of its suitability for in-place sorting).

So you cannot get a max heap directly. However, one way to get it indirectly is to push the negative of the item on the heap, then take the negative again just after you pop the item. So instead of heappush(h, (200, 1)) you execute heappush(h, (-200, -1)). And to pop and print the max item, execute

negmaxitem = heappop(h)
maxitem = (-negmaxitem[0], -negmaxitem[1])
print(maxitem)

There are other ways to get the same effect, depending on what exactly you are storing in the heap.

Note that trying h[-1] in a min-heap does not work to find the max item--the heap definition does not guarantee the max item will end up at the end of the list. nlargest should work but has time complexity of O(log(n)) to just examine the largest item in a min-heap, which defeats the purpose of the heap. My way has time complexity O(1) in the negative-heap to examine the largest item.

Rory Daulton
  • 21,934
  • 6
  • 42
  • 50
  • 1
    I don't understand what you're saying about `nlargest`. To find the largest element of the iterable you give it, it takes O(n), not O(log(n)). – Stefan Pochmann Jan 15 '18 at 02:12
  • 1
    Actually, according to the documentation, I think it's actually O(n log n): "Equivalent to: `sorted(iterable, key=key, reverse=True)[:n]`" – Niema Moshiri Jan 15 '18 at 02:45
  • 1
    @NiemaMoshiri That means same result, not same complexity. Btw, you just said that asking it for the largest n=1 elements takes O(0) time. – Stefan Pochmann Jan 15 '18 at 04:37
  • 1
    Ah, wasn't aware it only means the output is equivalent, not the time complexity. Thanks for that! Also, not sure what you mean by O(0) time. Big-O time complexity is a description of asymptotic run-time scalability as *n* approaches infinity; you don't just plug in a value of *n* – Niema Moshiri Jan 15 '18 at 04:42
  • 1
    @NiemaMoshiri I sure can plug in a value of n. Big-O complexity is a "for **all** n ≥ n₀ statement". So also for n=n₀. I can plug that. You didn't say what n₀ you had in mind, so I assumed n=1 since we're talking about finding the 1 largest value. But it doesn't really matter. Whatever your n₀ is, I can just choose n=n₀ and then O(n log n) = O(n₀ log n₀) = O(1). You're still saying it's constant time, which is wrong. – Stefan Pochmann Jan 15 '18 at 04:54
  • I would suggest reading up on Big-O. That is not how it works – Niema Moshiri Jan 15 '18 at 05:00
  • @NiemaMoshiri Nah, *you* should read up on it. Again: It's a "for all n ≥ n₀" statement. If I want to prove you wrong, I just need to show *one* n for which you're wrong (and I'm free to use n₀). If I wanted to prove you *right*, *then* I couldn't just pick one. But that's not what I'm doing. (Btw, could you please `@notify` me when you respond?) – Stefan Pochmann Jan 15 '18 at 05:16
  • @NiemaMoshiri To close the loop: The problem was that you used the same variable "n" for two different things. Yes, Rory and I had already talked about `nlargest` but meant the list size with "n" without saying so, but that's another level than your explicit usage in `[:n]`. That explicit usage in my opinion overrules the implicit definition, so your statement reads like "It finds the n largest elements in a list in O(n log n) time, regardless of list size". Which I think you'll agree is wrong. – Stefan Pochmann Jan 17 '18 at 15:40
  • Apologies, I didn't realize that the blurb I copied from the Python documentation also used `n` to denote the number of largest elements to return. All of the things I said regarding O(*n* log *n*) were with *n* denoting the *total number of elements*. Sorry about the confusion! – Niema Moshiri Jan 17 '18 at 22:04
  • @NiemaMoshiri Alright :-). To be clear, in the Big-O issue I could still pick a certain n for trying to prove you wrong, but if it means the list size then I just won't succeed. Anyway, again: Please notify people when you respond. Or is that some kind of cheap trick for trying to have the last word? – Stefan Pochmann Jan 18 '18 at 18:41
6

Why not use a PriorityQueue object? You can store (priority,key) tuples. One easy solution to make a max-heap would be to make priority be the opposite of key:

from Queue import PriorityQueue
pq = PriorityQueue()
for i in range(10): # add 0-9 with priority = -key
    pq.put((-i,i))
print(pq.get()[1]) # 9
Niema Moshiri
  • 909
  • 5
  • 14
  • pq do not have pop function – user504909 Jan 15 '18 at 02:45
  • Not sure what you mean. The `get()` function of a `PriorityQueue` is equivalent to a heap's `pop()`: it removes and returns the highest-priority element – Niema Moshiri Jan 15 '18 at 02:48
  • Sorry, I thought it would be delete the highest-priority element before. Thanks for that. – user504909 Jan 15 '18 at 03:06
  • I can use the index to get the second largest number easy – user504909 Jan 15 '18 at 03:40
  • How? In a max-heap, the root is the largest element (so you're sure largest element is in index 0), but the second largest element can be either child of the root (so it can be in either index 1 or index 2). If you're sure you *only* want the largest 2 elements at any given time, I agree that you can do it efficiently: `h[0]` for the largest and `max(h[1],h[2])` for the second largest. – Niema Moshiri Jan 15 '18 at 03:48
  • Worth noting PriorityQueue has synchronized methods, so some threading overhead. – Ryhan Apr 05 '20 at 01:01