2

I am trying to learn Python stuff, and was wondering if someone can help me with some examples of using a priority queue. I know in java how they work, but cant seem to see clear examples of how they work in Python. Like getting the size of the queue I saw is .qsize() from here: http://docs.python.org/2/library/queue.html but it doesn't show examples of getting the min, max, organizing, popping, adding into the queue, sorting them, iterating through them. If someone can give me an example of these or point me in the right direction of where to learn this I'd really appreciate it.

user12074577
  • 1,371
  • 8
  • 25
  • 30

3 Answers3

6

Queue is not the priority queue you are looking for. Instead you are looking for heapq, which works on Python lists. You can either use an empty list ([]) or heapify an existing one, which will be shown:

Heaps are binary trees for which every parent node has a value less than or equal to any of its children. This implementation uses arrays for which heap[k] <= heap[2*k+1] and heap[k] <= heap[2*k+2] for all k, counting elements from zero.

This property is known as the heap invariant. One thing you should note, is that a sorted list is already valid as a heap in this case (It can easily be observed why - since every item is less than it's right neighbour it must be the case). Also heaps are balanced, it is guaranteed that there will be a maximum difference of 1 between the height of all leaf nodes from the root, this will be proved later. If your list is not sorted, or not already a heap, you have to call heapq.heapify to obtain a valid heap.

>>> import heapq
>>> L = [2, 3, 1, 9, 10]
>>> heapq.heapify(L)
>>> L
[1, 3, 2, 9, 10]

As a heap, this now looks like

          1
        /   \
      3      2
    /   \
  9      10

This operation works in linear time (O(N) time). As you can see, the smallest item is always at the 0th index of the list. which makes retrieving it simple:

>>> L[0]
1

This is not the case for the largest item. In that case you have to use heapq.nlargest with an n=1 items:

>>> heapq.nlargest(1, L)
[10]

To pop an item from the heap you can use heapq.heappop() which will pop the smallest item off the heap.

>>> heapq.heappop(L)
1

The code that does this looks like:

def heappop(heap):
    """Pop the smallest item off the heap, maintaining the heap invariant."""
    lastelt = heap.pop()    # raises appropriate IndexError if heap is empty
    if heap:
        returnitem = heap[0]
        heap[0] = lastelt
        _siftup(heap, 0)
    else:
        returnitem = lastelt
    return returnitem

What this does is returns the 0th element. But first it swaps the last item in the heap (heap.pop() will remove heap[-1] from the list) with the first one (heap[0] = lastelt). has to call the hidden function _siftup in the heapq module to maintain the heap invariant. For scientific purposes, here is the code that does that(heapq uses the faster C code implementation but it had a Python implementation before that which is semantically equivalent):

def _siftup(heap, pos):
    endpos = len(heap)
    startpos = pos
    newitem = heap[pos]
    # Bubble up the smaller child until hitting a leaf.
    childpos = 2*pos + 1    # leftmost child position
    while childpos < endpos:
        # Set childpos to index of smaller child.
        rightpos = childpos + 1
        if rightpos < endpos and not cmp_lt(heap[childpos], heap[rightpos]):
            childpos = rightpos
        # Move the smaller child up.
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2*pos + 1
    # The leaf at pos is empty now.  Put newitem there, and bubble it up
    # to its final resting place (by sifting its parents down).
    heap[pos] = newitem
    _siftdown(heap, startpos, pos)

It basically keeps swapping the parent with it's smallest child. This operation takes O(log n) time, log n being the height of a balanced binary tree. To add an item to the heap use:

>>> heapq.heappush(L, 7)
>>> L
[3, 7, 10, 9]

Which takes O(log n) time. It firsts appends the item to the end of the list, since this adds a new node to the list at a height of, let's say, h, the index of that new node will equal to 2*k + 1 or 2*k + 2, let's assume(2*k + 1) of some node at index k with height h - 1. Therefore if you append another item the index will be 2*k + 2, the right child of that same node, then the next index you add will be the left child of another node to the right of the last one at height h - 1, meaning that the tree is always balanced. If you keep adding eventually you will fill that row and create a new one, etc.

Anyway then it calls the _siftdown hidden function in the heapq module which looks like this:

def _siftdown(heap, startpos, pos):
    newitem = heap[pos]
    # Follow the path to the root, moving parents down until finding a place
    # newitem fits.
    while pos > startpos:
        parentpos = (pos - 1) >> 1
        parent = heap[parentpos]
        if cmp_lt(newitem, parent):
            heap[pos] = parent
            pos = parentpos
            continue
        break
    heap[pos] = newitem

which basically continuously checks if the child is less than it's parent, if it is, then it swaps the two. As it is a balanced binary tree, the height of the tree will be O(log n), log n being amount of parents that potentially need to be swapped, to maintain the heap invariant.

If you want to sort a heapq, you can do that in optimal sorting time O(N log N) with a heapsort, by just repeatedly calling heappop on the queue:

[heappop(L) for i in range(len(L))]

Of course Python's builtin sorted is much more optimized than this code and will perform faster, although if you weren't using Python and just say implemented a heap in C, you would do that. Finally iterating through a heap is easy, just iterate through your list:

for x in L:
    print x

This won't be in any desirable order though I presume.

jamylak
  • 128,818
  • 30
  • 231
  • 230
  • 1
    beautiful explanation, only allowed to do +1. Perhaps you may want to shed light on a use case for priority queues. – iruvar Apr 20 '13 at 02:36
5

The queues in the Queue module are primarily geared toward implementation of the producer/consumer pattern of multithreading. If you're interested in general-purpose priority queue data structure, use the heapq module.

user4815162342
  • 141,790
  • 18
  • 296
  • 355
-1

You could use the PriorityQueue. Here is the example.

In [1]: from Queue import PriorityQueue
In [2]: pq = PriorityQueue()
In [3]: pq.put((1, "girl"))    # put data in the form of tuple (priority, data)
In [4]: pq.put((3, "forest"))
In [5]: pq.put((2, "rain"))
In [6]: pq.get()               # retrieves data. lowest priority number first.
Out[6]: (1, 'girl')
In [7]: pq.empty()             # checks if empty
Out[7]: False
In [8]: pq.full()              # checks if full
Out[8]: False
In [9]: pq.qsize()             # current size of the queue.
Out[9]: 2

Furthermore, you will be able to extend PriorityQueue class and customize things.

thavan
  • 2,409
  • 24
  • 32
  • `PriorityQueue` is a class not a module. Additionally, as other posters have suggested, `heapq` is a better fit for the OP's needs. – iruvar Apr 20 '13 at 03:04
  • @ravoori thanks. That was a mistake. Hope you read the last line of my answer. – thavan Apr 20 '13 at 03:26
  • The OP explicitly requested "getting the min, max, organizing, popping, adding into the queue, sorting them, iterating through them", operations that `Queue.PriorityQueue` objects in reality don't export. So, they are the wrong tool for the OP's job. – user4815162342 Apr 20 '13 at 07:18