
I am reading about the pairing heap.

It is quite simple; the only tricky part is the delete_min operation.

The only non-trivial fundamental operation is the deletion of the minimum element from the heap. The standard strategy first merges the subheaps in pairs (this is the step that gave this data structure its name) from left to right, and then merges the resulting list of heaps from right to left:

I don't think I need to copy/paste the code here, as it is in the wiki link.

My questions are:

  1. Why do they do this two-pass merging?

  2. Why do they first merge in pairs, instead of merging them all directly?

  3. Also, after merging the pairs, why merge them specifically from right to left?

Jackson Tale
    The original paper, [The Pairing Heap: A New Form of Self-Adjusting Heap](http://www.cs.cmu.edu/afs/cs.cmu.edu/user/sleator/www/papers/pairing-heaps.pdf) explains it quite well. – Jim Mischel Mar 18 '14 at 13:29

1 Answer


With a pairing heap, adding an item to the heap is an O(1) operation because all it does is add the node either as the new root (if it's smaller than the current root), or as the first child of the current root. So if you created a pairing heap and added the numbers 0 through 9 to it, in order, you would end up with:

        0
        |
-----------------
| | | | | | | | |
9 8 7 6 5 4 3 2 1
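The constant-time insert can be sketched in Python (a hypothetical representation, not the answer's own code): each node holds a value and a list of children, and insert is just a meld with a one-node heap.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.children = []  # leftmost (most recently attached) child first

def meld(a, b):
    """Merge two heaps: the larger root becomes the first child of the smaller."""
    if a is None:
        return b
    if b is None:
        return a
    if b.value < a.value:
        a, b = b, a
    a.children.insert(0, b)  # b becomes the new first child
    return a

def insert(root, value):
    return meld(root, Node(value))

root = None
for v in range(10):  # insert 0..9 in order
    root = insert(root, v)

print(root.value)                        # 0
print([c.value for c in root.children])  # [9, 8, 7, 6, 5, 4, 3, 2, 1]
```

Note that `list.insert(0, ...)` is O(n) in Python; a real implementation keeps the children in a left-child/right-sibling linked list (as a later comment mentions), which makes attaching a first child genuinely O(1).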

If you then do a delete-min, you have to look at each child to determine the minimum item and build the new heap. If you use the naive left-to-right combining method, you end up with this tree:

       1
       |
---------------
| | | | | | | |
9 8 7 6 5 4 3 2

And the next time you do a delete-min you have to look at the 8 remaining children, etc. Using this technique, creating and then removing all items from the heap would be an O(n^2) operation.
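One way to reproduce the flat tree above in code (a hypothetical sketch, not the answer's own implementation) is to meld all nine orphaned children into a single accumulator. When the minimum happens to be melded in first, every later child attaches directly under it and nothing gets paired up:

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.children = []  # leftmost child first

def meld(a, b):
    """Merge two heaps: the larger root becomes the first child of the smaller."""
    if a is None:
        return b
    if b is None:
        return a
    if b.value < a.value:
        a, b = b, a
    a.children.insert(0, b)
    return a

# The nine children left after deleting root 0, melded one at a time
# into an accumulator with the minimum encountered first:
new_root = None
for v in [1, 2, 3, 4, 5, 6, 7, 8, 9]:
    new_root = meld(new_root, Node(v))

print(new_root.value)                        # 1
print([c.value for c in new_root.children])  # [9, 8, 7, 6, 5, 4, 3, 2]
```

The new root again has eight direct children, so the next delete-min repeats the O(n) scan, which is how the repeated deletes add up to O(n^2).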

The two-pass method of combining in pairs and then combining the pairs results in a much more efficient structure. Consider the first case. After deleting the minimum item, we're left with the nine children. They're combined in pairs from left to right to produce:

  8  6  4  2  1
 /  /  /  /
9  7  5  3

Then we combine the pairs right to left. In steps:

  8  6  4    1
 /  /  /    /
9  7  5    2
          /
         3

  8  6    1
 /  /    / \
9  7    2   4
       /   / 
      3   5  

  8     1
 /      |
9   ---------
    6   4   2
   /   /   /
  7   5   3

      1
      |
  ----------
  8  6  4  2
 /  /  /  /
9  7  5  3

Now, the next time we call delete-min, there are only four nodes to check, and the next time after that there will only be two. Using the two-pass combining method reduces the number of nodes at the child level by at least half. The arrangement I showed is the worst case. If the items were in ascending order, the first delete-min operation would result in a tree with only two child nodes below the root.
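The two passes walked through above can be sketched as a `delete_min` in Python (a minimal illustration, assuming the same hypothetical `Node`/`meld` helpers, repeated here so the snippet runs on its own):

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.children = []  # leftmost child first

def meld(a, b):
    """Merge two heaps: the larger root becomes the first child of the smaller."""
    if a is None:
        return b
    if b is None:
        return a
    if b.value < a.value:
        a, b = b, a
    a.children.insert(0, b)
    return a

def delete_min(root):
    """Two-pass combine: pair up the children left to right,
    then meld the resulting pairs right to left."""
    kids = root.children
    # First pass: meld adjacent children in pairs, left to right.
    pairs = [meld(kids[i], kids[i + 1]) if i + 1 < len(kids) else kids[i]
             for i in range(0, len(kids), 2)]
    # Second pass: fold the pairs into one heap, right to left.
    new_root = None
    for heap in reversed(pairs):
        new_root = meld(new_root, heap)
    return new_root

# Root 0 with children 9, 8, ..., 1, as in the answer's first diagram.
root = Node(0)
root.children = [Node(v) for v in range(9, 0, -1)]

root = delete_min(root)
print(root.value)                        # 1
print([c.value for c in root.children])  # [8, 6, 4, 2]
```

Running this reproduces the final tree shown above: root 1 with children 8, 6, 4, 2, each carrying 9, 7, 5, 3 below them, so the next delete-min only scans four nodes.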

This is a particularly good example of the amortized complexity of the pairing heap. insert is O(1), but the first delete-min after a bunch of insert operations is O(n), where n is the number of items that were inserted since the last delete-min. The beauty of the two-pass combining rule is that it quickly reorganizes the heap to reduce that O(n) cost.

With this combining rule, the amortized complexity of delete-min is O(log n). With the strict left-to-right rule, it's O(n).

Jim Mischel
  • Why is it important that the second phase is right to left? It seems to me to be exactly equivalent to left to right in the sense there is always an insert order that can produce the same tree structure no matter the direction. In your example that would produce a binary tree largely imbalanced to the left. – Celelibi Dec 26 '15 at 10:29
  • @Celelibi: Most likely the reverse order is an effect of the recursive nature of the combining process. The [original paper](https://www.cs.cmu.edu/~sleator/papers/pairing-heaps.pdf) mentions the possibility of doing two left-to-right passes, but doesn't say anything about the performance. – Jim Mischel Dec 26 '15 at 15:09
  • If you use a left-child, right-sibling linked-list to represent the heap as a tree, then going left to right is easier for the first pass. If you use an intermediate list or queue to hold the results of the first pass, then the second pass could proceed in either order, but I am guessing that going right to left removes bias from the resulting tree and evens things out. – Paul Chernoch Oct 17 '17 at 18:54
  • @PaulChernoch: After experimenting a bit, I don't see any difference in performance with right-left vs. left-right on the second pass. I'm pretty sure that the right-left order described in the original paper is an artifact of their recursive definition of the problem. Basically, they just do the combining while unwinding the stack. I moved to an iterative implementation because the recursive implementation blows the stack when you have a large heap. And, as you say, the left-child/right-sibling representation made two left-to-right passes preferable. – Jim Mischel Oct 17 '17 at 19:20
  • Isn't it moderately easy to construct a tree which exhibits similar worst-case behavior? Or am I missing something important about the first pass? – Lucretiel Jul 10 '18 at 03:04
  • @Lucretiel But the pairing heap *is* a tree. It's just not an n-ary tree. An n-ary heap (where n is 2 or 3) has average-case insertion of O(1), but worst-case of O(log n). Other node-based heaps, such as the Fibonacci heap, Skew heap, Binomial heap, etc., have similar asymptotic behavior, but higher constants; they're slower in real-world situations. – Jim Mischel Jul 10 '18 at 04:28
  • You can do both passes at once with a loop very straightforwardly, by doing two steps on each iteration, and merging the merged pairs into an accumulator variable (or in a high level language, using a fold). Merging them the other way on the way back looks harder to me, since you would be accumulating the nodes on the stack. – saolof Jun 14 '21 at 22:15