21

I am learning Left-Lean-Red-Black tree, from Prof.Robert Sedgewick

http://www.cs.princeton.edu/~rs/talks/LLRB/LLRB.pdf http://www.cs.princeton.edu/~rs/talks/LLRB/RedBlack.pdf

While I got to understand the insert of the 2-3 tree and the LLRB, I have spent totally like 40 hours now for 2 weeks and I still can't get the deletion of the LLRB.

Can anyone really explain the deletion of LLRB to me?

Jackson Tale
  • 25,428
  • 34
  • 149
  • 271

3 Answers3

13

Ok I am going to try this, and maybe the other good people of SO can help out. You know how one way of thinking of red nodes is as indicators of

  1. where there there imbalance/new nodes in the tree, and
  2. how much imbalance there is.

This is why all new nodes are red. When the nodes (locally) balance out, they undergo a color flip, and the redness is passed up to the parent, and now the parent may look imbalanced relative to its sibling.

As an illustration, consider a situation where you are adding nodes from larger to smaller. You start with node Z which is now root and is black. You add node Y, which is red and is a left child of Z. You add a red X as a child of Z, but now you have two successive reds, so you rotate right, recolor, and you have a balanced, all black (no imbalance/"new nodes"!) tree rooted at Y [first drawing]. Now you add W and V, in that order. At first they are both red [second drawing], but immediately V/X/W are rotated right, and color flipped, so that only X is red [third drawing]. This is important: X being red indicates that left subtree of Y is unbalanced by 2 nodes, or, in other words, there are two new nodes in the left subtree. So the height of the red links is the count of new, potentially unbalanced nodes: there are 2^height of new nodes in the red subtree.

enter image description here

Note how when adding nodes, the redness is always passed up: in color flip, two red children become black (=locally balanced) while coloring their parent red. Essentially what the deletion does, is reverse this process. Just like a new node is red, we always also want to delete a red node. If the node isn't red, then we want to make it red first. This can be done by a color flip (incidentally, this is why color flip in the code on page 3 is actually color-neutral). So if the child we want to delete is black, we can make it red by color-flipping its parent. Now the child is guaranteed to be red.

The next problem to deal with is the fact that when we start the deletion we don't know if the target node to be deleted is red or not. One strategy would be to find out first. However, according to my reading of your first reference, the strategy chosen there is to ensure that the deleted node can be made red, by "pushing" a red node down in front of the search node as we are searching down the tree for the node to be deleted. This may create unnecessary red nodes that fixUp() procedure will resolve on the way back up the tree. fixUp() presumably maintains the usual LLRBT invariants: "no successive red nodes" and "no right red nodes."

Not sure if that helps, or if we need to get into more detailed examination of code.

angelatlarge
  • 4,086
  • 2
  • 19
  • 36
  • 4
    it is very helpful. `Just like a new node is red, we always also want to delete a red node. If the node isn't red, then we want to make it red first.`, this is really inspiring. – Jackson Tale Mar 18 '13 at 11:12
  • 2
    I think something wrong with your rotation. if h.left and h.left.left are both red, we do a rotateToRight on the top node, which in the case you were showing, the result should be V/W/X, and W become red after the color flip. {Actually, your subtree VXW does not fit the criterion of binary search tree } – hakunami Jan 13 '14 at 09:15
  • I'm stuck too -- I seem to be getting some mileage out of thinking of red links as parts of chains that begin with black links. So a black link with a red left represents a chain of length 2 -- 2 nodes and 3 links, or a 3-node. And a black link with a red left whose red is also left represents a chain of length 3 -- 3 nodes and 4 links, or a 4-node. Note that the condition tested when moving down the chain for deleteMin is that neither the successor of the current head is red, nor its successor is red -- in other words, neither is it part of a 3 node @ current head, nor does it start one. – Alex Leibowitz Aug 26 '19 at 20:38
  • In addition, many of my difficulties arise because I confuse the representation and what is represented: when moving recursively down the left spine, we focus on the structure of the LLRB tree, but when assessing what to do at each point in the recursion, our tests are based on what that tree represents (the algo described in the text, which is in terms of 2-3-4 trees). – Alex Leibowitz Aug 26 '19 at 20:44
0

There is an interesting comment about the Sedgwich implementation and in particular its delete method from a Harvard Comp Sci professor. Left-Leaning Red-Black Trees Considered Harmful was written in 2013 (the Sedgwich pdf you referenced above is dated 2008):

Tricky writing

Sedgewick’s paper is tricky. As of 2013, the insert section presents 2–3–4 trees as the default and describes 2–3 trees as a variant. The delete implementation, however, only works for 2–3 trees. If you implement the default variant of insert and the only variant of delete, your tree won’t work. The text doesn’t highlight the switch from 2–3–4 to 2–3: not kind.

The most recent version I could find of the Sedgwich code, which contains a 2-3 implementation, is dated April 2014. It is on his Algorithms book site at RedBlackBST.java

Community
  • 1
  • 1
Kurt Krueckeberg
  • 1,225
  • 10
  • 11
  • His implementation is wrong, just saying. Check deleteMin method. – Pavel Aug 04 '15 at 08:39
  • 2
    Just curious, did you actually look at Sedgewick's paper, which Kohler links to? I don't see how a big caption and surrounding text that says 2-3 repeatedly doesn't "highlight" the delete code is for 2-3 trees. Sedgewick even makes a comment about modifying the code for 2-3-4. I don't know where Kohler got the idea that the 2-3-4 is the main tree and 2-3 is a variant. Sedgewick clearly presents both types from the very beginning, with neither being described as the "default". Kohler seems to have an axe to grind with some of his comments there. – C S Dec 28 '15 at 18:42
0

Follow the next strategy to delete an arbitrary node in a LLRB tree which is not in a leaf:

  1. Transform a LLRB tree to a 2-3-4 tree (we do not need to transform the whole tree, only a part of the tree).
  2. Replace the value of the node (which we want to delete) its successor.
  3. Delete its successor.
  4. Fix the tree (recover balance, see the book "Algorithms 4th edition" on the pages 435, 436).

If a value in a leaf then we do not need to use a successor to replase this value, but we still need to transform the current tree to 2-3-4 tree to delete this value.

The slide on the page 20 of this presentation https://algs4.cs.princeton.edu/lectures/keynote/33BalancedSearchTrees.pdf and the book "Algorithms 4th edition" on the page 437 are a key. They show how a LLRB tree transformations into a 2-3 tree. In the book "Algorithms 4th edition" on the page 442 https://books.google.com/books?id=MTpsAQAAQBAJ&pg=PA442 is an algorithm of transformation for trees.

For example, open the page 54 of the presentation https://www.cs.princeton.edu/~rs/talks/LLRB/08Dagstuhl/RedBlack.pdf. The node H has children D and L. According to the algorithm on the page 442 we transform these three nodes into the 4-node of a 2-3-4 tree. Then the node D has children B and F we also transform these nodes into a node of 2-3-4 tree. Then the node B has children A and C we also transform these nodes into a node of 2-3-4 tree. And finally we need to delete A. After deletion we need to recover balance. We move up through the tree and we restore balance of the tree (according to rules, see the book "Algorithms 4th edition" on the pages 435, 436). If you need to delete the node D (the same tree on the page 54). You need the same transformations and need to replace the value of the node D on the value of the node E and delete the node E (because it is a successor of D).

Volodymyr
  • 1,037
  • 1
  • 11
  • 27