1

I'm a little confused with the deletion in B+ tree. I searched a lot in Google and find that there are two implementation when the key you want to delete appears in the index:

  1. Delete the key in the index
  2. Keep the key in the index

Algorithm from https://www.javatpoint.com/b-plus-tree-deletion uses the first way.

enter image description here

Algorithm from https://www.cs.princeton.edu/courses/archive/fall08/cos597A/Notes/BplusInsertDelete.pdf uses the second way.

enter image description here

So I really want to know which one is right.

But I'm more inclined to take that as an undefined behavior. At this point, could someone help me figure out the advantage and disadvantage between them? And how to choose between them?

Thanks in advance.

Ynjxsjmh
  • 28,441
  • 6
  • 34
  • 52
  • 1
    Personally, I've never worked with an implementation that kept deleted keys, and I would not want to. Think temporal data that is indexed by a time stamp. All the old keys would consume space in the data structure until it was rebuilt. Of course there could be other scenarios where a previously used key is more likely to occur on a new entry, but I would not expect a generic data structure to make oddball optimizations like that. – 500 - Internal Server Error Aug 26 '19 at 13:41
  • 1
    Either way works fine. The actual value of an internal key is not important, as long as it's > all the descendants on the left, and <= all on the right. – Matt Timmermans Aug 27 '19 at 02:40

1 Answers1

0

Both methods are correct.

The difference that you highlight is not so much in deleting/not-deleting internal keys, but in updating/not-updating them.

Obviously, when you delete a value (i.e. a key in a leaf node), the b-plus-tree property is not violated: all child values are still within the range dictated by the parent information. You can never break this range-rule by merely removing a value from a leaf. This rule is also still valid when you update the internal key(s) in the path to that leaf (according to method 1), which is only necessary when the deleted value was the left-most one in its leaf.

Note that the two methods may produce quite different trees after a long sequence of the same operations (insert, delete).

But on average the second method will have slightly less work to do. This difference is not significant though.

trincot
  • 317,000
  • 35
  • 244
  • 286