0

Some databases (like MySQL) use B+trees as a way to physically organize records. The primary index (also called clustering index) stores the keys as internal nodes and the records in the leaf level.

Since all pages have fixed size, I suppose that internal nodes and leaf nodes have different degrees: The degree of internal nodes is derived from the primary key size, and the degree of the leaf nodes is derived from the record size.

What if the record size is variable? How the degree of the leaf level is set?

Additionally, I always study b+trees under the assumption that keys have the same size. It enables a straightforward method for splitting and dividing the nodes. What about the leaf level? How to balance the nodes when the records to move have variable sizes?

  • I think your question might need to be more specific. When the page size is fixed, splitting is performed based on the size of the records (in bytes), not the number of records. Is that all you're asking? – boneill Dec 23 '22 at 05:19
  • As far as I know, b+tree splitting is performed based on the maximum degree m. The degree is set to a number to maximize the number of entries. It depends on the size of the key. Small keys lead to a higher m. But that only works when keys have a fixed size. Am I getting it wrong? – Sérgio Mergen Dec 23 '22 at 10:17
  • B-trees which use fixed sized pages don't split based on a maximum degree, but instead when the page cannot hold any more records. If the records themselves are too big, then the page is made larger using so-called overflow pages. – boneill Dec 23 '22 at 15:16
  • Thanks. That is is what I imagine would happen. How about the second question? I suppose that the merging and redistribution routines should be adapted to support objects with variable size. For instance, for normal implementations, it suffices to borrow one single object from a neighbor to assure the page remains half full. If objects have variable size, more objects may need to be moved to reach the minimum page size. Is that correct? – Sérgio Mergen Dec 23 '22 at 21:35
  • The split doesn't need to be perfectly balanced. In some cases, it's advantageous to not do this. See my answer to this question: https://stackoverflow.com/questions/73498429/btree-splitting-leads-to-leaf-nodes-with-less-capacity – boneill Dec 23 '22 at 23:48

0 Answers0