I know there exists a technique to bulk load sorted data in a B+ tree.
However, I have read in several places that bulk-loading can be approached in two ways: top-down and bottom-up. Resource (1) mentions that in the top-down approach, most of the internal nodes end up sparse (half-full) and none of the old entries go into a newer node, whereas the bottom-up approach circumvents this sparsity.
Resource (2) describes this bulk-loading technique in its own words, and the description is similar to resource (1). However, resource (2) then visualizes how this bulk-loading can be implemented and ends up with a B+ tree with half-full nodes.
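To make the contrast concrete, here is a minimal sketch of how I currently understand the two builds. It rests on assumptions of mine, not on anything the resources state: I take "top-down" to mean pushing the sorted keys one at a time through the normal insert path with midpoint splits (modelled at the leaf level only), and "bottom-up" to mean packing the leaves full first and then building each parent level from the level below. The function names and the `capacity` parameter (maximum keys per node) are hypothetical.

```python
def leaves_by_sorted_insertion(keys, capacity):
    """Leaf level left behind by inserting sorted keys one by one.

    Assumption: an overflowing leaf is split at its midpoint, and since
    the input is sorted, every new key lands in the rightmost leaf.
    """
    leaves = [[]]
    for key in keys:
        leaves[-1].append(key)
        if len(leaves[-1]) > capacity:
            full = leaves.pop()
            mid = len(full) // 2
            # The left half is never touched again, so it stays half-full.
            leaves.extend([full[:mid], full[mid:]])
    return leaves


def bulk_load_bottom_up(keys, capacity):
    """Build the tree level by level from sorted keys.

    Returns a list of levels, leaves first. Each node is a list of keys;
    internal nodes hold only separator keys. Simplification: the last
    node of a level may be underfull (even empty of separators); a real
    implementation would rebalance it with its left sibling.
    """
    leaves = [keys[i:i + capacity] for i in range(0, len(keys), capacity)]
    levels = [leaves]
    mins = [leaf[0] for leaf in leaves]  # smallest key under each node
    fanout = capacity + 1                # children per internal node
    while len(mins) > 1:
        groups = [mins[i:i + fanout] for i in range(0, len(mins), fanout)]
        # Separators are the minimum keys of every child except the first.
        levels.append([group[1:] for group in groups])
        mins = [group[0] for group in groups]
    return levels


keys = list(range(1, 21))
sparse = leaves_by_sorted_insertion(keys, capacity=4)
packed = bulk_load_bottom_up(keys, capacity=4)
print(len(sparse), len(packed[0]))  # 9 leaves vs 5 leaves
```

With 20 sorted keys and a capacity of 4, the one-at-a-time build leaves eight leaves holding two keys each plus one full rightmost leaf, while the bottom-up build produces five full leaves under a single root. If this model of "top-down" is wrong, that is part of what I am asking.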
My questions are:
- Which of the resources is correct?
- How exactly is a top-down build different from a bottom-up build when considered for bulk-loading?
- Am I reading all of this wrong, and is bulk-loading only ever implemented with a bottom-up approach? (Resource (2) says that the bottom-up approach is implemented as part of a bulk-load utility.)
Resources:
- (1) https://db-coder.github.io/DBInternalsReport.pdf
- (2) https://slideplayer.com/slide/15127631/
- (3) https://en.wikipedia.org/wiki/B-tree#Initial_construction
Note: Resource (2) is based on the book Database Management Systems (3rd edition) by Raghu Ramakrishnan and Johannes Gehrke, McGraw Hill, 2003.
I have read the above-mentioned resources and other similar content online thoroughly. The only thing I have not done yet is ask ChatGPT.