An index structure to allow range queries of a tree view as a flat list

Question

I need to present a tree structure in what is basically a list view with support for data virtualization. To do this I need to be able to query for a range of items in the tree based on their index, or position in the visible tree.

Because each node in the tree may contain millions of children, I load children only when a parent node is expanded. On the back-end side I want to cache the loaded children for each expanded node in a file. The front-end then needs to be able to query for the visible range of items as if it was a flat list however.

So for example the tree:

Root
- A
  - A1
  - A2
- B
  - B1
- C
- D

Would need to be presented as a list like [A, A1, A2, B, B1, C, D].

Collapsing a node, for example A would instead yield the result [A, B, B1, C, D].

To do this some kind of index is needed to keep track of expanded nodes and where the cached data resides.

To do this one idea was to have some sort of range map, where we map indices (or positions) to data files. So for the tree above we could have something like:

[0, 0] => (Root, 0)
[1, 2] => (A, 0)
[3, 3] => (Root, 1)
[4, 4] => (B, 0)
[5, 6] => (Root, 2)

Such an index could allow for fast querying of a range, for example to retrieve all items between index 2 and 5.

However, what would be a good data structure to store such an index? Since expanding or collapsing nodes requires changing the intervals of all successors to the node being expanded or collapsed.

Or is there a better way to structure such an index?

score 0 · Accepted Answer · edited May 23 '17 at 12:20

A balanced binary search tree can support these operations efficiently. There will be no keys. Instead, each node will be located based on its position(that is, keys are implicit). The value for each node is pointer to an item it corresponds to. We will also need an auxiliary data structure that maps a node from an original tree to a node in out balanced binary search tree(we can use a hash table, for instance). The operations can be done in the following way:

Collapsing a node: we should find a node in the search tree that corresponds to the collapsed node in the original tree. Let's assume that it has k children. Then we just need to remove the k next nodes from the search tree and the hash table. The time complexity of this operation is O(k * log n).
Expanding a node: almost the same as 1. We can locate this node in the binary search tree using the hash table and then insert k children after it. We also need to add these children to the hash table. It also requires O(k * log n) time.
Range query: all we need to do is to traverse the binary search tree visiting only those nodes that are inside the given range. A tree with implicit keys allows us to do it without visiting any nodes that are outside of the range. So the time complexity of this operation is O(size_of_the_range + log n).

I didn't find any articles on binary search trees with implicit keys, but I think this question can be helpful. I'm actually not sure if it is possible to use implicit keys with any binary search tree, but it is definitely possible for a treap.

If I understand you correctly though, this would require all nodes to be in memory? Or can this structure be efficiently implemented as a file on disk as well? Because there may a very large amount of records I cannot keep them in memory but instead need to cache them on disk. That's why I had the idea of an index, keeping the ranges only, instead of one item for each node. — DeCaf, Jan 13 '15 at 11:40
Some googling led me to something called an Ordered statistics tree, or rank tree which seems to be the structure you describe. Also an article on Counted B-trees. I will investigate these further and see if they provide what I need. — DeCaf, Jan 13 '15 at 11:48
@DeCaf Yes, unfortunately, this would be rather inefficient if it is stored on a disk. — kraskevich, Jan 13 '15 at 11:48
But B-Trees are efficient to store on disk if I'm not mistaken. So I'm hoping that maybe [this article](http://www.chiark.greenend.org.uk/~sgtatham/algorithms/cbtree.html) may be something that I could use. — DeCaf, Jan 13 '15 at 11:51

An index structure to allow range queries of a tree view as a flat list

1 Answers1