If you can afford a little bit of pre-processing time, then a classic solution to this problem would be to use nested sets.
CREATE TABLE node (
id INTEGER NOT NULL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
parent_id INTEGER DEFAULT NULL,
seq_min INTEGER NOT NULL,
seq_max INTEGER NOT NULL
);
Assuming, that you use some kind of in-memory data structure like
class Node:
def __init__(self, id, name, children=()):
self.id = id
self.name = name
self.children = list(children)
to represent the tree nodes in memory, we can then define
def assign_sequence_numbers(root_node):
def recurse(counter, node):
node.seq_min = counter
node.seq_max = reduce(recurse, node.children, 1 + counter)
return node.seq_max
return recurse(1, root_node)
which does a pre-order traversal over the tree, which assignes two additional numbers to each node:
The seq_min
value becomes an alternate unique integer key for each node. Also, all direct children (and indirect descendants) of a node will have a seq_min
value, which is larger than the value assigned to the node itself.
The seq_max
value is simply an upper boundary for the seq_min
values of all descendant nodes.
Example: Consider the following tree
Root
|
+--- Category1
| |
| +--- Category1.1
| |
| `--- Category1.2
|
`--- Category2
Using the function defined above, we get
Root [min=1, max=6]
|
+--- Category1 [min=2, max=5]
| |
| +--- Category1.1 [min=3, max=4]
| |
| `--- Category1.2 [min=4, max=5]
|
`--- Category2 [min=5, max=6]
Observe, that for every node, the min
value is always <= the min
value of all descendants and the max
value is always >
the max value of all descendants. So, to find the descendants of Root
and that node itself (i.e., get the entire tree), we'd do:
SELECT * FROM node
WHERE seq_min >= 1 /* i.e., seq_min(Root) */
AND seq_min < 6 /* i.e., seq_max(Root) */
To get the descendants of Category1 (and that node itself)
SELECT * FROM node
WHERE seq_min >= 2 /* i.e., seq_min(Category1) */
AND seq_min < 5 /* i.e., seq_max(Category2) */
The drawback of this method is, that in principle, you will have to re-assign the seq_xxx
numbers for all nodes, if you make changes to the tree, i.e., insert new nodes or delete old ones. This is often not a problem, at least if changes to the tree structure are infrequent and the tree is sufficiently small.
As ITGenius already linked to that answer: if your database supports it, using common table expressions is another possibility, which avoids the pre-processing step and is also more resilient in the face of changes to the tree structure.
(I did not test any of the code given in this answer. You might need to apply a few fixes here and there; in particular, if your project's language of choice is not python...)