I want to store hierarchical data in the PostgreSQL database. I found ltree extension but it is used to store tree-structured data i.e there can be only one parent node. Is there any way I can tweak it to store multiple parent nodes?
Asked
Active
Viewed 2,276 times
3
-
A two-column table of parent_id, child_id can represent it, but will not prevent cycles. Whether that's a useful representation depends on how large the dataset is and what you need to do with it. – AdamKG Jun 02 '20 at 16:02
-
@AdamKG the data will not have cycle and there can be millions of nodes. But, the maximum children for a node can be around 100k. – semicolon Jun 03 '20 at 12:12
1 Answers
9
Yes, it is possible to do so using the ltree extension, by using two tables: one for your nodes and another for your paths. Bustawin has a great writeup on how to work with DAGs using this extension with this strategy.
For a node to have multiple parents, it is necessary that we define more than one ltree
path for such node, one that leads from each parent. Say we have a DAG that forms a diamond between nodes A, B, C, and D:
A -> B
B -> D
A -> C
C -> D
Using the extension, you should have a node D related to two paths, A.B.D
and A.C.D
. Both queries should return node D
SELECT * FROM nodes
JOIN paths ON paths.node_id = nodes.id
WHERE paths.ltree_path ~ "A.B.*{1}"
SELECT * FROM nodes
JOIN paths ON paths.node_id = nodes.id
WHERE paths.ltree_path ~ "A.C.*{1}"

Magmagan
- 168
- 2
- 7