3

I want to store hierarchical data in the PostgreSQL database. I found ltree extension but it is used to store tree-structured data i.e there can be only one parent node. Is there any way I can tweak it to store multiple parent nodes?

pifor
  • 7,419
  • 2
  • 8
  • 16
semicolon
  • 43
  • 5
  • A two-column table of parent_id, child_id can represent it, but will not prevent cycles. Whether that's a useful representation depends on how large the dataset is and what you need to do with it. – AdamKG Jun 02 '20 at 16:02
  • @AdamKG the data will not have cycle and there can be millions of nodes. But, the maximum children for a node can be around 100k. – semicolon Jun 03 '20 at 12:12

1 Answers1

9

Yes, it is possible to do so using the ltree extension, by using two tables: one for your nodes and another for your paths. Bustawin has a great writeup on how to work with DAGs using this extension with this strategy.

For a node to have multiple parents, it is necessary that we define more than one ltree path for such node, one that leads from each parent. Say we have a DAG that forms a diamond between nodes A, B, C, and D:

A -> B
B -> D
A -> C
C -> D

Using the extension, you should have a node D related to two paths, A.B.D and A.C.D. Both queries should return node D

SELECT * FROM nodes
JOIN paths ON paths.node_id = nodes.id
WHERE paths.ltree_path ~ "A.B.*{1}"
SELECT * FROM nodes
JOIN paths ON paths.node_id = nodes.id
WHERE paths.ltree_path ~ "A.C.*{1}"
Magmagan
  • 168
  • 2
  • 7