0

For an analogy, pretend we are a Lego company. We maintain a table of Bricks that can be used to build "designs" for Lego Sets

A design could be explained as a Tree of Nodes, where a Node could be an Assembly of Bricks or a single Brick

So for a Lego Parrot, a Tree would look like:

Parrot
├─ Head
│  ├─ Eyes
│  │  ├─ Black Brick
│  │  ├─ Black Brick
│  ├─ Beak
│  │  ├─ Black Brick
├─ Body
│  ├─ Chest
│  │  ├─ Yellow Brick
│  │  ├─ Yellow Brick
│  ├─ Wings
│  │  ├─ Blue Brick
│  │  ├─ Blue Brick
│  │  ├─ Blue Brick
├─ Tail
│  ├─ Yellow Brick
│  ├─ Blue Brick

The reason we want to lay it out this way, is because there could for example be instructions associated with each Assembly of Bricks

Now the problem we are facing is that we want to store thousands of designs in a single database, and we are worried about query efficiency if you tried to query a single design/tree. You would first need to know all of the thousands of Root Nodes, then from there traversal of the tree seems incredibly expensive in a relational table with thousands of other trees in there

Our first attempt is to store these trees in a single Relational DB (Postgres) using the Adjacency List Model (nodes just hold a reference to their parents)

But it was brought to our attention that a non-relational database might be better here, because of the built-in nesting, it seems easier to query and traverse a tree.

So what would be the best way to persist these trees?

  1. We will have tens of thousands of trees
  2. We want to be able to quickly query, traverse, so that we can build UI to quickly visualize a full Tree
  3. Users might also want to just visualize a subtree as well.
  4. We still want the ability to persist millions of Trees in the future as well
A O
  • 5,516
  • 3
  • 33
  • 68
  • look for ontologies – nbk Jul 10 '23 at 22:13
  • sorry i'm not sure what you mean, i found this thread https://stackoverflow.com/questions/2514713/what-is-an-ontology-database – A O Jul 10 '23 at 23:09
  • but that doesnt appear to have a prescription for what to do here – A O Jul 10 '23 at 23:10
  • 1
    Ask 2 System Architects you will get at least 3 answers. You may be better off building a smaller prototype of each (100s of nodes) and see which works better for you. Voting to close as an opinion. – Belayer Jul 10 '23 at 23:10
  • How many TB or even PB of data do you expect? – Frank Heikens Jul 11 '23 at 06:35
  • It mostly depends on if you want to maintain specific information on the subparts of the tree. E.g. weight of block, to calculate the total weight, or production price per block for total production price. If that is the case, go with a relational database. Even though you have posible millions of trees, that just means millions of parent tables. Subtables probably are not going to expand to wildly, since there are no millions of unique blocks. If you use non relational database it is harder to maintain unique specific data of subnodes. a couple of M records is no issue when indexed in rel db. – Bert-Jan Stroop Jul 11 '23 at 09:20
  • Thank you @Bert-JanStroop that makes a lot of sense. With the Relational DB Route, do you mean literally millions of Tables? You mentioned Parent tables and Subtables, but I was originally thinking one single table with millions of trees in it. It sounds like you are saying programmatically instantiate new tables for every tree? – A O Jul 11 '23 at 17:00
  • No. I meant 1 table for treenode, which in this case would hold records for Parrot (no parent), head (parent parrot), body (parent parrot), eyes (parent body), etc. You can then either maintain the blocks used as part of the table, but smarter would be a seperate table with pieces (row 1 (black) row 2 (yellow,) row 3 (blue) and an inbetween table with treenode X uses Y x piece Z (row 1, treenode_id of eyes, count 2, piece_id of black, etc) – Bert-Jan Stroop Jul 12 '23 at 15:59

0 Answers0