
I'm working on a project where I need to store a tree structure in a database. I have dealt with the same scenario in the past, using a particular solution (explained below).

I know that there is no single BEST solution, and that the best one is usually the one that offers the most advantages, but there is undoubtedly a worst one, and I'd like to avoid it...

As I was saying, I need to:

  • store an unbalanced tree structure
  • any node can have an "unlimited" number of children
  • have the ability to easily obtain all the children (recursively) of one node
  • have the ability to easily "rebuild" the tree structure

The solution I used in the past consisted of using a VARCHAR(X * Y) primary key, where:

  • X is the "hypothetical" maximum number of levels
  • Y is the number of digits needed to write the "hypothetical" maximum number of direct children of one node

For example, if I have:

  • at most 3 levels, then X = 3
  • at most 20 direct children per node, then Y = 2 (20 has two digits, so it is then possible to store up to 99 children)

The PRIMARY KEY column will then be created as VARCHAR(6).

The ID is the concatenation PARENT ID + NODE ID.

NODE ID is an incremental numeric value, left-padded with zeros.

The nodes in the first level will then be stored as:
[01, 02, 03, 04, ..., 99]

The nodes in the second level will be stored as:
[0101, 0102, 0103, ..., 0201, 0202, 0203, ..., 9901, ..., 9999]

The nodes in the third level will be stored as:
[010101, 010102, 010103, ..., 020101, 020102, 020301, ..., 990101, ..., 999999]

and so on...
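
As a rough sketch of the ID scheme above (Python used only for illustration; `make_node_id`, `width`, and `max_levels` are hypothetical names, with `width` playing the role of Y and `max_levels` the role of X):

```python
def make_node_id(parent_id: str, child_index: int,
                 width: int = 2, max_levels: int = 3) -> str:
    """Build an ID as PARENT ID + zero-padded NODE ID.

    parent_id is the empty string for first-level nodes.
    """
    if not 1 <= child_index <= 10 ** width - 1:
        raise ValueError("child index does not fit in the padded width")
    node_id = parent_id + str(child_index).zfill(width)
    if len(node_id) > width * max_levels:
        raise ValueError("maximum tree depth exceeded")
    return node_id


first = make_node_id("", 1)        # first level
second = make_node_id(first, 2)    # second level, child of "01"
third = make_node_id(second, 20)   # third level, child of "0102"
print(first, second, third)        # 01 0102 010220
```

Both limits show up as explicit errors here: a 100th child or a 4th level cannot be represented in a VARCHAR(6) key.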

PROs:

  • It's easy to rebuild the tree
  • It's super easy to obtain the list of children of one particular node (i.e. select ... where id like '0101%')
  • Only one column for both the identifier and the parent link.
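
To make the first two points concrete, here is a minimal sketch. It uses an in-memory SQLite database as a stand-in for MySQL, and the table name `node`, the column `name`, and the sample rows are assumptions made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute("CREATE TABLE node (id VARCHAR(6) PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO node (id, name) VALUES (?, ?)",
    [
        ("01", "A"),
        ("0101", "A1"),
        ("010101", "A1a"),
        ("0102", "A2"),
        ("02", "B"),
        ("0201", "B1"),
    ],
)


def descendants(conn, node_id):
    """All nodes below node_id, at any depth, via a prefix match."""
    rows = conn.execute(
        "SELECT id FROM node WHERE id LIKE ? AND id <> ? ORDER BY id",
        (node_id + "%", node_id),
    )
    return [row[0] for row in rows]


def tree_lines(conn, width=2):
    """Sorting by id yields a depth-first order; depth follows from len(id)."""
    lines = []
    for node_id, name in conn.execute("SELECT id, name FROM node ORDER BY id"):
        depth = len(node_id) // width - 1
        lines.append("  " * depth + name)
    return lines


print(descendants(conn, "01"))      # ['0101', '010101', '0102']
print("\n".join(tree_lines(conn)))  # indented outline of the whole tree
```

Note that a bare `like '01%'` also matches the node `01` itself, which is why the sketch excludes it explicitly.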

CONs:

  • It's mandatory to define a MAX number of CHILDREN/LEVELS
  • If the X and Y values are large, the id key will be way too long
  • VARCHAR type as primary key
  • Changing the tree structure (moving one node from one parent to another) will be difficult (if not impossible) and costly, because the ids of the node and all its children have to be re-created.
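
The last point can be illustrated with a small sketch of what a move involves. `move_subtree` is a hypothetical helper, not part of any library; it only computes the new ids and does not check the level limit or update any rows that reference the old keys:

```python
def move_subtree(ids, old_root, new_root):
    """Map each old id to its new id for old_root and all its descendants."""
    return {
        node_id: new_root + node_id[len(old_root):]
        for node_id in ids
        if node_id.startswith(old_root)
    }


ids = ["01", "0101", "010101", "0102", "02", "0201"]

# Move node "0101" (and its subtree) under "02", as that parent's second child.
renames = move_subtree(ids, "0101", "0202")
print(renames)  # {'0101': '0202', '010101': '020201'}
```

Every row in the moved subtree gets a new primary key, so any other table pointing at those keys has to be updated as well.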

Preorder Tree Traversal

I did some research, and the best solution I found to my main problems (obtaining all the children of one node, etc.) is the Preorder Tree Traversal solution (for the sake of brevity I will post a link where the solution is explained: HERE).

While this solution is better in almost every aspect, it has a HUGE downside: any change in the structure (adding, removing, or changing the parent of a node) requires RECREATING the entire left/right indexes, an operation that is time- and resource-consuming.
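
For readers unfamiliar with the technique, here is a minimal in-memory sketch of the left/right numbering and of what an insertion does to it. The node names and the helpers `number_tree` and `descendants` are made up for the example, and the renumbering is done by simply recomputing everything rather than by the incremental SQL updates a real implementation would likely use:

```python
def number_tree(children, root):
    """Assign (left, right) bounds to every node by a depth-first walk."""
    bounds = {}
    counter = 1

    def visit(node):
        nonlocal counter
        left = counter
        counter += 1
        for child in children.get(node, []):
            visit(child)
        bounds[node] = (left, counter)
        counter += 1

    visit(root)
    return bounds


def descendants(bounds, node):
    """Descendants are the nodes whose bounds lie strictly inside the node's."""
    left, right = bounds[node]
    return sorted(n for n, (l, r) in bounds.items() if left < l and r < right)


children = {"root": ["A", "B"], "A": ["A1", "A2"]}
before = number_tree(children, "root")
print(before["A"], descendants(before, "A"))  # (2, 7) ['A1', 'A2']

# Add a third child under "A" and renumber.
children["A"].append("A3")
after = number_tree(children, "root")
changed = sorted(n for n in before if before[n] != after[n])
print(changed)  # ['A', 'B', 'root']
```

In this small example the new node's ancestors and the nodes after the insertion point get new bounds, while the earlier siblings keep theirs. For an insertion near the start of a large tree that still means rewriting nearly every row.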

Conclusion

That said, any suggestion is very much appreciated.

In your opinion, which solution best satisfies the needs explained at the beginning?

zsltg
Marcx
  • If you need to store and traverse a tree structure, you might reconsider your use of MySQL. Most other databases (Postgres, SQL Server, Oracle, for instance) offer better support for hierarchical queries, by implementing recursive CTEs (and other methods). – Gordon Linoff Jul 22 '15 at 11:44
  • I will evaluate the possibility, although I'd prefer to remain on MySQL because this tree's elements are strictly related to other entities... – Marcx Jul 22 '15 at 12:20
  • Stackoverflow already got your back on tree structures for mysql :) http://stackoverflow.com/a/20216006/2523414 – Clément Prévost Feb 21 '16 at 22:27
