1

6 Child tree

In the above tree each node has a name and value. Each node can have 6 children at max. How to store it in MySQL database to perform the below operations efficiently?

Operations

1) grandValue(node) - should give (sum of all of the descendants' values, including self)

Eg.,

  • grandValue(C) = 300
  • grandValue(I) = 950
  • grandValue(A) = 3100

2) children(node) - should give the list of all children (immediate descendants only)

Eg.,

  • children(C) = null
  • children(I) = L,M,N
  • children(A) = B,C,D,E

3) family(node) - should give the list of descendants

  • family(C) = null
  • family(I) = L,M,N
  • family(A) = B,C,D,E,F,G,H,I,J,K,L,M,N

4) parent(node) - should give the parent of the node

  • parent(C) = A
  • parent(I) = D
  • parent(A) = null

5) insert(parent, node, value) - should insert node as a child of parent

  • insert(C, X, 500) Insert a node name X with value 500 as C's child

I am thinking of using recursive methods to do these manipulations as we do with binary trees. But I am not sure if that's the optimal way to do it. The tree may hold 10 to 30 million nodes and maybe skewed. So dumping the data into memory stack is my area of concern.

Please help.

NOTE: I am using PHP, MySQL, Laravel, on VPS Machine.

UPDATE: Tree will grow in size. New nodes will be added as a child of leaf nodes or nodes which has less than 6 nodes and not in-between 2 nodes.

brainless
  • 5,698
  • 16
  • 59
  • 82
  • 1
    http://stackoverflow.com/questions/935098/database-structure-for-tree-data-structure – gontrollez Oct 16 '14 at 13:42
  • @gontrollez I already gone through the question before posting as new question. And also I did some googling on this topic. But I want to know the optimal solution for my exact problem. Most of the documents are very generic. – brainless Oct 16 '14 at 13:48
  • I think the Nested set model is what you need. – gontrollez Oct 16 '14 at 14:16
  • This may be useful: [MPTT is a fast algorithm for storing hierarchical data: a PHP class providing an implementation of the modified preorder tree traversal algorithm (nested set)](http://stefangabos.ro/php-libraries/zebra-mptt/). – Ryan Vincent Oct 16 '14 at 14:52
  • I used https://github.com/etrepat/baum, a beautiful Laravel package available on packagist https://packagist.org/packages/baum/baum. – brainless Oct 25 '14 at 08:19

1 Answers1

5

You could store the data in a table using nested sets.
http://en.wikipedia.org/wiki/Nested_set_model#Example
I worry that your millions of nodes may make life difficult if you intend to constantly add new items. Perhaps that concern could be mitigated by using rational numbers instead of integers as the left and right values. Add a column for depth to speed up your desire to ask for descendants. I wrote some SQL to create the table and the stored procedures you asked for. I did it in SQL Server do the syntax might be slightly different but it's all standard SQL statements being executed. Also I just manually decided the upper and lower bounds for each Node. Obviously you'd have to deal with writing the code to get these nodes inserted (and maintained) in your database.

CREATE TABLE Tree(
    Node nvarchar(10) NOT NULL,
    Value int NOT NULL,
    L int NOT NULL,
    R int NOT NULL,
    Depth int NOT NULL,
);

INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('A', 100,  1, 28, 0);
INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('B', 100,  2,  3, 1);
INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('C', 300,  4,  5, 1);
INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('D', 150,  6, 25, 1);
INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('E', 200, 26, 27, 1);
INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('F', 400,  7,  8, 2);
INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('G', 250,  9, 10, 2);
INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('H', 500, 11, 12, 2);
INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('I', 350, 13, 21, 2);
INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('J', 100, 21, 22, 2);
INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('K',  50, 23, 24, 2);
INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('L', 100, 14, 15, 3);
INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('M', 300, 16, 17, 3);
INSERT INTO Tree (Node, Value, L, R, Depth) VALUES ('N', 200, 18, 19, 3);

CREATE PROCEDURE grandValue
    @Node NVARCHAR(10)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @lbound INT;
    DECLARE @ubound INT;
    SELECT @lbound = L, @ubound = R FROM Tree WHERE Node = @Node
    SELECT SUM(Value) AS Total FROM TREE WHERE L >= @lbound AND R <= @ubound
    RETURN
END;

EXECUTE grandValue 'C';
EXECUTE grandValue 'I';
EXECUTE grandValue 'A';

CREATE PROCEDURE children
    @Node NVARCHAR(10)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @lbound INT;
    DECLARE @ubound INT;
    DECLARE @depth INT;
    SELECT @lbound = L, @ubound = R, @depth=Depth FROM Tree WHERE Node = @Node
    SELECT Node FROM TREE WHERE L > @lbound AND R < @ubound AND Depth = (@depth + 1)
    RETURN
END;

EXECUTE children 'C';
EXECUTE children 'I';
EXECUTE children 'A';

CREATE PROCEDURE family
    @Node NVARCHAR(10)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @lbound INT;
    DECLARE @ubound INT;
    SELECT @lbound = L, @ubound = R FROM Tree WHERE Node = @Node
    SELECT Node FROM TREE WHERE L > @lbound AND R < @ubound
    RETURN
END;

EXECUTE family 'C';
EXECUTE family 'I';
EXECUTE family 'A';

CREATE PROCEDURE parent
    @Node NVARCHAR(10)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @lbound INT;
    DECLARE @ubound INT;
    DECLARE @depth INT;
    SELECT @lbound = L, @ubound = R, @depth = Depth FROM Tree WHERE Node = @Node
    SELECT Node FROM TREE WHERE L < @lbound AND R > @ubound AND Depth = (@depth - 1)
    RETURN
END;

EXECUTE parent 'C';
EXECUTE parent 'I';
EXECUTE parent 'A';

CREATE PROCEDURE ancestor
    @Node NVARCHAR(10)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @lbound INT;
    DECLARE @ubound INT;
    SELECT @lbound = L, @ubound = R FROM Tree WHERE Node = @Node
    SELECT Node FROM TREE WHERE L < @lbound AND R > @ubound
    RETURN
END;

EXECUTE ancestor 'C';
EXECUTE ancestor 'I';
EXECUTE ancestor 'A';

For creating the nested sets in the table in the first place you can run some code to generate the inserts or start with the first node and then successively add each additional node - although since each add potentially modifies many of the nodes in the set there can be a lot of thrashing of the database as you build this.

Here's a stored procedure for adding a node as a child of another node:

CREATE PROCEDURE insertNode
    @ParentNode NVARCHAR(10), @NewNodeName NVARCHAR(10), @NewNodeValue INT
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @ubound INT;
    DECLARE @depth INT;
    SELECT @ubound = R, @depth = Depth FROM Tree WHERE Node = @ParentNode
    UPDATE Tree SET L = L + 2 WHERE L >= @ubound
    UPDATE Tree SET R = R + 2 WHERE R >= @ubound
    INSERT INTO Tree (Node, Value, L, R, Depth) VALUES (@NewNodeName, @NewNodeValue,  @ubound, @ubound + 1, @depth + 1);
    RETURN
END;

I got this from http://www.evanpetersen.com/item/nested-sets.html who also shows a nice graph walking algorithm for creating the initial L and R values. You'd have to enhance this to keep track of the depth as well but that's be easy.

Brett
  • 256
  • 1
  • 8
  • (+1) I already saw similar techniques to store hierarchical data, I haven't the time to test it, but, seems promizing optimal. I still would add a int key field, not the text description, & int parent field, even if I use the `depth` and `left` and `right` fields ... – umlcat Oct 16 '14 at 16:43
  • @brett Thanks! How to insert into nested set model? Stored procedure or php which one is better for inserting? – brainless Oct 17 '14 at 08:36
  • 1
    I'd use a stored procedure. There's a nice article on it at: [link](http://www.evanpetersen.com/item/nested-sets.html) but basically there's some work to get the tree initially set up although I suppose you could just start with the root node and perform a bunch of adds. I'll post another comment with a stored procedure for adding a node in it. – Brett Oct 17 '14 at 12:44