0

I need to create a parent-child tree from a table of parentID and childID, where the tree may have several levels, so I need to self join the table for as many levels as the tree has, although I don't know how many levels there are. A viable solution for SQL that's not mySQL is discussed here: Simplest way to do a recursive self-join in SQL Server?, but hierarchical joins are not supported in mySQL.

Is there a way I can write a Python or R script wraps around the mySQL code to do this?

A follow up question is that I'd like to visualize the tree and make it interactive (viewer can add / delete entries), and the tree would probably be huge so maybe have a zoom feature / only see a part of a tree at a time - What's the best way to do that? I'm thinking of using the shiny package in R.

Thank you!

Community
  • 1
  • 1
md1630
  • 841
  • 1
  • 10
  • 28
  • Writing a function in R to do this would be quite easy if the entire data is in R, but if the data is too big to fit in R then calls can be made to any SQL database in a loop successively adding every generation. – Rohit Das Feb 09 '15 at 22:47
  • For the visualization bit once you have the data in a table you can pull the relevant rows and show them in a shiny app. You can edit it too, but I have found that to be a bit cumbersome with the default elements. – Rohit Das Feb 09 '15 at 22:49
  • Thanks Rohit, could you show me roughly what the code would look like in R? – md1630 Feb 09 '15 at 23:00

1 Answers1

0

Assuming the same table as in the link you mentioned. Lets say this data is in a dataframe df and you want to build the hierarchy for PersonID 1 (CJ)

PersonID | Initials | ParentID
1          CJ         NULL
2          EB         1
3          MB         1
4          SW         2
5          YT         NULL
6          IS         5

You can write a function like below which will keep looking in a loop unless the tree stops to grow. I have not tested this so might have some bug in it, but it gives a general idea.

buildTree <- function(df,id) {
tree <- df[df$PersonID == id,]
len <- nrow(tree)
while(1){
tree <- df[df$ParentID %in% tree$PersonID,]
if(nrow(tree) == len) break;
len <- nrow(tree)
}
}

tree <- buildTree(df,1)
Rohit Das
  • 1,962
  • 3
  • 14
  • 23