I have R code that gets one tree out of the random forest algorithm as follows:
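library(randomForest)
# Fit the forest: new.df holds the predictors, group5 the response
testp.rf <- randomForest(new.df, group5, ntree = numoftree,
                         importance = TRUE, proximity = TRUE)
# Extract the first tree, with split variables shown by name
tree <- getTree(testp.rf, k = 1, labelVar = TRUE)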
It returns a data frame like the one in the table screenshot below. I would like to convert it to JSON with the correct tree hierarchy, similar to the one used for this D3 example. What is the quickest way to parse the R data frame into this JSON structure?
UPDATE: I added a "parents" column to the data frame to use as a factor, then wrote the data frame out as flat JSON that looks like this:
[{
"name": 1,
"parent": 0,
"left": 2,
"right": 3,
"split_var": "200862_at",
"split_point": 10.3106,
"isLeaf": 1,
"prediction": "NA"
}, /*...*/
]
I then used the solution in this answer to parse it in my JS into a hierarchical form that d3.layout.partition can consume. The resulting tree structure looks something like this:
{
"name": "1",
"parent": "0",
"left": "2",
"right": "3",
"split_var": "200862_at",
"split_point": "10.3106",
"isLeaf": "1",
"prediction": "NA",
"children": [{
"name": "2",
"parent": "1",
"left": "4",
"right": "5",
"split_var": "207708_at",
"split_point": "7.212135",
"isLeaf": "1",
"prediction": "NA",
"children": [{ /*...*/
/*...*/
}
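For reference, the nesting step amounts to roughly the following. This is a minimal sketch, not the linked answer's exact code: `nestByParent` is a name I made up, and it assumes the root row has a `parent` value (0 here) that matches no row's `name`.

```javascript
// Hypothetical helper: nest flat rows by their "parent" field.
// Assumption: exactly one row (the root) has a parent that is not in the list.
function nestByParent(rows) {
  const byName = new Map();
  // Copy each row so the input array is left untouched.
  for (const row of rows) {
    byName.set(String(row.name), { ...row });
  }
  let root = null;
  for (const node of byName.values()) {
    const parent = byName.get(String(node.parent));
    if (parent) {
      // Attach this node to its parent's children, creating the array on first use.
      (parent.children = parent.children || []).push(node);
    } else {
      root = node; // no matching parent, so treat it as the root
    }
  }
  return root;
}

// Trimmed-down rows in the same shape as the flat JSON above.
const flat = [
  { name: 1, parent: 0, left: 2, right: 3 },
  { name: 2, parent: 1, left: 0, right: 0 },
  { name: 3, parent: 1, left: 0, right: 0 },
];
const tree = nestByParent(flat);
console.log(JSON.stringify(tree, null, 2));
```

The result has the same shape as the structure shown above: each node keeps its original fields and gains a `children` array.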
However, this is still missing important decision tree information, such as the rule and size of each node. The desired output should look something like the following JSON, which I parsed from rpart output:
{
"name": "1",
"rule": "",
"children": [{
"name": "2",
"rule": "200862_at < 10.3106",
"children": [{
"name": "3",
"rule": "200862_at < 10.3106 & 207708_at < 7.212135",
"children": [{
"name": "4",
/*...*/
}]
I am generating the tree in R and parsing it in JS, but I'm open to solutions in either of the two.
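To make the goal concrete, here is a rough sketch of the kind of rule accumulation I have in mind on the JS side. It rests on several assumptions: `buildRuleTree` is a hypothetical name, the rows use the field names from my flat JSON, terminal nodes have 0 in both `left` and `right`, the root is named 1, and every split is numeric. It also writes `<=` for the left branch and `>` for the right, following what I understand to be randomForest's convention rather than the `<` shown in the rpart-derived output.

```javascript
// Hypothetical helper: walk left/right links and accumulate a rule per node.
// Assumptions: numeric splits only; left = "<= split_point", right = "> split_point";
// a row with left === 0 and right === 0 is a terminal node.
function buildRuleTree(rows, name = 1, rule = "") {
  const row = rows.find((r) => Number(r.name) === Number(name));
  if (!row) return null;

  const node = { name: String(row.name), rule };
  const left = Number(row.left);
  const right = Number(row.right);
  if (left === 0 && right === 0) {
    return node; // terminal node: nothing further to split on
  }

  // Extend the rule inherited from the ancestors with this node's split.
  const prefix = rule ? rule + " & " : "";
  node.children = [
    buildRuleTree(rows, left, `${prefix}${row.split_var} <= ${row.split_point}`),
    buildRuleTree(rows, right, `${prefix}${row.split_var} > ${row.split_point}`),
  ];
  return node;
}

// Illustrative rows: the two splits come from the JSON above, the leaves are placeholders.
const rows = [
  { name: 1, left: 2, right: 3, split_var: "200862_at", split_point: 10.3106 },
  { name: 2, left: 4, right: 5, split_var: "207708_at", split_point: 7.212135 },
  { name: 3, left: 0, right: 0 },
  { name: 4, left: 0, right: 0 },
  { name: 5, left: 0, right: 0 },
];
const ruleTree = buildRuleTree(rows);
console.log(JSON.stringify(ruleTree, null, 2));
```

This only covers the rules. As far as I can tell, the getTree data frame has no node-size column, so the sizes would have to come from somewhere else.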