I have a number of trees; when printed, each is 7 pages long. I've had to rebalance the data and need to look at the branches with the highest frequency to see if they make sense, because I need to identify a cancellation rate for different clusters.
Given the output is so long, what I need is a list of the biggest branches so that I can validate those rather than go through 210 branches manually. I will have lots of trees, so I need to automate this to pick out the important results.
Example code to use:
library(CHAID)
updatecars<-mtcars
updatecars$cyl<-as.factor(updatecars$cyl)
updatecars$vs<-as.factor(updatecars$vs)
updatecars$am<-as.factor(updatecars$am)
updatecars$gear<-as.factor(updatecars$gear)
carsChaid<-chaid(am~ cyl+vs+gear, data=updatecars)
plot(carsChaid)
carsChaid
When you print this tree, you see n=15 for the first group. I need a table where I can sort on this value.
What I need is a table of the decision tree with the variable values for each branch and the number of observations within each group. This is not exactly the same as this answer, Walk a tree, as that doesn't give the number within each group, but I think it's in the right direction.
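To make the request concrete, here is a rough sketch of the kind of table I mean. It assumes the object returned by chaid() is a partykit party object, so that predict(..., type = "node") returns the terminal node for each row. It also leans on partykit:::.list.rules.party, which is an internal, unexported helper and may change or disappear between versions. The names branches, n, rule and rate_am1 are only illustrative.

```r
library(CHAID)
library(partykit)

# Assumes carsChaid and updatecars from the example above.

# Terminal node id that each training row falls into
node <- predict(carsChaid, type = "node")

# Split conditions leading to each terminal node, named by node id.
# NOTE: internal partykit helper, not part of the documented API.
rules <- partykit:::.list.rules.party(carsChaid)

# Number of observations per terminal node
sizes <- table(node)

branches <- data.frame(
  node = names(sizes),
  n    = as.vector(sizes),
  rule = unname(rules[names(sizes)]),
  stringsAsFactors = FALSE
)

# Share of rows with am == "1" in each node (stand-in for a cancellation rate);
# tapply() and table() both order by node id, so the rows line up
branches$rate_am1 <- as.vector(tapply(updatecars$am == "1", node, mean))

# Biggest branches first
branches <- branches[order(-branches$n), ]
branches
```

Something along these lines, wrapped in a function and applied to each tree, would let me check only the top few rows per tree.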
Can someone help?
Thanks,
James