I used the CHAID package from this link. It gives me a chaid object that can be plotted. Instead of a decision tree, I want a decision table with each decision rule in its own column, but I don't understand how to access the nodes and paths in this chaid object. Please help. I followed the procedure given in this link.

I can't post my data here because it is too large, so I am posting code that uses the sample dataset provided with CHAID to perform the task.

Copied from the CHAID help manual:

library("CHAID")

  ### fit tree to subsample
  set.seed(290875)
  USvoteS <- USvote[sample(1:nrow(USvote), 1000),]

  ctrl <- chaid_control(minsplit = 200, minprob = 0.1)
  chaidUS <- chaid(vote3 ~ ., data = USvoteS, control = ctrl)

  print(chaidUS)
  plot(chaidUS)

Output:

Model formula:
vote3 ~ gender + ager + empstat + educr + marstat

Fitted party:
[1] root
|   [2] marstat in married
|   |   [3] educr in <HS, HS, >HS: Gore (n = 311, err = 49.5%)
|   |   [4] educr in College, Post Coll: Bush (n = 249, err = 35.3%)
|   [5] marstat in widowed, divorced, never married
|   |   [6] gender in male: Gore (n = 159, err = 47.8%)
|   |   [7] gender in female
|   |   |   [8] ager in 18-24, 25-34, 35-44, 45-54: Gore (n = 127, err = 22.0%)
|   |   |   [9] ager in 55-64, 65+: Gore (n = 115, err = 40.9%)

Number of inner nodes:    4
Number of terminal nodes: 5

So my question is: how can I turn this tree into a decision table with each decision rule (branch/path) in its own column? I don't understand how to access the different tree paths in this chaid object.

sadhana
  • Please provide us with a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Roman Luštrik Feb 10 '15 at 09:53

2 Answers

The CHAID package uses partykit (recursive partitioning) tree structures. You can walk the tree through its party nodes: a node is either terminal or holds a list of child nodes, together with information about the decision rule (split) and the fitted data.
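To make the node structure concrete, here is a minimal sketch that inspects the root node of the sample tree from the question with partykit's accessor functions. It assumes the CHAID and partykit packages are installed; the variable names `root` and `sp` are only illustrative, and the fitted tree may differ from the printout in the question depending on the R version's random number generator.

```r
library("CHAID")
library("partykit")  # loaded explicitly; provides the node accessors used below

# Fit the sample tree from the question
set.seed(290875)
USvoteS <- USvote[sample(1:nrow(USvote), 1000), ]
ctrl <- chaid_control(minsplit = 200, minprob = 0.1)
chaidUS <- chaid(vote3 ~ ., data = USvoteS, control = ctrl)

root <- node_party(chaidUS)   # the root partynode of the tree
is.terminal(root)             # is this node a leaf?
length(kids_node(root))       # number of child nodes below the root

sp <- split_node(root)        # the split (decision rule) stored in the node
varid_split(sp)               # index of the split variable
index_split(sp)               # for each factor level, the child it is sent to

nodeids(chaidUS)                    # ids of all nodes
nodeids(chaidUS, terminal = TRUE)   # ids of the terminal nodes only
```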

The code below walks the tree and creates the decision table. It is written for demonstration purposes and tested only on one sample tree.

tree2table <- function(party_tree) {

  df_list <- list()
  var_names <-  attr( party_tree$terms, "term.labels")
  var_levels <- lapply( party_tree$data, levels)

  walk_the_tree <- function(node, rule_branch = NULL) {
    # depth-first walk on partynode structure (recursive function)
    # decision rules are extracted for every branch
    if(missing(rule_branch)) {
      rule_branch <- setNames(data.frame(t(replicate(length(var_names), NA))), var_names)
      rule_branch <- cbind(rule_branch, nodeId = NA)
      rule_branch <- cbind(rule_branch, predict = NA)
    }
    if(is.terminal(node)) {
      rule_branch[["nodeId"]] <- node$id
      rule_branch[["predict"]] <- predict_party(party_tree, node$id) 
      df_list[[as.character(node$id)]] <<- rule_branch
    } else {
      for(i in seq_len(length(node))) {
        rule_branch1 <- rule_branch
        val1 <- decision_rule(node,i)
        rule_branch1[[names(val1)[1]]] <- val1
        walk_the_tree(node[i], rule_branch1)
      }
    }
  }

  decision_rule <- function(node, i) {
    # returns the split decision rule as a named string: variable name and its values
    var_name <- var_names[node$split$varid[[1]]]
    values_vec <- var_levels[[var_name]][ node$split$index == i]
    values_txt <- paste(values_vec, collapse = ", ")
    return( setNames(values_txt, var_name))
  }
  # compile data frame list
  walk_the_tree(party_tree$node)
  # merge all dataframes
  res_table <- Reduce(rbind, df_list)
  return(res_table)
}

Call the function with the CHAID tree object:

table1 <- tree2table(chaidUS)

The result should look something like this:

gender   ager                       empstat   educr              marstat                          nodeId   predict  
-------- -------------------------- --------- ------------------ -------------------------------- -------- ---------
NA       NA                         NA        <HS, HS, >HS       married                          3        Gore     
NA       NA                         NA        College, Post Coll married                          4        Bush     
male     NA                         NA        NA                 widowed, divorced, never married 6        Gore     
female   18-24, 25-34, 35-44, 45-54 NA        NA                 widowed, divorced, never married 8        Gore     
female   55-64, 65+                 NA        NA                 widowed, divorced, never married 9        Gore
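
The table above has one rule per row. If each rule should instead sit in its own column, as the question asks, transposing with base R's `t()` is one option. The sketch below does not call `tree2table`; it uses a hand-built stand-in holding the first three rows of the table above, and the names `rules` and `rules_by_column` are only illustrative.

```r
# Stand-in for the first three rows of the decision table shown above
rules <- data.frame(
  gender  = c(NA, NA, "male"),
  ager    = NA_character_,
  empstat = NA_character_,
  educr   = c("<HS, HS, >HS", "College, Post Coll", NA),
  marstat = c("married", "married", "widowed, divorced, never married"),
  nodeId  = c(3L, 4L, 6L),
  predict = c("Gore", "Bush", "Gore")
)

# t() converts the data frame to a character matrix and swaps rows and columns
rules_by_column <- t(rules)

# Label each column with the id of the terminal node it describes
colnames(rules_by_column) <- paste0("node_", rules$nodeId)

print(rules_by_column)
```

Note that the transposed result is a character matrix rather than a data frame, so it is better suited to display than to further computation.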
bergant

First of all, thanks for this function. One small modification: to get the predicted class probabilities instead of the predicted class, replace predict_party(party_tree, node$id) with predict_party(party_tree, node$id, type = 'prob'). To get the probability of one specific class, index the result, for example predict_party(party_tree, node$id, type = 'prob')[1] or predict_party(party_tree, node$id, type = 'prob')[2].
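
As a related sketch, the class probabilities per terminal node can also be collected without walking the tree, using the `predict()` method with `type = "node"` and `type = "prob"` on the fitted tree. This is an alternative to editing `tree2table`, not the method used in the answer above; it assumes the CHAID and partykit packages are installed, and the names `node_of_obs`, `prob_of_obs` and `node_probs` are only illustrative.

```r
library("CHAID")
library("partykit")

# Fit the sample tree from the question
set.seed(290875)
USvoteS <- USvote[sample(1:nrow(USvote), 1000), ]
ctrl <- chaid_control(minsplit = 200, minprob = 0.1)
chaidUS <- chaid(vote3 ~ ., data = USvoteS, control = ctrl)

# Terminal node id and class probabilities for every training observation
node_of_obs <- predict(chaidUS, type = "node")
prob_of_obs <- predict(chaidUS, type = "prob")

# All observations in a node share the same probabilities,
# so keeping the unique rows gives one row per terminal node
node_probs <- unique(data.frame(nodeId = unname(node_of_obs), prob_of_obs,
                                check.names = FALSE, row.names = NULL))
node_probs <- node_probs[order(node_probs$nodeId), ]

print(node_probs)
```

The resulting data frame can be merged with the decision table on `nodeId` if both the rules and the probabilities are needed side by side.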