I have trained a model on a dataset with the rf (random forest) method in caret. For example:

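ctrl <- trainControl(
  method = "LGOCV",
  # LGOCV takes its number of resamples from `number`;
  # `repeats` applies only to repeated k-fold CV.
  number = 3,
  savePredictions = TRUE,
  verboseIter = TRUE,
  # `thresh` only matters when "pca" is among the pre-processing steps.
  preProcOptions = list(thresh = 0.95)
)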

preProcessInTrain <- c("center", "scale")
metric_used <- "Accuracy"
model <- train(
  Output ~ ., data = training,
  method = "rf",
  trControl = ctrl,
  metric = metric_used,
  tuneLength = 10,
  preProc = preProcessInTrain
)

After that, I want to plot the decision tree, but when I write plot(model), I get this: [image: output of plot(model)]

If I write plot(model$finalModel), I get this: [image: output of plot(model$finalModel)]

Neither of these is the decision tree I would like to plot.

How can I do that? Thanks :)

Sandipan Dey

1 Answer

The model you are using is a random forest, which is not a single decision tree but an ensemble of a large number of trees. Plotting the final model therefore shows the error rate as the number of trees increases: the overall out-of-bag (OOB) error plus one line per class, something like the following.

[image: error rate versus number of trees, one line for the overall OOB error and one per class]
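
To tell the lines in that plot apart, a legend can be built from the column names of the forest's err.rate matrix. This is only a minimal sketch: it assumes the randomForest package and uses the built-in iris data (three classes) rather than the asker's training set, and the names rf and curves are made up for illustration.

```r
library(randomForest)

set.seed(42)
# Classification forest on iris; stands in for model$finalModel.
rf <- randomForest(Species ~ ., data = iris, ntree = 100)

# err.rate has one row per number of trees and one column per curve:
# "OOB" first, then one column per class level.
curves <- colnames(rf$err.rate)

plot(rf, main = "Error rate vs. number of trees")
# plot() draws the columns in order with the default colour and line-type
# sequences, so the same sequences are used to label them here.
legend("topright", legend = curves,
       col = seq_along(curves), lty = seq_along(curves))
```

With a caret model of method "rf", model$finalModel should be usable in place of rf here, provided it is a classification forest (regression forests have no err.rate).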

If you want a single decision tree instead, you may like to train a CART model like the following:

model <- train(
  Species ~ ., data = training,
  method = "rpart",
  trControl = ctrl,
  metric=metric_used,
  tuneLength = 10,
  preProc = preProcessInTrain
)
library(rpart.plot)
rpart.plot(model$finalModel)

Now plotting the final model as above will plot the decision tree for you.
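
If you would rather look inside the forest itself, one of its individual trees can be pulled out as a table of splits with getTree from the randomForest package. The sketch below again uses iris, not the asker's data, and the names rf and tree1 are illustrative. It gives a data frame, not a drawing, and any single tree from the forest is just one of many randomised trees, so it does not summarise the whole model.

```r
library(randomForest)

set.seed(42)
rf <- randomForest(Species ~ ., data = iris, ntree = 50)

# Extract the first tree (k = 1). labelVar = TRUE returns a data frame with
# variable names and class labels instead of numeric codes.
tree1 <- getTree(rf, k = 1, labelVar = TRUE)

# One row per node: child node ids, split variable, split point,
# status (-1 marks a terminal node) and the predicted class at leaves.
head(tree1)
```

For a caret fit, getTree(model$finalModel, k = 1, labelVar = TRUE) should work the same way, assuming method = "rf" was used.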

Sandipan Dey
    Thanks a lot!!! I have one question. What do the blue, green, black and red lines mean? I mean, what are the differences between them? – Alonso Albaladejo Rojo Sep 22 '16 at 10:33
    Thanks for the question. The black line represents the overall OOB error, whereas the other three colored lines represent the OOB error for each of the 3 classes in the training data (I used training data with 3 different class labels). – Sandipan Dey Sep 22 '16 at 10:43