Here is the R code I used to build a decision tree model and visualize it with a scatterplot. I was trying to plot the decision boundary, but I got the error message pasted after the code. I am not sure whether modeling Species as a function of all the other variables is causing this error. I would appreciate any recommendations for plotting a proper decision boundary.

    #load data
    data(iris)

    #set a seed so the random sample is reproducible
    set.seed(42)

    #randomly sample 100 of the 150 row indexes
    indexes <- sample(
      x = 1:150, 
      size = 100
    )

    #create a training set from indexes
    train <- iris[indexes,]

    #load decision tree package
    library(tree)

    #train a decision tree model
    model <- tree(Species ~ ., data = train)

    #visualize
    plot(model)
    text(model)

    #load color palette
    library(RColorBrewer)

    #create a scatterplot colored by species
    palette <- brewer.pal(3, "Set2")
    plot(
      x = iris$Sepal.Length,
      y = iris$Petal.Width,
      pch = 19,
      col = palette[as.numeric(iris$Species)],
      main = "Length vs Width",
      xlab = "Length",
      ylab = "Width")

    #plot the decision boundaries
    partition.tree(
      tree = model,
      label = "Species",
      add = TRUE)

Here is the error I get:

    Error in partition.tree(tree = model, label = "Species", add = TRUE) : 
      tree can only have one or two predictors

P.S. This is the RStudio version installed on my computer: Version 1.2.5033

sppradha
  • `tree` and others are not base R functions. When using functions that are not base R functions please start the scripts with a call to `library(pkgname)` in order to load the packages needed. – Rui Barradas Apr 17 '20 at 18:56
  • Is `train` a subset of built-in data set `iris`? – Rui Barradas Apr 17 '20 at 18:57
  • I did load the package first and then ran the R functions. And yes, `train` stands for training data, a subset of the built-in data set `iris`. – sppradha Apr 17 '20 at 19:12
  • Please see this: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example. A good question doesn't make the reader guess at which packages were used or where the data set came from. – Dave2e Apr 17 '20 at 19:20
  • Thanks for this Dave! I am new here so it's good to know the guidelines for posting a good question :) – sppradha Apr 17 '20 at 20:59
  • Since you have been kindly requested to do so already from the first comment here, you should have **edited & updated** the question accordingly by now. – desertnaut Apr 17 '20 at 21:42
  • The function `partition.tree` only allows a model with one or two predictors. In the vignette, they did something like this: `partition.tree(snip.tree(model, nodes = c(12, 7)))`, which means they snip off nodes 12 and 7 and their descendants, leaving only the part of the tree that splits on Petal.Length and Petal.Width. – StupidWolf Apr 17 '20 at 22:00
  • Check whether this is something you would like to do, if there is a different question to this, you can expand on it, and include examples – StupidWolf Apr 17 '20 at 22:01
  • @StupidWolf Thank you! I tried this and it worked. I appreciate your help again! – sppradha Apr 22 '20 at 13:35
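
An alternative to snipping the tree, sketched here under a few assumptions, is to fit the tree on only the two variables that are drawn on the scatterplot axes, so that `partition.tree` cannot see more than two predictors. The names `model2`, `used`, and `cols` are made up for this illustration, the choice of Petal.Length and Petal.Width is an assumption (any two numeric columns would do), and the overlay is only drawn when the fitted tree really splits on both variables.

```r
# Load the tree package (assumed to be installed)
library(tree)

# Same data and training sample as in the question
data(iris)
set.seed(42)
indexes <- sample(x = 1:150, size = 100)
train <- iris[indexes, ]

# Fit the tree on exactly the two variables that will be on the plot axes
model2 <- tree(Species ~ Petal.Length + Petal.Width, data = train)

# Variables the fitted tree actually splits on (leaf rows are dropped)
used <- setdiff(unique(as.character(model2$frame$var)), "<leaf>")
print(used)

# Scatterplot with the same two variables on the axes
cols <- c("red", "green3", "blue")
plot(
  x = iris$Petal.Length,
  y = iris$Petal.Width,
  pch = 19,
  col = cols[as.numeric(iris$Species)],
  main = "Petal length vs petal width",
  xlab = "Petal length",
  ylab = "Petal width")

# Overlay the partition; ordvars fixes which variable goes on the x axis
if (length(used) == 2) {
  partition.tree(
    tree = model2,
    ordvars = c("Petal.Length", "Petal.Width"),
    add = TRUE)
}
```

The scatterplot axes have to match the tree's predictors, otherwise the boundaries are drawn in the wrong coordinates. That is why the plot above uses Petal.Length rather than the Sepal.Length from the question.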
