I am trying to prune a decision tree to create 19 trees with 2 to 20 terminal nodes, and I would like to calculate the training and test error for each one. I used this code:
range <- c(2:20)
for (i in range) {
  prune.fit <- prune.tree(fit, best = i)
  plot(prune.fit) # all the plots :)
  text(prune.fit, pretty = 0)
}
which generated the trees as expected, but it stopped working once I added the training and test error calculations. I then tried this:
for (i in range) {
  pred.fittrain[i] <- predict(prune.fit[i], newdata = my_ahp_train)
  mean((pred.fittrain - my_ahp_train$sale_price)^2)
  pred.fittest[i] <- predict(prune.fit[i], newdata = my_ahp_test)
  mean((pred.fittest - my_ahp_test$sale_price)^2)
}
but it just gave me an error. I don't know how to fix this so that it calculates the errors for each individual tree. If anyone has any tips, please let me know!
For the training and test error calculation, I tried the following versions of the code:
range <- c(2:20)
for (i in range) {
  prune.fit <- prune.tree(fit, best = i)
  plot(prune.fit) # all the plots :)
  text(prune.fit, pretty = 0)
  pred.fittrain[i] <- predict(prune.fit[i], newdata = my_ahp_train)
  mean((pred.fittrain - my_ahp_train$sale_price)^2)
  pred.fittest[i] <- predict(prune.fit[i], newdata = my_ahp_test)
  mean((pred.fittest - my_ahp_test$sale_price)^2)
}
and
range <- c(2:20)
for (i in range) {
  prune.fit <- prune.tree(fit, best = i)
  plot(prune.fit) # all the plots :)
  text(prune.fit, pretty = 0)
  pred.fittrain <- predict(prune.fit, newdata = my_ahp_train)
  mean((pred.fittrain - my_ahp_train$sale_price)^2)
  pred.fittest <- predict(prune.fit, newdata = my_ahp_test)
  mean((pred.fittest - my_ahp_test$sale_price)^2)
}
and
for (i in range) {
  pred.fittrain[i] <- predict(prune.fit[i], newdata = my_ahp_train)
  mean((pred.fittrain - my_ahp_train$sale_price)^2)
  pred.fittest[i] <- predict(prune.fit[i], newdata = my_ahp_test)
  mean((pred.fittest - my_ahp_test$sale_price)^2)
}
I was expecting one of these to generate the training and test errors for each decision tree.
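For reference, here is a minimal sketch of the kind of result I am after, assuming `fit` comes from the `tree` package (which is where `prune.tree` lives). It reflects three likely problems with the attempts above: `prune.fit` is a single tree, so `prune.fit[i]` picks out a component of that tree rather than the i-th tree; `predict()` returns a whole vector, which cannot be stored in the single element `pred.fittrain[i]`; and a bare `mean(...)` inside a `for` loop is neither printed nor saved. The names `pruned_tree_errors`, `mse`, `response`, and `sizes` are made up for illustration, and the plotting calls are left out to keep it short.

```r
library(tree)  # provides tree(), prune.tree() and the predict() method for trees

# Hypothetical helper: mean squared error of predictions against observed values.
mse <- function(pred, actual) mean((pred - actual)^2)

# Hypothetical helper: prune `fit` to each requested size and collect the
# training and test MSE in one data frame (one row per requested size).
pruned_tree_errors <- function(fit, train, test, response, sizes = 2:20) {
  out <- data.frame(size = sizes, train_mse = NA_real_, test_mse = NA_real_)

  for (k in seq_along(sizes)) {
    # Each iteration produces one pruned tree; it is used directly, not indexed.
    pruned <- prune.tree(fit, best = sizes[k])

    # predict() returns one prediction per row of newdata.
    pred_train <- predict(pruned, newdata = train)
    pred_test  <- predict(pruned, newdata = test)

    # Store the two scalars so they survive the loop.
    out$train_mse[k] <- mse(pred_train, train[[response]])
    out$test_mse[k]  <- mse(pred_test,  test[[response]])
  }

  out
}
```

With the objects from my code, this would be called as `errors <- pruned_tree_errors(fit, my_ahp_train, my_ahp_test, "sale_price")`, and `errors` would then hold one row per tree size. If the unpruned tree has fewer terminal nodes than a requested size, I would expect `prune.tree` to warn and return a tree of a different size than asked for, so some rows may repeat.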