To illustrate the difference between $finalModel$predicted and the values computed by predict(), I set up the following code:
library(caret)
library(randomForest)
dat <- data.frame(target = c(2.5, 4.5, 6.1, 3.2, 2.2),
                  A = c(1.3, 4.4, 5.5, 6.7, 8.1),
                  B = c(44.5, 50.1, 23.7, 89.2, 10.5),
                  C = c("A", "A", "B", "B", "B"))
control <- trainControl(method = "repeatedcv", number = 10, repeats = 3,
                        search = "grid", savePredictions = TRUE)
tunegrid <- expand.grid(.mtry = 1:3)
set.seed(42)
rf_gridsearch <- train(target ~ A + B + C,
                       data = dat,
                       method = "rf",
                       ntree = 2500,
                       metric = "RMSE",
                       tuneGrid = tunegrid,
                       trControl = control)
dat$pred_caret <- rf_gridsearch$finalModel$predicted
dat$pred <- predict(object = rf_gridsearch, newdata = dat[,2:4])
dat$pred2 <- predict(object = rf_gridsearch$finalModel, newdata = dat[,2:4])
The last line of this code fails with the error message:
Error in predict.randomForest(object = rf_gridsearch$finalModel,
newdata = dat[, : variables in the training data missing in newdata
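My current guess about the error, as a minimal base-R sketch: train() with a formula may expand the character column C into a dummy variable before fitting, so $finalModel would expect a column named CB rather than C. The object names newdat and x_new are only for illustration, and the commented-out predict() call is an assumption I have not confirmed.

```r
# Same predictor values as in dat above
newdat <- data.frame(A = c(1.3, 4.4, 5.5, 6.7, 8.1),
                     B = c(44.5, 50.1, 23.7, 89.2, 10.5),
                     C = c("A", "A", "B", "B", "B"))

# Expand C into a dummy column the way a formula would,
# then drop the intercept column
x_new <- model.matrix(~ A + B + C, data = newdat)[, -1, drop = FALSE]
colnames(x_new)  # "A" "B" "CB"

# Assumption (not verified): if finalModel was fitted on these columns,
# this call should no longer report missing variables:
# predict(rf_gridsearch$finalModel, newdata = x_new)
```

If that is right, the column names of x_new should match the variable names stored in rf_gridsearch$finalModel.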
How is it possible to use $finalModel with predict?
Why does the data in column dat$pred_caret differ from dat$pred? What is the difference between the two predictions?
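One thing I noticed in the randomForest documentation is that the predicted component holds out-of-bag predictions, which might explain the gap. The sketch below uses randomForest directly, without caret, on the two numeric predictors only; the names toy, rf, oob and in_sample are made up for illustration, and I am assuming the same holds for the model stored in $finalModel.

```r
library(randomForest)

toy <- data.frame(target = c(2.5, 4.5, 6.1, 3.2, 2.2),
                  A = c(1.3, 4.4, 5.5, 6.7, 8.1),
                  B = c(44.5, 50.1, 23.7, 89.2, 10.5))

set.seed(42)
rf <- randomForest(target ~ A + B, data = toy, ntree = 500)

# Out-of-bag predictions: each row is predicted only by trees
# whose bootstrap sample did not contain that row
oob <- rf$predicted

# Predictions from all trees, including those that saw the row in training
in_sample <- predict(rf, newdata = toy)

# Side-by-side comparison of the two sets of predictions
cbind(oob, in_sample)
```

Calling predict(rf) without newdata returns the same out-of-bag values as rf$predicted, so only the call with newdata uses every tree for every row.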