I'm using caret to train a gbm model in R. I've used the formula interface to exclude certain variables from my model:
    gbmTune <- train(Outcome ~ . - VarA - VarB - VarC, data = train,
                     method = "gbm",
                     metric = "ROC",
                     tuneGrid = gbmGrid,
                     trControl = cvCtrl,
                     verbose = FALSE)
When I call predict() on my test set, R complains about new factor levels in a variable that I asked to be excluded. The only workaround I've found is to set those variables to NULL (that is, remove them from the data frame) before training the model. That doesn't seem like the right answer.
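For reference, here is a minimal sketch of that workaround using made-up data. The names `train_df`, `test_df`, `VarD`, and `drop_cols` are placeholders for illustration, not my real objects:

```r
# Hypothetical stand-ins for the real training and test sets.
train_df <- data.frame(
  Outcome = factor(c("yes", "no", "yes", "no")),
  VarA    = factor(c("a", "b", "a", "b")),
  VarB    = 1:4,
  VarC    = c(0.1, 0.2, 0.3, 0.4),
  VarD    = c(5, 3, 8, 1)
)
test_df <- data.frame(
  Outcome = factor(c("yes", "no")),
  VarA    = factor(c("a", "c")),   # "c" is a level never seen in training
  VarB    = 5:6,
  VarC    = c(0.5, 0.6),
  VarD    = c(2, 7)
)

# Columns I want the model to ignore.
drop_cols <- c("VarA", "VarB", "VarC")

# Assigning NULL to the columns removes them from the data frame.
train_kept <- train_df
train_kept[drop_cols] <- NULL
test_kept <- test_df
test_kept[drop_cols] <- NULL

# Only Outcome and VarD remain, so the formula can simply be Outcome ~ .
print(names(train_kept))
```

With the columns gone, `Outcome ~ .` can be passed to train(), and predict() should no longer see the excluded factor. It does mean editing both data frames by hand, which is the part that feels wrong to me.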
I'm fairly new at this, so I would love to know what I'm doing wrong!