I'm new to statistics and I'm trying to run a stepwise multiple regression with categorical predictors using the train() function from the caret package, but I don't think I'm doing it correctly. Here is my code:
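# Stepwise multiple regression
library(caret)  # provides train() and trainControl(); "leapBackward" also needs the leaps package installed
set.seed(123)
# Set up 10-fold cross-validation (method = "cv" is not repeated)
train.control <- trainControl(method = "cv", number = 10)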
# Train the model
step.model <- train(Rebreeding_Score ~ ., data = dfp1,
                    method = "leapBackward",
                    tuneGrid = data.frame(nvmax = 1:5),
                    trControl = train.control
)
step.model$results
step.model$bestTune
summary(step.model$finalModel)
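# Coefficients of the selected model (use the tuned size rather than a hard-coded 5)
coef(step.model$finalModel, step.model$bestTune$nvmax)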
The function seems to select individual levels (categories) of a categorical predictor rather than keeping or dropping the predictor as a whole. I hope I'm explaining this correctly.
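To illustrate what I mean, here is a minimal sketch with made-up data (the `toy` data frame and its values are hypothetical, not my real dfp1). As far as I can tell, the formula interface expands each factor into separate 0/1 dummy columns, which I suspect is why levels get selected one at a time:

```r
# Hypothetical toy data, only to show how factors are expanded
toy <- data.frame(
  Rebreeding_Score = c(1.2, 2.5, 3.1, 2.0, 1.8, 2.9),
  Cohort           = factor(c("A", "B", "C", "A", "B", "C")),
  mating_group     = factor(c("x", "y", "x", "y", "x", "y"))
)

# Build the design matrix the way a formula interface would:
# each factor becomes one dummy column per non-reference level
X <- model.matrix(Rebreeding_Score ~ ., data = toy)
colnames(X)
# "(Intercept)" "CohortB" "CohortC" "mating_groupy"
```

So Cohort shows up as two separate columns (CohortB and CohortC), and I assume the subset selection treats each of them as its own candidate variable.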
Ideally, the final multiple regression model would look like this, with each categorical predictor entering as a whole:
Rfinal <- lm(Rebreeding_Score ~ Cohort + mating_group, data = dfp1, na.action = na.omit)
summary(Rfinal)
Any help would be greatly appreciated. Thank you.