Context
I am using caret to fit and tune models. Typically, the best parameters are found using a resampling method such as cross-validation. Once the best parameters are chosen, a final model is fitted to the whole training data using the best set of parameters.
In addition to the parameters to tune (passed via tuneGrid), one can pass arguments to the underlying algorithm by supplying them to train.
My question
Is there any way to specify model-specific options to be used for the final model only?
For extra clarity: I do want to fit all the intermediate models (to obtain a reliable performance estimate) but I want to fit the final model with different arguments (in addition to the best parameters).
Specific use case
Let's say I want to fit a bartMachine model to some data and then use the final model in production. I would typically save the tuned model to disk and load it as needed. But I can only save/load a bartMachine model that has been serialized, i.e. I need to pass serialize=TRUE to bartMachine via caret::train.
But that serializes every model fitted during resampling, which is very impractical. I really only need to serialize the final model. Is there any way to do that?
library("caret")
library("bartMachine")

tgrid <- expand.grid(num_trees = 100,
                     k = c(2, 3),
                     alpha = 0.95,
                     beta = 2,
                     nu = 3)

# The printed log shows that all intermediate models are being serialized
fit <- train(hp ~ .,
             data = mtcars,
             method = "bartMachine",
             serialize = TRUE,
             tuneGrid = tgrid,
             trControl = trainControl(method = "cv", number = 5, verboseIter = TRUE))
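To make the goal concrete, here is a rough sketch of the behaviour I am after. It assumes that caret's custom-model interface passes the fit function a logical last argument that is TRUE only when the final model is fitted. I have not checked how well this works with bartMachine in practice, and bart_final_only is just a name made up for illustration.

```r
library("caret")
library("bartMachine")

# Start from caret's built-in bartMachine model definition.
# bart_final_only is a hypothetical name used only for this sketch.
bart_final_only <- getModelInfo("bartMachine", regex = FALSE)[[1]]

# Replace the fit function so that serialization follows the `last` flag.
# Assumption: caret sets last = TRUE only for the final fit on the full
# training set, and last = FALSE for every resampling fit.
bart_final_only$fit <- function(x, y, wts, param, lev, last, weights, classProbs, ...) {
  # bartMachine expects a data frame of predictors
  if (!is.data.frame(x)) x <- as.data.frame(x)
  bartMachine::bartMachine(X = x, y = y,
                           num_trees = param$num_trees,
                           k = param$k,
                           alpha = param$alpha,
                           beta = param$beta,
                           nu = param$nu,
                           serialize = last,  # serialize the final model only
                           ...)
}

tgrid <- expand.grid(num_trees = 100,
                     k = c(2, 3),
                     alpha = 0.95,
                     beta = 2,
                     nu = 3)

# Note: serialize is no longer passed to train(); passing it here as well
# would supply the argument to bartMachine twice.
fit <- train(hp ~ .,
             data = mtcars,
             method = bart_final_only,
             tuneGrid = tgrid,
             trControl = trainControl(method = "cv", number = 5, verboseIter = TRUE))
```

If that assumption holds, only fit$finalModel would be serialized, and the resampling fits would run without the extra overhead.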