I have been looking at ways to reduce the size of my trained models (namely this and this post) and came across the trim
parameter of caret's trainControl function. It was added in version 6.0-47; from the documentation:
If TRUE the final model in object$finalModel may have some components of the object removed so reduce the size of the saved object. The predict method will still work, but some other features of the model may not work. trimming will occur only for models where this feature has been implemented.
I realize the results of using trim may vary by method used. Is there a resource to determine what will be included and excluded from the final model after using the trim parameter? How much space could I expect to save? What (if any) functionality is lost?
In previous questions, it is unclear whether the parameter saves any space at all. For example, here is a simple case where trim=T and trim=F return objects of the same size using a random forest (method="rf"):
library(caret)
library(pryr)  # provides object_size()

# Replicate iris 10 times so the example is not too trivial
large_iris <- iris[rep(seq_len(nrow(iris)), 10), ]
object_size(large_iris) # 1.38 MB

# Model fitted without trimming
set.seed(1234)
mdl1 <- train(Species ~ ., data = large_iris, method = "rf",
              trControl = trainControl(trim = FALSE))
object_size(mdl1) # 1.24 MB
attributes(mdl1)

# Same model and seed, with trimming requested
set.seed(1234)
mdl2 <- train(Species ~ ., data = large_iris, method = "rf",
              trControl = trainControl(trim = TRUE))
object_size(mdl2) # 1.24 MB
attributes(mdl2)
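To see where any difference would show up, one option is to compare the two final models component by component rather than only looking at the total size. The sketch below is only an illustration: compare_component_sizes is a hypothetical helper I wrote for this question (it is not part of caret or pryr), and it uses base R's object.size, which can count memory differently from pryr::object_size.

```r
# Hypothetical helper (not part of caret): compares the size of each named
# component of two list-like objects, e.g. two fitted model objects.
compare_component_sizes <- function(a, b) {
  comps <- union(names(a), names(b))

  # Size in bytes of one named component, or NA if the component is absent
  size_of <- function(obj, nm) {
    if (nm %in% names(obj)) as.numeric(utils::object.size(obj[[nm]])) else NA_real_
  }

  out <- data.frame(
    component = comps,
    bytes_a   = vapply(comps, function(nm) size_of(a, nm), numeric(1)),
    bytes_b   = vapply(comps, function(nm) size_of(b, nm), numeric(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )

  # Flag components present in `a` but missing from `b`
  out$dropped_in_b <- is.na(out$bytes_b) & !is.na(out$bytes_a)

  # Largest components of `a` first
  out[order(-out$bytes_a), ]
}

# Toy illustration with plain lists standing in for model objects
full    <- list(coef = rnorm(10), data = rnorm(1000))
trimmed <- list(coef = full$coef)
compare_component_sizes(full, trimmed)
```

With the models above, the call would be compare_component_sizes(mdl1$finalModel, mdl2$finalModel). If no row is flagged as dropped and the byte counts match, that would suggest trimming did nothing for this method.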