
I am training multiple 'treebag' models in R. I loop through a data set, and on each iteration I define a specific subset based on a feature in the set and train on that subset. I could save each result to disk, but I was hoping to save all the models to a single data frame or data table. I am not sure if this is possible at all. The data frame/table would have columns of several classes (numeric and character), and I would also like to add a column holding the completed model.

To start, is it even possible to assign multiple models to a single column, where each model is assigned to a different row in a data frame or data table?

Any ideas on how this could work are greatly appreciated.

zx8754
Wes Sauder
  • You can assign output to a data frame. You can add the model object to a list, possibly to a column in a data frame. You have a variety of options. When you say save all models, do you mean the model object (fit) or the model's output (fit$finalModel)? – Ryan Morton May 12 '17 at 18:38
  • you should learn about lists – B Williams May 12 '17 at 18:38
  • Thanks - I have saved multiple models to a list before, but I was thinking of assigning them directly to the data frame instead. – Wes Sauder May 12 '17 at 18:40
  • Take a look here: https://stackoverflow.com/questions/9547518/create-a-data-frame-where-a-column-is-a-list – Ryan Morton May 12 '17 at 18:40
  • Also, I was thinking of adding all the components of the finished model object, but in reality I may only need the one component. Based on Ryan's comment - is that fit$finalModel? Could the rest be stripped away while the model remains functional? – Wes Sauder May 12 '17 at 18:41
  • I think the model must retain the whole object to be used later. If you were doing a regression model, you could extract the coefficients from the model and use those to construct your model elsewhere - I don't think RF could work that way. – Ryan Morton May 12 '17 at 19:25
  • Thanks Ryan, I think I am just going to add additional fields to the model to store the data and then save the models to disk, labelling each with a different name. I wanted to attach a few other reference data points, such as how long the model ran, feature importance from varImp(), and the confusion matrix. It appears I can just attach this data to the model. Appreciate the help here. – Wes Sauder May 12 '17 at 19:44
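
For reference, the list-column idea from the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: `iris` split by `Species` stands in for the real subsets, and base R's `lm()` stands in for the 'treebag' fit so the example runs without extra packages. The names `groups`, `results`, `elapsed`, and `model` are made up for the sketch; a caret `train()` object should be storable in the same way, since a list column can hold arbitrary R objects.

```r
# Stand-in data: one subset per level of a feature (assumption: iris/Species).
groups <- split(iris, iris$Species)

# One row per subset, with ordinary character/numeric columns.
results <- data.frame(
  subset = names(groups),
  n_rows = vapply(groups, nrow, integer(1)),
  stringsAsFactors = FALSE
)
results$elapsed <- NA_real_

# A list column: one slot per row, each able to hold a whole model object.
results$model <- vector("list", nrow(results))

for (i in seq_along(groups)) {
  # lm() is a placeholder for the real fit, e.g. a caret 'treebag' model.
  timing <- system.time(
    fit <- lm(Sepal.Length ~ Sepal.Width, data = groups[[i]])
  )
  results$elapsed[i] <- timing[["elapsed"]]  # scalar metadata in a normal column
  results$model[[i]] <- fit                  # full model object in the list column
}

# Retrieve a stored model by its label and use it for prediction.
setosa_fit <- results$model[[which(results$subset == "setosa")]]
preds <- predict(setosa_fit, newdata = groups[["setosa"]])

# The whole table, models included, can be written to a single file.
path <- tempfile(fileext = ".rds")
saveRDS(results, path)
restored <- readRDS(path)
```

Note the double brackets (`results$model[[i]]`) when reading or writing a single model; single brackets would return a one-element list instead. Other per-model objects, such as a variable-importance table or a confusion matrix, could presumably go into further list columns in the same way.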
