
I have to run my model on a cluster that has a time limit of 7 days; if the computation exceeds 7 days, the job is killed on the cluster. Training is then never completed, so no saved model is available for prediction.

I am training some classifier models from scikit-learn (such as SVC and KNeighborsClassifier) and am wondering if there is any function or library for snapshotting the model at regular intervals and then continuing training from the point where it stopped (similar to what is done in deep learning)?

Thanks

S.EB

1 Answer


In general, taking snapshots during fitting is not possible in scikit-learn; the library offers only limited persistence features. Some models can be trained incrementally, but others cannot, and for the models where it is possible you will have to write varying amounts of boilerplate code.

The models listed under incremental learning have a warm_start parameter and/or a .partial_fit() method for this purpose. You can call partial_fit in a loop over batches of data. In addition, you need to write the code that stores and retrieves the training progress and the partially trained model (see this question for additional information on model persistence).
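A minimal sketch of such a checkpointing loop, using SGDClassifier (one of the estimators that supports partial_fit) and joblib for persistence; the checkpoint filename, batch size, and toy data are made-up placeholders:

```python
import os
import numpy as np
from joblib import dump, load
from sklearn.linear_model import SGDClassifier

CHECKPOINT = "sgd_checkpoint.joblib"  # hypothetical checkpoint path

# Toy data; in practice you would stream batches from disk.
rng = np.random.RandomState(0)
X = rng.randn(1000, 20)
y = (X[:, 0] > 0).astype(int)
classes = np.unique(y)

# Resume from a checkpoint if one exists, otherwise start fresh.
if os.path.exists(CHECKPOINT):
    clf, start_batch = load(CHECKPOINT)
else:
    clf, start_batch = SGDClassifier(random_state=0), 0

batch_size = 100
n_batches = len(X) // batch_size
for i in range(start_batch, n_batches):
    Xb = X[i * batch_size:(i + 1) * batch_size]
    yb = y[i * batch_size:(i + 1) * batch_size]
    # classes must be passed so partial_fit knows all labels up front
    clf.partial_fit(Xb, yb, classes=classes)
    # snapshot the model together with the loop progress after each batch
    dump((clf, i + 1), CHECKPOINT)
```

If the job is killed mid-training, rerunning the same script reloads the last snapshot and resumes from the next unprocessed batch.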

Some models (especially ensembles, like Random Forests) can in principle be merged. So instead of incrementally training one model you train multiple independent model instances in a loop and merge them afterwards. However, the scikit-learn API does not support such merging as far as I know. So while it is possible to do so it requires hacking private attributes and in-depth knowledge of the model's math and implementation.
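To illustrate the merging idea for Random Forests: the sketch below trains two independent forests (as two separate jobs might) and concatenates their fitted trees. Note that estimators_ and n_estimators are internal fitted attributes, so this is exactly the kind of unsupported hack described above, not a stable API:

```python
from copy import deepcopy

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy data standing in for the real training set.
rng = np.random.RandomState(0)
X = rng.randn(500, 10)
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Two forests trained independently (e.g. in separate cluster jobs),
# each on the same feature space and label set.
rf_a = RandomForestClassifier(n_estimators=50, random_state=1).fit(X, y)
rf_b = RandomForestClassifier(n_estimators=50, random_state=2).fit(X, y)

# Merge by copying one fitted forest and appending the other's trees.
# This touches private fitted attributes and may break between versions.
merged = deepcopy(rf_a)
merged.estimators_ += rf_b.estimators_
merged.n_estimators = len(merged.estimators_)
```

Predictions from `merged` then average over all 100 trees. This only makes sense because each tree in a forest is trained independently; it does not transfer to models whose components depend on one another (e.g. boosted ensembles).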

MB-F
  • Thank you. Isn't incremental learning available only for a few classifiers (according to the list)? For example, this method cannot be performed on an SVC model, right? – S.EB Aug 15 '19 at 12:48
  • @S.EB exactly. SVC is a problem; it does not support incremental learning and I don't think there is a way to merge independently trained SVCs. As a rule of thumb: only models that use iterative training (e.g. gradient descent) can be trained incrementally. – MB-F Aug 15 '19 at 14:05