
Is it possible to perform minibatch gradient descent in sklearn for logistic regression? I know there are the LogisticRegression model and SGDClassifier (which can use the log loss function). However, LogisticRegression is fitted on the whole dataset and SGDClassifier is fitted sample by sample (feel free to correct that statement, but this is how I understand stochastic gradient descent).

There is also the partial_fit method, but it is available only for SGDClassifier. I believe that if I use partial_fit, it still updates the weights each time it goes over the next sample (just like the normal fit method). So if I provide a chunk of 10 samples to partial_fit, it does 10 updates - but that is not what I want.
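
To make this concrete, here is a minimal sketch of how I am calling partial_fit on chunks (dummy data and chunk size, just for illustration; the loss is named "log" in older scikit-learn versions and "log_loss" in newer ones). As far as I understand, each call still performs one weight update per sample in the chunk:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# SGDClassifier with the logistic-regression loss
# ("log_loss" in recent scikit-learn, "log" in older versions)
clf = SGDClassifier(loss="log_loss")

X = np.random.rand(100, 4)             # dummy features
y = np.random.randint(0, 2, size=100)  # dummy binary labels
classes = np.unique(y)

chunk_size = 10
for start in range(0, len(X), chunk_size):
    X_chunk = X[start:start + chunk_size]
    y_chunk = y[start:start + chunk_size]
    # classes must be passed on the first partial_fit call
    clf.partial_fit(X_chunk, y_chunk, classes=classes)
```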

What I need is to update the weights after every nth sample, just like in minibatch gradient descent. From what I read about LogisticRegression, it has a warm_start option, which means that the weights from the previous fit call are used as the initial weights for the current fit.

If this understanding of warm_start is correct, could I just call the fit method multiple times, each time on only one minibatch? Or is there another way to do minibatch gradient descent in sklearn?
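
To show what I mean by the warm_start idea, this is roughly the loop I have in mind (a sketch with made-up data; I am not sure it is equivalent to minibatch gradient descent, because each fit call runs the solver on that batch rather than doing a single gradient step):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# warm_start=True reuses the coefficients from the previous fit call
# as the starting point for the next one (not supported by liblinear)
clf = LogisticRegression(warm_start=True, solver="lbfgs", max_iter=5)

X = np.random.rand(100, 4)             # dummy features
y = np.random.randint(0, 2, size=100)  # dummy binary labels

batch_size = 10
for start in range(0, len(X), batch_size):
    X_batch = X[start:start + batch_size]
    y_batch = y[start:start + batch_size]
    # note: each batch must contain samples from every class,
    # otherwise fit raises an error
    clf.fit(X_batch, y_batch)
```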

I found this question which is very similar, except that it does not discuss the warm_start idea, which is why I am asking again.

Jan Musil
  • The clear (and AFAIK correct) conclusion of the linked thread is "*There seems to be no mechanism in sklearn to do [mini] batch gradient descend*", and `warm_start` actually does not alter that fact (again, the fitting will be per sample); I'm closing this as a duplicate. – desertnaut May 27 '20 at 16:13
  • @desertnaut that's not correct, I believe, if you use `warm_start` with `LogisticRegression` (not with SGD!). I thought that if you fit each batch of data separately, then you get minibatch gradient descent, but I am not sure. However, you closed my question, so I will probably add this solution to the questions you linked here as duplicates – Jan Musil May 27 '20 at 16:20
  • `warm_start` in LR does not even use GD; you may use it to simulate "batch-like" learning with the solvers used in LR (except for `liblinear`), but this will no longer be GD, which is what your question is about. If this is not exactly what you meant to ask, you are always welcome to open a new question (you are not "penalized" in any way for questions closed as duplicates). – desertnaut May 27 '20 at 16:26

0 Answers