The current scikit-learn LogisticRegression supports the multinomial setting, but only allows L2 regularization, since the L-BFGS-B and Newton-CG solvers it relies on support only that penalty. Andrew Ng has a paper that discusses why L2 regularization shouldn't be used with L-BFGS-B.

If I were to use sklearn's SGDClassifier with log loss and an L1 penalty, would that be the same as multinomial logistic regression with L1 regularization minimized by stochastic gradient descent? If not, are there any open-source Python packages that support an L1-regularized loss for multinomial logistic regression?
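
Concretely, the setup I have in mind is something like the sketch below (recent scikit-learn spells the loss `'log_loss'`; older versions use `'log'`):

# Sketch of the setup in question: SGD on the logistic loss with an L1 penalty.
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier

iris = load_iris()
clf = SGDClassifier(loss='log_loss', penalty='l1', alpha=0.01)
clf.fit(iris.data, iris.target)
print(clf.coef_.shape)  # (n_classes, n_features) -- but is each row just a one-vs-all binary model?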

user1879926

2 Answers

According to the SGDClassifier documentation:

For multi-class classification, a “one versus all” approach is used.

So SGDClassifier cannot perform multinomial logistic regression either: for a multi-class problem it fits one binary classifier per class rather than minimizing the multinomial loss.


You can use statsmodels.discrete.discrete_model.MNLogit, whose fit_regularized method supports L1 regularization.

The code below is adapted from this example:

import numpy as np
import statsmodels.api as sm
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split  # sklearn.cross_validation in older scikit-learn versions

iris = load_iris()
X = iris.data
y = iris.target
X = sm.add_constant(X, prepend=False)  # An intercept is not included by default and should be added by the user.
X_train, X_test, y_train, y_test = train_test_split(X, y)

mlogit_mod = sm.MNLogit(y_train, X_train)

# The regularization parameter alpha should be a scalar or have the same shape as results.params.
alpha = 1 * np.ones((mlogit_mod.K, mlogit_mod.J - 1))
alpha[-1, :] = 0  # Choose not to regularize the constant.

mlogit_l1_res = mlogit_mod.fit_regularized(method='l1', alpha=alpha)
y_pred = np.argmax(mlogit_l1_res.predict(X_test), 1)
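
To see the effect of the L1 penalty, you can check which coefficients were driven to (near) zero and how the model does on the held-out set, continuing from the code above:

# The L1 penalty drives some coefficients exactly to zero.
print(mlogit_l1_res.params)  # shape (K, J - 1): one column per non-base category
print('near-zero coefficients:', (np.abs(mlogit_l1_res.params) < 1e-8).sum())
print('test accuracy:', np.mean(y_pred == y_test))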

Admittedly, this library's interface is not as easy to use as scikit-learn's, but it provides more advanced statistical tools.

yangjie
  • Thanks! However, when I split into train and test sets, fitting the model with L1 regularization caused several problems, which [I found to be addressed here as well](http://stackoverflow.com/questions/31507396/mnlogit-in-statsmodel-returning-nan). After looking around some more, I found a package called lightning that does basically what sklearn does, but more. [For example, there is multinomial support for L1 regularization via SGD](http://www.mblondel.org/lightning/generated/lightning.classification.SGDClassifier.html#lightning.classification.SGDClassifier) (see the sketch after these comments). – user1879926 Aug 04 '15 at 04:44
  • I forgot to pass the `alpha` parameter, so it was actually fitting without regularization. Obviously, logistic regression without regularization cannot work in the linearly separable case. BTW, the package you found is really a good solution. – yangjie Aug 18 '15 at 17:03
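
For reference, here is a minimal sketch of the lightning approach mentioned in the comments. The parameter names (`loss='log'`, `penalty='l1'`, `multiclass=True`) are taken from the linked SGDClassifier documentation and should be verified against the installed version:

# Sketch using the lightning package linked in the comments above; per its docs,
# multiclass=True selects a direct multinomial formulation instead of one-vs-rest.
from lightning.classification import SGDClassifier
from sklearn.datasets import load_iris

iris = load_iris()
clf = SGDClassifier(loss='log', penalty='l1', multiclass=True, alpha=0.01)
clf.fit(iris.data, iris.target)
print(clf.score(iris.data, iris.target))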