16

I'm looking for a good implementation of logistic regression (not regularized) in Python. I'm looking for a package that can also take a weight for each vector. Can anyone suggest a good implementation / package? Thanks!

user5497
  • possible duplicate of http://stackoverflow.com/questions/3754051/python-or-sql-logistic-regression – Mansuro Sep 22 '11 at 10:13
  • 1
    Nothing relevant in this post, I've also tried using scipy, but couldn't find any use of weights... – user5497 Sep 22 '11 at 10:16

5 Answers

27

I notice that this question is quite old now, but hopefully this can help someone. With sklearn, you can use the SGDClassifier class to create a logistic regression model by simply passing in 'log' as the loss (scikit-learn 1.1 and later spell this 'log_loss'):

sklearn.linear_model.SGDClassifier(loss='log', ...).

This class implements weighted samples in the fit() function:

classifier.fit(X, Y, sample_weight=weights)

where weights is an array containing the sample weights, which must have the same length as the number of data points in X.

See http://scikit-learn.org/dev/modules/generated/sklearn.linear_model.SGDClassifier.html for full documentation.
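
For anyone who wants to try this end to end, here is a minimal sketch on made-up data. The synthetic X, Y and weights arrays and the version check are my own assumptions for illustration: the loss was renamed from 'log' to 'log_loss' in scikit-learn 1.1, so the snippet picks whichever name the installed release should accept. Keep in mind that SGDClassifier applies an L2 penalty by default, so this is not strictly the unregularized model the question asks for.

```python
import re

import numpy as np
import sklearn
from sklearn.linear_model import SGDClassifier

# Pick the loss name for the installed release:
# 'log' in older versions, 'log_loss' from scikit-learn 1.1 onward.
major, minor = (int(g) for g in re.match(r"(\d+)\.(\d+)", sklearn.__version__).groups())
loss_name = "log_loss" if (major, minor) >= (1, 1) else "log"

# Synthetic, illustrative data: two features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
Y = (X[:, 0] + X[:, 1] > 0).astype(int)

# One weight per data point, same length as X.
weights = rng.uniform(0.5, 2.0, size=len(X))

classifier = SGDClassifier(loss=loss_name, max_iter=1000, tol=1e-3, random_state=0)
classifier.fit(X, Y, sample_weight=weights)

print(classifier.coef_, classifier.intercept_)
print(classifier.predict_proba(X[:3]))  # class probabilities for the first three rows
```

Samples with a larger weight contribute proportionally more to each gradient update, which is what makes this a weighted fit.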

akxlr
William Darling
  • 4
    supported by Olivier Grisel https://twitter.com/ogrisel/status/476367379413610497 – r0u1i Jun 10 '14 at 14:22
  • 1
    This uses one-vs-rest for multiclass problems and doesn't look like it supports the `multi_class='multinomial'` option in `LogisticRegression` – akxlr Aug 27 '15 at 14:17
7

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

from sklearn.linear_model import LogisticRegression

model = LogisticRegression(class_weight='balanced')

model = model.fit(X, y)
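
To make the "balanced" formula concrete, here is a small sketch that evaluates it by hand with NumPy. The label array y is a hypothetical example (six samples of class 0, two of class 1), not data from the question.

```python
import numpy as np

# Hypothetical labels: 6 samples of class 0 and 2 samples of class 1.
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])

n_samples = len(y)
n_classes = len(np.unique(y))

# The formula quoted above: n_samples / (n_classes * np.bincount(y))
class_weights = n_samples / (n_classes * np.bincount(y))

# Expand the per-class weights to one weight per sample.
per_sample_weights = class_weights[y]

print(class_weights)
print(per_sample_weights)
```

Here class 0 gets 8 / (2 * 6) ≈ 0.667 and class 1 gets 8 / (2 * 2) = 2.0, so the rarer class counts for more in the loss.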

EDIT

Sample weights can be passed to the fit method through its sample_weight argument. You just have to pass an array of length n_samples. See the documentation:

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression.fit
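
A minimal sketch of per-sample weights with LogisticRegression, on made-up data. The arrays here are synthetic assumptions, and the very large C is my own way of making the default L2 penalty negligible so the fit is close to the unregularized one the question asks for; it is an approximation, not an exact switch-off.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic, illustrative data with some label noise so the classes overlap
# (on perfectly separable data an almost unpenalized fit would not settle).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# One weight per sample, i.e. an array of length n_samples.
sample_weight = rng.uniform(0.5, 2.0, size=len(X))

# C is the inverse regularization strength; a huge value makes the penalty negligible.
model = LogisticRegression(C=1e6, max_iter=1000)
model.fit(X, y, sample_weight=sample_weight)

print(model.coef_, model.intercept_)
```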

Hope this does it...

Vivek Kalyanarangan
  • This refers to class imbalance, but what if we want to use a separate weight for each sample? – mrgloom Mar 17 '16 at 12:07
  • Good question @mrgloom ! You can specify the weights by supplying a dict of weights instead of "balanced". Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one. – Vivek Kalyanarangan Mar 18 '16 at 18:25
  • 2
    I need a separate weight for each sample, not for each class. – mrgloom Mar 19 '16 at 13:53
  • I don't think that comes off the shelf. You may have to write your own version of the cost function and gradient descent update to do that. – Vivek Kalyanarangan Mar 20 '16 at 19:02
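
For the "write your own cost function and gradient descent update" route mentioned in the last comment, a rough sketch in plain NumPy could look like the following. The function name fit_weighted_logistic, the learning rate, the iteration count and the synthetic data are all hypothetical choices of mine; this is plain batch gradient descent on the sample-weighted negative log-likelihood with no regularization, not how any particular library does it.

```python
import numpy as np


def fit_weighted_logistic(X, y, sample_weight, lr=0.5, n_iter=5000):
    """Unregularized logistic regression with per-sample weights.

    Minimizes sum_i w_i * logloss(y_i, sigmoid(x_i . beta)) / sum_i w_i
    by batch gradient descent. Returns [intercept, coef_1, ..., coef_d].
    """
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta = np.zeros(X1.shape[1])
    total = sample_weight.sum()
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X1 @ beta)))           # predicted probabilities
        grad = X1.T @ (sample_weight * (p - y)) / total  # weighted gradient
        beta -= lr * grad
    return beta


# Synthetic, illustrative data with overlapping classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(float)
sample_weight = rng.uniform(0.5, 2.0, size=len(X))

w = fit_weighted_logistic(X, y, sample_weight)
print(w)  # intercept followed by the two feature coefficients
```

The only change from the unweighted update is the factor sample_weight inside the gradient, so a sample with weight 2 pulls on the coefficients as hard as two identical samples with weight 1.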
4

I think what you want is statsmodels. It has great support for GLM and other linear methods. If you're coming from R, you'll find the syntax very familiar.

statsmodels weighted regression

getting started w/ statsmodels

Greg
  • Will this statsmodels solution also provide the p-values for each dependent variable? – Sapiens Mar 17 '21 at 00:34
  • Seems to only have weighted linear regression, not logistic. "w is not yet supported (i.e. w=1), in the future it might be var_weights" – daknowles Apr 26 '22 at 01:38
0

Have a look at scikits.learn logistic regression implementation

ohe
  • `sklearn.linear_model.LogisticRegression` is a class; its `fit` method lets you define weights. – ohe Sep 23 '11 at 16:10
  • @ohe how? I have found the `fit` method, but it only accepts parameters for labels and features. Not weights. – Kent Munthe Caspersen Sep 22 '15 at 12:26
  • @KentMuntheCaspersen my answer is quite old! At that time the `fit` method took a `class_weight` parameter. It is now located in the `__init__`. It might be what you're looking for. – ohe Sep 22 '15 at 12:42
  • @ohe That explains a lot. Thanks for coming back 4 years later. I think the question is about weighted instances for training, and not just class weights. At least, that is what I was searching for. – Kent Munthe Caspersen Sep 22 '15 at 16:48
-5

Do you know Numpy? If not, also take a look at Scipy and matplotlib.

gunzapper
  • 3
    Neither Scipy nor Numpy has any logistic regression implementation (or I couldn't find any...). matplotlib is mostly used for graphs, drawings, etc... – user5497 Sep 22 '11 at 10:20
  • Thanks! I saw it, however it implements L2 regularized logistic regression (and not regular logistic regression), and in addition it didn't implement weights... – user5497 Sep 22 '11 at 12:33