Converting statsmodels.api format to Scikit Learn format

Question

There are some major issues with statsmodels' logistic regression. I have worked with many people to resolve them, but we have had no luck. For that reason, I would like to switch to Scikit Learn. The issue I am facing with Scikit Learn is that, as far as I understand, linear and logistic regression in this library require partitioning the data into training and testing sets. That is typical of ML, but I am not trying to build an ML model. I simply want the p-values and coefficients from a multivariate logistic regression. Note that I have many independent variables that are used to predict a single dependent variable. In statsmodels, this is straightforward.

I would like to do the same in Scikit Learn.

The following is my statsmodels code that I would like to translate over to Scikit Learn:

import pandas as pd
import statsmodels.formula.api as smf

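# Three predictors (x1, x2, x3) and one binary outcome (x4)
df = pd.DataFrame({'x1': [10, 11, 0, 14],
                   'x2': [12, 0, 1, 24],
                   'x3': [0, 65, 3, 2],
                   'x4': [0, 0, 1, 0]})

# fit() returns a results object; summary() reports coefficients and p-values
model = smf.logit(formula='x4 ~ x1 + x2 + x3', data=df).fit()
print(model.summary())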

You can try using the code for p-values from here: https://stackoverflow.com/questions/25122999/scikit-learn-how-to-check-coefficients-significance — hellpanderr, Aug 18 '19 at 00:06
I need to solve the problem stated in the question first, before getting p-values and coefficients. The problem is doing multivariate logistic regression using Scikit Learn. — Zakariah Siyaji, Aug 18 '19 at 00:21
scikit-learn is focused on prediction, not inference. I am not sure that you can get p-values from sklearn `LogisticRegression` models. It is relatively straightforward to run a model, though. You have to pass arrays to the function. So put x1-x3 in one array and x4 in another, or use `[` within the function call to subset on the fly. The package has great [documentation](https://scikit-learn.org/stable/index.html), including a search bar. — lmo, Aug 18 '19 at 15:06
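A minimal sketch of what lmo's comment describes, reusing the `df` from the question: the predictors x1-x3 go in one array-like, the outcome x4 in another, and the model is fit on all rows with no train/test split. The variable names `X`, `y`, and `clf` and the `max_iter=1000` setting are assumptions made for illustration. Note that `LogisticRegression` applies L2 regularization by default, so its coefficients will generally not match the unpenalized maximum-likelihood estimates from statsmodels, and it does not report p-values.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.DataFrame({'x1': [10, 11, 0, 14],
                   'x2': [12, 0, 1, 24],
                   'x3': [0, 65, 3, 2],
                   'x4': [0, 0, 1, 0]})

# Subset the predictors and the outcome "on the fly" with [
X = df[['x1', 'x2', 'x3']]
y = df['x4']

# Fit on the full data set; a train/test split is optional, not required.
# Default settings use an L2 penalty (C=1.0), unlike statsmodels' logit.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)

# The intercept and one coefficient per predictor
print(clf.intercept_)
print(dict(zip(X.columns, clf.coef_[0])))
```

Splitting into training and testing sets is a convention for evaluating predictive performance, not something `fit` itself requires.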
