I have run into some major issues with statsmodels' logistic regression. I have worked with many people to resolve them, but without any luck, so I would like to switch to Scikit Learn. The problem I am facing is that, as far as I understand, linear and logistic regression in Scikit Learn require splitting the data into training and testing sets, which is typical of an ML workflow. However, I am not trying to build an ML model. I simply want the coefficients and p values from a logistic regression in which many independent variables are used to predict a single dependent variable. In statsmodels, this is straightforward.
I would like to do the same in Scikit Learn.
The following is my statsmodels code that I would like to translate over to Scikit Learn:
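import pandas as pd
import statsmodels.formula.api as smf

# x1, x2, x3 are the independent variables; x4 is the binary dependent variable
df = pd.DataFrame({'x1': [10, 11, 0, 14],
                   'x2': [12, 0, 1, 24],
                   'x3': [0, 65, 3, 2],
                   'x4': [0, 0, 1, 0]})

# Fit on the full dataset (no train/test split)
model = smf.logit(formula='x4 ~ x1 + x2 + x3', data=df).fit()

# summary() prints the coefficients and p values; print(model) only shows the object
print(model.summary())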
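
For reference, here is a minimal sketch of fitting the same model in Scikit Learn on the full dataset, with no train/test split. It assumes the default `LogisticRegression` settings, which apply L2 regularisation, so the coefficients will generally not match the statsmodels output. As far as I can tell, `LogisticRegression` does not report p values at all, so this covers only the coefficient half of what I need. The names `clf` and `coef_table` are just illustrative.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.DataFrame({'x1': [10, 11, 0, 14],
                   'x2': [12, 0, 1, 24],
                   'x3': [0, 65, 3, 2],
                   'x4': [0, 0, 1, 0]})

X = df[['x1', 'x2', 'x3']]  # independent variables
y = df['x4']                # binary dependent variable

# Fit on all rows; nothing in the API forces a train/test split.
# Assumption: default settings, which include an L2 penalty.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)

# One coefficient per independent variable, plus a separate intercept
coef_table = pd.Series(clf.coef_[0], index=X.columns, name='coefficient')
print(coef_table)
print('intercept:', clf.intercept_[0])
```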