import statsmodels.api as sm
from scipy import stats


X2 = sm.add_constant(toTrainX)
est = sm.OLS(toTrainY, X2)
est2 = est.fit()
print(est2.summary())

This gives me a holistic picture of the model, like:

                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept       3.2        0.01      21311      0.000         3.1         3.3
X1             13.2        0.01      21311      0.000        13.1        13.3
X2             33.2        0.11         12      0.400        13.1       213.3
==============================================================================
Omnibus:                     764.278   Durbin-Watson:                   2.013
Prob(Omnibus):                 0.000   Jarque-Bera (JB):             8512.556
Skew:                          0.185   Prob(JB):                         0.00
Kurtosis:                      2.878   Cond. No.                     1.22e+17

How can I get the same output for my scikit-learn LinearRegression model?

linreg = LinearRegression().fit(X_train, y_train)
# linreg.coef_ and linreg.intercept_
# also do not match est2.summary()
user2458922

1 Answer


Based on the responses to this post, it is not immediately possible with base scikit-learn. However, one answer references this file, which extends the LinearRegression class to compute t-statistics and p-values for each coefficient.

As for why they differ, this post attempts to explain why some difference in results is to be expected.
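A minimal sketch of that approach is below. The class name `LinearRegressionWithStats` and the formulas are illustrative (not copied from the linked file); it computes the classical OLS standard errors from the residual variance and the inverse of the design matrix's Gram matrix, then derives t-statistics and two-sided p-values:

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression


class LinearRegressionWithStats(LinearRegression):
    """LinearRegression extended with t-statistics and p-values.

    Illustrative sketch: computes classical (non-robust) OLS inference
    for the intercept plus each coefficient, mirroring statsmodels'
    default summary columns.
    """

    def fit(self, X, y):
        super().fit(X, y)
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        n, k = X.shape

        # Prepend an intercept column, like sm.add_constant does.
        X_design = np.column_stack([np.ones(n), X])

        residuals = y - self.predict(X)
        dof = n - k - 1                        # residual degrees of freedom
        sigma2 = residuals @ residuals / dof   # unbiased residual variance

        # Var(beta_hat) = sigma^2 * (X'X)^-1
        cov = sigma2 * np.linalg.inv(X_design.T @ X_design)
        se = np.sqrt(np.diag(cov))

        params = np.concatenate([[self.intercept_], self.coef_])
        self.se_ = se
        self.t_ = params / se
        # Two-sided p-values from the t distribution
        self.p_ = 2 * (1 - stats.t.cdf(np.abs(self.t_), dof))
        return self
```

With an intercept included, these t-statistics and p-values should line up with the `t` and `P>|t|` columns of the statsmodels OLS summary on the same data.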

Jack Haas
  • Also: `from sklearn.feature_selection import f_regression`, then `f_statistic, p_value = f_regression(X_train, y_train)` – user2458922 May 17 '22 at 21:35
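The `f_regression` route from the comment can be sketched as follows; the toy data here is an assumption for illustration. Note that `f_regression` runs a univariate test per feature, so its p-values are not the same as the multiple-regression t-test p-values in the statsmodels summary:

```python
import numpy as np
from sklearn.feature_selection import f_regression

rng = np.random.default_rng(42)
X_train = rng.normal(size=(200, 3))
# Toy target that depends strongly on the first feature only
y_train = 5.0 * X_train[:, 0] + rng.normal(scale=0.5, size=200)

# One F-statistic and p-value per feature, each from a
# univariate regression of y on that single feature.
f_statistic, p_value = f_regression(X_train, y_train)
print(f_statistic)
print(p_value)
```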