I am running a multiple linear regression with the following code:

import pandas as pd
import statsmodels.formula.api as sm

# Response: column '10' of Output; predictors: four columns of Input
df = pd.DataFrame({"A":Output['10'],
                   "B":Input['Var1'],
                   "G":Input['Var2'],
                   "I":Input['Var3'],
                   "J":Input['Var4']})
res = sm.ols(formula="A ~ B + G + I + J", data=df).fit()
print(res.summary())

With the following result:

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      A   R-squared:                       0.562
Model:                            OLS   Adj. R-squared:                  0.562
Method:                 Least Squares   F-statistic:                     2235.
Date:                Tue, 06 Nov 2018   Prob (F-statistic):               0.00
Time:                        09:48:20   Log-Likelihood:                -21233.
No. Observations:                6961   AIC:                         4.248e+04
Df Residuals:                    6956   BIC:                         4.251e+04
Df Model:                           4                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     21.8504      0.448     48.760      0.000      20.972      22.729
B              1.8353      0.022     84.172      0.000       1.793       1.878
G              0.0032      0.004      0.742      0.458      -0.005       0.012
I             -0.0210      0.009     -2.224      0.026      -0.039      -0.002
J              0.6677      0.061     10.868      0.000       0.547       0.788
==============================================================================
Omnibus:                     2152.474   Durbin-Watson:                   0.308
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             5077.082
Skew:                          -1.773   Prob(JB):                         0.00
Kurtosis:                       5.221   Cond. No.                         555.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

However, my Output dataframe has 149 columns, labelled 1 to 149. Is there a way to loop over all 149 columns and, at the end, show the best and worst fits, ranked for example by R-squared? Or find which column gives the largest coefficient for variable B?

Bollehenk
  • Could you provide some input data, reproducible code and expected output? – Franco Piccolo Nov 06 '18 at 10:23
  • Similar [concept](https://stackoverflow.com/questions/52970916/python-ols-statsmodels-t-stats-of-variables-not-entered-into-the-model/52976888#52976888) that I think may be able to help. – jtweeder Nov 06 '18 at 15:53
