I am running an OLS regression on a column of values and printing the summary. The summary includes the Durbin-Watson and Jarque-Bera (JB) statistics, and I want to pull those values out directly, since they have already been calculated, rather than recomputing them in a separate step as I do now with durbinwatson.

Here is the code I have:

import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

csv = 'mydata.csv'
variable = 'LST'  # column being modelled (the "Dep. Variable" in the output)
df = pd.read_csv(csv)
var = df[variable]
year = df['Year']
model = sm.OLS(var, year)
results = model.fit()
summary = results.summary()
print(summary)
#print(dir(results))
residuals = results.resid
# Recompute Durbin-Watson from the residuals as an extra step
durbinwatson = durbin_watson(residuals, axis=0)
print(durbinwatson)

Results:

                           OLS Regression Results                            
==============================================================================
Dep. Variable:                    LST   R-squared:                       1.000
Model:                            OLS   Adj. R-squared:                  1.000
Method:                 Least Squares   F-statistic:                 3.026e+05
Date:                Fri, 10 Nov 2017   Prob (F-statistic):           2.07e-63
Time:                        20:37:03   Log-Likelihood:                -82.016
No. Observations:                  32   AIC:                             166.0
Df Residuals:                      31   BIC:                             167.5
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Year           0.1551      0.000    550.069      0.000       0.155       0.156
==============================================================================
Omnibus:                        1.268   Durbin-Watson:                   1.839
Prob(Omnibus):                  0.530   Jarque-Bera (JB):                1.087
Skew:                          -0.253   Prob(JB):                        0.581
Kurtosis:                       2.252   Cond. No.                         1.00
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

I figured out that by printing

dir(results)

I could get a list of the attributes of the results object, and I can pull out the residuals (or the R-squared and so on) with no problem, as I do above, but I can't pull out just the Durbin-Watson or just the Jarque-Bera statistic. I tried this:

print(results.wald_test)

But I just get the error:

<bound method OLSResults.wald_test of <statsmodels.regression.linear_model.OLSResults object at 0x0D05B3F0>>

And I can't find the Jarque-Bera test anywhere in the dir() listing of the results. Any help?

Matt
  • That's not an error. You need to _call_ the `wald_test` method. –  Nov 11 '17 at 18:40
  • So you have to run it as a separate step? You can't just retrieve the statistics from the summary? And you still have to pass that method the residuals derived from the summary, right? Like results.wald_test(residuals)? That doesn't seem any easier. – Matt Nov 11 '17 at 18:56
  • The diagnostic results in the bottom table of the summary are computed only for the summary; they are not stored or attached to the results. https://github.com/statsmodels/statsmodels/blob/master/statsmodels/regression/linear_model.py#L2335 – Josef Nov 11 '17 at 20:27
  • @user333700 ah, I see. That's disappointing but makes sense. I can run the tests separately just fine, just thought there was a faster way. Thanks. – Matt Nov 11 '17 at 21:19
  • Disappointing indeed. Any reason why these diagnostics should not be stored once calculated? – OldSchool Nov 18 '19 at 10:54
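
As the comments above note, the summary derives these diagnostics from the residuals on the fly. For illustration only, here is a minimal plain-Python sketch of the two standard formulas. The function names `durbin_watson_stat`, `jarque_bera_from_moments` and `jarque_bera_stat` are hypothetical and not part of statsmodels, and the p-value assumes the usual chi-squared reference distribution with 2 degrees of freedom.

```python
import math

def durbin_watson_stat(resid):
    """Sum of squared successive differences over the sum of squared residuals."""
    num = sum((b - a) ** 2 for a, b in zip(resid, resid[1:]))
    den = sum(e ** 2 for e in resid)
    return num / den

def jarque_bera_from_moments(n, skew, kurt):
    """JB statistic and chi-squared(2) p-value from sample skew and kurtosis."""
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # The chi-squared(2) survival function is exp(-x / 2).
    return jb, math.exp(-jb / 2.0)

def jarque_bera_stat(resid):
    """JB statistic and p-value computed from the residuals themselves."""
    n = len(resid)
    mean = sum(resid) / n
    m2 = sum((e - mean) ** 2 for e in resid) / n  # biased central moments
    m3 = sum((e - mean) ** 3 for e in resid) / n
    m4 = sum((e - mean) ** 4 for e in resid) / n
    return jarque_bera_from_moments(n, m3 / m2 ** 1.5, m4 / m2 ** 2)

# Check against the rounded Skew, Kurtosis and No. Observations in the summary.
jb, p = jarque_bera_from_moments(32, -0.253, 2.252)
print(round(jb, 3), round(p, 3))  # 1.087 0.581
```

Plugging the rounded skew, kurtosis and observation count from the question's summary into the formula reproduces the printed Jarque-Bera (JB) and Prob(JB) values, so recomputing them from `results.resid` costs very little.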

1 Answer

import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson  # add this import

csv = 'mydata.csv'
variable = 'LST'  # column to model; 'LST' is the Dep. Variable in the question's output
df = pd.read_csv(csv)
var = df[variable]
year = df['Year']
model = sm.OLS(var, year)
results = model.fit()
summary = results.summary()
# The summary does not store its diagnostics, so recompute
# the Durbin-Watson statistic from the fitted model's residuals.
dw = float(durbin_watson(results.resid))
print(dw)
Emma E