I am running a basic regression:

import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame(playerdata, columns = cols_of_interest)
target = pd.DataFrame(playerdata, columns = ['mins'])


X = df[cols_of_interest]
y = target["mins"]

# Fit and make the predictions by the model
model = sm.OLS(y, X).fit()
predictions = model.predict(X)

model.params seems to have the info that I want. I essentially want to output the regression equation (coefficients, r^2, and t/p values) to a csv, but I can't figure out how. I tried .to_csv and what I found here (link).

elcunyado
Comment: What does your `model.params` look like? In a toy example, I get a list containing 3 numbers. Or do you want `model.summary()`? – Grayrigel Sep 09 '20 at 22:21

1 Answer

Some useful previous answers are here: helpful Stack Overflow examples on this.

Here is an example I pieced together to show what works for me. (I used an example from statsmodels then converted summary to a df, then output the frame to csv)

import numpy as np
import pandas as pd 
import statsmodels.api as sm

nsample = 100
x = np.linspace(0, 10, nsample)
X = np.column_stack((x, x**2))      # regressors: x and x squared
beta = np.array([1, 0.1, 10])       # true coefficients: const, x1, x2
e = np.random.normal(size=nsample)  # Gaussian noise

X = sm.add_constant(X)              # prepend the intercept column
y = np.dot(X, beta) + e

model = sm.OLS(y, X)
results = model.fit()
print(results.summary())

The output that we want to convert to a dataframe (for reference) is the results.summary():

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       1.000
Model:                            OLS   Adj. R-squared:                  1.000
Method:                 Least Squares   F-statistic:                 4.101e+06
Date:                Wed, 09 Sep 2020   Prob (F-statistic):          1.08e-239
Time:                        18:14:11   Log-Likelihood:                -145.54
No. Observations:                 100   AIC:                             297.1
Df Residuals:                      97   BIC:                             304.9
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.6968      0.310      2.250      0.027       0.082       1.312
x1             0.3067      0.143      2.143      0.035       0.023       0.591
x2             9.9795      0.014    720.523      0.000       9.952      10.007
==============================================================================
Omnibus:                        1.587   Durbin-Watson:                   1.878
Prob(Omnibus):                  0.452   Jarque-Bera (JB):                1.271
Skew:                           0.055   Prob(JB):                        0.530
Kurtosis:                       2.459   Cond. No.                         144.
==============================================================================

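Here is how to convert the coefficient table to a dataframe, and then to a csv:

# summary() returns SimpleTable objects; tables[1] is the coefficient table
df1 = pd.DataFrame(results.summary().tables[1])
# summary2() returns DataFrames directly, with unrounded values
df2 = pd.DataFrame(results.summary2().tables[1])

df1.to_csv('summary.csv')
df2.to_csv('summary2.csv')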

df1

       0           1          2          3       4          5          6
0               coef    std err          t   P>|t|     [0.025     0.975]
1  const      0.6968      0.310      2.250   0.027      0.082      1.312
2     x1      0.3067      0.143      2.143   0.035      0.023      0.591
3     x2      9.9795      0.014    720.523   0.000      9.952     10.007

and df2

          Coef.  Std.Err.           t          P>|t|    [0.025     0.975]
const  0.696846  0.309698    2.250083   2.670308e-02  0.082181   1.311510
x1     0.306671  0.143135    2.142533   3.465348e-02  0.022588   0.590754
x2     9.979496  0.013850  720.523099  1.175226e-182  9.952007  10.006985

Both summary.csv and summary2.csv are written to the current working directory.

NOTE: If you want to add values or build custom frames, call dir(results) to see which attributes are available (for example params, bse, tvalues, pvalues, rsquared).

Output of dir(results)

['HC0_se', 'HC1_se', 'HC2_se', 'HC3_se', '_HCCM', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_cache', '_data_attr', '_get_robustcov_results', '_is_nested', '_use_t', '_wexog_singular_values', 'aic', 'bic', 'bse', 'centered_tss', 'compare_f_test', 'compare_lm_test', 'compare_lr_test', 'condition_number', 'conf_int', 'conf_int_el', 'cov_HC0', 'cov_HC1', 'cov_HC2', 'cov_HC3', 'cov_kwds', 'cov_params', 'cov_type', 'df_model', 'df_resid', 'diagn', 'eigenvals', 'el_test', 'ess', 'f_pvalue', 'f_test', 'fittedvalues', 'fvalue', 'get_influence', 'get_prediction', 'get_robustcov_results', 'initialize', 'k_constant', 'llf', 'load', 'model', 'mse_model', 'mse_resid', 'mse_total', 'nobs', 'normalized_cov_params', 'outlier_test', 'params', 'predict', 'pvalues', 'remove_data', 'resid', 'resid_pearson', 'rsquared', 'rsquared_adj', 'save', 'scale', 'ssr', 'summary', 'summary2', 't_test', 't_test_pairwise', 'tvalues', 'uncentered_tss', 'use_t', 'wald_test', 'wald_test_terms', 'wresid']
KellyJayTX