I've been running various regressions using Statsmodels. I took data, shaped it into Pandas dataframes, and then ran several models on the data. I'm now struggling to output all of those regressions as a CSV file. My goal is to have all of my "regression data" (i.e., the coefficients, intercepts, standard errors, etc. for each control/variable, as well as the number of observations and a few other data points) on one axis, with the title of each regression forming the other axis.
So far I've tried multiple approaches, with one looking the most promising. That method uses
results = FoodPriceReg(PriceChange, RightHandVars)
regexport = RegToCSV(results)
return regexport
to turn the printed summary into a CSV file. I then use
for com in commodity:
    RegOut = RegLoop(com)
    regressions = pd.DataFrame(RegOut)
    name = 'regressions/' + com[2]
    SaveFrame(regressions, name)
to output the regressions as a CSV + .dta file for each food category.
I've also tried both sorting those CSV files into nested lists and converting them to dataframes, then working with the result. The biggest issue I've had is that the CSV output is very rough and hard to work with. It isn't organized like other Pandas dataframes, and I haven't found a reasonably simple way to sort all of the data in the CSV so that, if you open it in Excel, each piece of information ends up in its own cell.
To clarify, right now each cell of my final CSV output looks like
Dep. Variable: ,ParboiledCoarseRice2014, R-squared: , 0.010
Model: ,OLS , Adj. R-squared: , -0.000
Method: ,Least Squares , F-statistic: , 0.9711
, coef , std err , t ,P>|t| , [0.025 , 0.975]
Intercept , 28.5204, 0.216, 131.855, 0.000, 28.095, 28.945
Cash , 4.5696, 0.501, 9.112, 0.000, 3.584, 5.555
Food , 4.1321, 0.501, 8.240, 0.000, 3.147, 5.117
FoodCash , 4.2496, 0.501, 8.474, 0.000, 3.264, 5.235
CashTraining, 5.2596, 0.675, 7.787, 0.000, 3.933, 6.587
FoodTraining, 5.8696, 0.675, 8.691, 0.000, 4.543, 7.197
Control , 4.4396, 0.501, 8.853, 0.000, 3.454, 5.425
whereas I want each piece of information to be its own row, like:
Model: ParboiledCoarseRice2014 ~ Treatment Dummies
R-squared: 0.010
Cash Coef: 4.5696
Cash Std Err: 0.501
I suspect I'm missing something fundamental about working with Statsmodels: the output of regressions is sparsely documented, yet it seems essential for getting much use out of the package.
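To make the target layout concrete, here is a minimal sketch of the shape I'm after. It assumes the fitted results object exposes `params`, `bse`, and `rsquared` as label-indexed values, which I believe is the case for Statsmodels OLS results fitted from a formula or a dataframe. The helpers `tidy_results` and `write_table` are hypothetical names, and the stand-in object below just reuses a few numbers from the summary above so the example runs without my data.

```python
import csv
import io
from types import SimpleNamespace


def tidy_results(results):
    """Flatten one fitted model into {row label: value} pairs."""
    # Assumption: results.rsquared is a float; results.nobs could be added the same way.
    rows = {"R-squared": results.rsquared}
    # Assumption: params and bse are label-indexed (e.g. pandas Series),
    # so .items() yields (variable name, value) and bse[name] looks up by label.
    for name, coef in results.params.items():
        rows[f"{name} Coef"] = coef
        rows[f"{name} Std Err"] = results.bse[name]
    return rows


def write_table(tidy_by_title, handle):
    """Write statistics as rows and regression titles as columns."""
    titles = list(tidy_by_title)
    # Collect every row label once, keeping first-seen order across regressions.
    labels = list(
        dict.fromkeys(label for rows in tidy_by_title.values() for label in rows)
    )
    writer = csv.writer(handle)
    writer.writerow(["Statistic"] + titles)
    for label in labels:
        # Leave the cell empty when a regression lacks that statistic.
        writer.writerow([label] + [tidy_by_title[t].get(label, "") for t in titles])


# Hypothetical stand-in for a fitted result, using values from the summary above.
stand_in = SimpleNamespace(
    rsquared=0.010,
    params={"Intercept": 28.5204, "Cash": 4.5696},
    bse={"Intercept": 0.216, "Cash": 0.501},
)

buffer = io.StringIO()
write_table({"ParboiledCoarseRice2014": tidy_results(stand_in)}, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

With several regressions, each one would become another column keyed by its title, so the file opens in Excel with one value per cell. Writing to a real file instead of `io.StringIO` should only require passing an open file handle (opened with `newline=""`) to `write_table`.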