I've been using Python for regression analysis. After getting the regression results, I need to summarize them all in a single table and convert it to LaTeX (for publication). Is there a Python package that does this? Something like `estout` in Stata, which gives the following table:
- Any modern update to this question? There's summary2, which is still quite lacking. – Matthew Gunn May 13 '17 at 23:51
- @MatthewGunn This function is far from what you can do with Stata's estout package (or R's package). So what I ended up doing for my workflow is to call a user-defined function in Python that calls Stata to run a do-file in Terminal that runs all the regressions and outputs the table. – Titanic May 15 '17 at 15:07
3 Answers
Well, there is `summary_col` in `statsmodels`; it doesn't have all the bells and whistles of `estout`, but it does have the basic functionality you are looking for (including export to LaTeX):
import statsmodels.api as sm
from statsmodels.iolib.summary2 import summary_col

# p is assumed to be a pandas DataFrame with the returns (p0, p2, p4) and the factors
p['const'] = 1
reg0 = sm.OLS(p['p0'], p[['const', 'exmkt', 'smb', 'hml']]).fit()
reg1 = sm.OLS(p['p2'], p[['const', 'exmkt', 'smb', 'hml']]).fit()
reg2 = sm.OLS(p['p4'], p[['const', 'exmkt', 'smb', 'hml']]).fit()
print(summary_col([reg0, reg1, reg2], stars=True, float_format='%0.2f'))
===============================
        p0       p2      p4
-------------------------------
const -1.03*** -0.01   0.62***
      (0.11)   (0.04)  (0.07)
exmkt 1.28***  0.97*** 0.98***
      (0.02)   (0.01)  (0.01)
smb   0.37***  0.28*** -0.14***
      (0.03)   (0.01)  (0.02)
hml   0.77***  0.46*** 0.69***
      (0.04)   (0.01)  (0.02)
===============================
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01
Or here is a version where I add R-Squared and the number of observations:
print(summary_col([reg0, reg1, reg2], stars=True, float_format='%0.2f',
                  info_dict={'N': lambda x: "{0:d}".format(int(x.nobs)),
                             'R2': lambda x: "{:.2f}".format(x.rsquared)}))
===============================
        p0       p2      p4
-------------------------------
const -1.03*** -0.01   0.62***
      (0.11)   (0.04)  (0.07)
exmkt 1.28***  0.97*** 0.98***
      (0.02)   (0.01)  (0.01)
smb   0.37***  0.28*** -0.14***
      (0.03)   (0.01)  (0.02)
hml   0.77***  0.46*** 0.69***
      (0.04)   (0.01)  (0.02)
R2    0.86     0.95    0.88
N     1044     1044    1044
===============================
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01
Another example, this time showing the use of the `model_names` option and regressions where the independent variables vary:
reg3 = sm.OLS(p['p4'], p[['const', 'exmkt']]).fit()
reg4 = sm.OLS(p['p4'], p[['const', 'exmkt', 'smb', 'hml']]).fit()
reg5 = sm.OLS(p['p4'], p[['const', 'exmkt', 'smb', 'hml', 'umd']]).fit()
print(summary_col([reg3, reg4, reg5], stars=True, float_format='%0.2f',
                  model_names=['p4\n(0)', 'p4\n(1)', 'p4\n(2)'],
                  info_dict={'N': lambda x: "{0:d}".format(int(x.nobs)),
                             'R2': lambda x: "{:.2f}".format(x.rsquared)}))
==============================
        p4      p4       p4
       (0)     (1)      (2)
------------------------------
const 0.66*** 0.62***  0.15***
      (0.10)  (0.07)   (0.04)
exmkt 1.10*** 0.98***  1.08***
      (0.02)  (0.01)   (0.01)
hml           0.69***  0.72***
              (0.02)   (0.01)
smb           -0.14*** 0.07***
              (0.02)   (0.01)
umd                    0.46***
                       (0.01)
R2    0.78    0.88     0.96
N     1044    1044     1044
==============================
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01
To export to LaTeX, use the `as_latex` method.
I could be wrong but I don't think an option for t-stats instead of standard errors (like in your example) is implemented.

- Could you please let me know how to find this command with a Google search? I spent half an hour looking for a command but could not find it. Thanks! – Titanic May 10 '14 at 02:47
- I don't know ... I forget how I ran across it. Here is a link to the source file on GitHub; it has a docstring: [summary2.py](https://github.com/statsmodels/statsmodels/blob/master/statsmodels/iolib/summary2.py) – Karl D. May 10 '14 at 02:57
- Great. It would be helpful for others if you included regressions with different explanatory variables. – Titanic May 10 '14 at 17:23
- @KarlD. Great answer. It would be nice if pandas or statsmodels had a texreg- or outreg2-like package for this. Academics use these a lot. – user3576212 May 31 '14 at 19:22
- Is there a way to restrict which of the regressors will be included? I want to use the same approach as in the answer, but I have many regressors whose coefficients I do not want to present. I didn't find a way to do it in the documentation. – splinter Mar 11 '17 at 22:43
- @splinter Not as near as I can tell. You could modify the function to do that pretty easily, I think, or post-process the table it creates, but it doesn't do it by default. – Karl D. Mar 12 '17 at 01:28
- Yep, post-processing is easy. It's disappointing, though, that Python is so behind on this kind of stuff. – splinter Mar 12 '17 at 11:29
One alternative is Stargazer. To get started quickly, refer to the set of demo tables that Stargazer can produce.

In addition to @Karl D.'s great answer using the statsmodels `as_latex` method, you can also check out the `pystout` package.
# Install first with: pip install pystout
import pandas as pd
from sklearn.datasets import load_iris
import statsmodels.api as sm
from pystout import pystout

# Load the iris data and shorten the column names
data = load_iris()
df = pd.DataFrame(data=data.data, columns=data.feature_names)
df.columns = ['s_len', 's_w', 'p_len', 'p_w']

# Three nested models for petal width (no constant is added here)
y = df['p_w']
X = df[['s_len', 's_w', 'p_len']]
m1 = sm.OLS(y, X).fit()
X = df[['s_len', 's_w']]
m2 = sm.OLS(y, X).fit()
X = df[['s_len']]
m3 = sm.OLS(y, X).fit()

# varlabels keys must match the regressor names used in the models
pystout(models=[m1, m2, m3],
        file='test_table.tex',
        addnotes=['Note above', 'Note below'],
        digits=2,
        endog_names=['petal width', 'petal width', 'petal width'],
        varlabels={'s_len': 'Sepal length',
                   's_w': 'Sepal width', 'p_len': 'Petal length'},
        mgroups={'First Group': [1, 2], 'Second Group': 3},
        modstat={'nobs': 'Obs', 'rsquared_adj': r'Adj. R\sym{2}', 'fvalue': 'F-stat'}
        )
Don't spend hours, as I did, trying to print the output of `pystout`: the LaTeX output is written directly to the `.tex` file you pass as `file`.
