I have been using Stata and EViews for some time but have very limited knowledge of R, which is why I need some help. I am estimating the Fama-French model for stocks from 2010 to 2016, so I need to run a separate regression for each stock. Each regression has excess return (ereturn) as the dependent variable and MKTRF, HML, and SMB as the independent variables. The stock code is stored in a variable called permno. Most of the questions I have seen here are about looping a regression over different variables, but in my case the variables stay the same and the observations change. I need to save the coefficients and the R2 value for each regression. I hope somebody can help.
- I'm not sure I understand the problem you're having. What is your question? – WillardSolutions Dec 01 '16 at 21:01
- I will try to clarify. I know how to run one regression for the whole sample, but I need separate regressions, one for each stock. The regression equation is the same in every case: eReturn = c + b1*MKTRF + b2*HML + b3*SMB. How can I create a loop, run the regression inside it, and store the results? Thanks – Ossama Elhadary Dec 01 '16 at 22:20
1 Answer
Based on your variables, I presume you want to run a separate regression for each group of observations defined by "permno". Here is what you could do.
# Create a list of data frames, one per stock (permno)
dfList = split(df, df$permno)
The split() function splits your "df" into one data frame per value of "permno". The grouping variable is coerced to a factor, so permno does not need to be one already. Now use lapply to run the same model on each subset.
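# Fit the same three-factor model on each stock's subset.
# R is case-sensitive: use the column's exact name (eReturn vs. ereturn).
regSummaryList = lapply(dfList, function(x) {
  lm(eReturn ~ MKTRF + HML + SMB, data = x)
})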
This returns a list of fitted lm objects, one per permno. You can then extract the coefficients and R-squared with the following:
coefList = lapply(regSummaryList, coef)
R2 = sapply(regSummaryList, function(x) summary(x)$r.squared)
coefList would be a list of coefficient vectors (one per permno), whereas R2 would be a named vector of R-squared values.
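To store everything in one place, the steps above can be combined into a single table with one row per stock. The following is only a sketch under stated assumptions: the data are simulated stand-ins (three made-up permno values, 60 observations each), the column is assumed to be named eReturn, and the object names fits and results are illustrative.

```r
# Simulated stand-in for the real data: 3 hypothetical stocks, 60 rows each.
set.seed(1)
n_obs <- 60
permnos <- c(10001, 10002, 10003)  # made-up stock codes
df <- data.frame(
  permno = rep(permnos, each = n_obs),
  MKTRF  = rnorm(n_obs * length(permnos)),
  HML    = rnorm(n_obs * length(permnos)),
  SMB    = rnorm(n_obs * length(permnos))
)
df$eReturn <- 0.1 + 1.2 * df$MKTRF + 0.3 * df$HML - 0.2 * df$SMB +
  rnorm(nrow(df), sd = 0.5)

# One lm fit per stock, exactly as in the split()/lapply() steps.
fits <- lapply(split(df, df$permno), function(x) {
  lm(eReturn ~ MKTRF + HML + SMB, data = x)
})

# Stack the coefficient vectors row-wise and attach R-squared,
# giving one row per permno.
results <- data.frame(
  permno = names(fits),
  do.call(rbind, lapply(fits, coef)),
  r.squared = sapply(fits, function(m) summary(m)$r.squared),
  row.names = NULL,
  check.names = FALSE
)

print(results)
```

The resulting data frame can then be saved with write.csv(results, "ff3_results.csv", row.names = FALSE), where the file name is just a placeholder.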

acylam