I have a daily return data of a number of stocks that looks like:
Stock A Stock B Stock C Market
date
1987-02-02 0.01 0.02 0.02 0.01
1987-02-03 0.02 0.03 0.02 0.02
1987-02-04 0.03 0.01 0.01 0.03
1987-02-05 0.04 0.03 0.05 0.04
I want to calculate 30 day regression of Stock A, B, C on market, but only at the end of each month, i.e. at 1987-02-28, 1987-03-31... And then save the regression results in two matrices (one for the constant term, and one for the coefficient):
Stock A Stock B Stock C
date
1987-02-28 const const const
1987-03-31 const const const
1987-04-30 const const const
Stock A Stock B Stock C
date
1987-02-28 coeff coeff coeff
1987-03-31 coeff coeff coeff
1987-04-30 coeff coeff coeff
So far what I did was to create an indicator for "end of month" and then loop over all rows and columns of the date:
loop over columns:
loop over rows:
if end of month is true:
regress Stock on Market using past 30 days data
save the result
Given I have a lot of stocks (7000+) over 50 years, this way is very slow. I am wondering if anyone has worked on similar problems before and has faster way of implementing this? Any tip on how to improve the speed or efficiency would be greatly appreciated.