0

I am calculating market beta using daily data with pandas.DataFrame. That is, I want to calculate variances of market return and covariances between market return and individual stock return using 252 days window with 200 minimum observation conditions with groupby. Beta is Var(market_return)/Cov(market_return, stock_return). First, I used unconditional groupby to obtain the beta value, which means that I calculate the variances and covariances for every day of my data sample. However, then, I realize that calculating all betas consumes too much time and is wasteful. This is because only end-of-the-month data will be used. For example, even if betas are calculated on 1st Jan, 2nd Jan, ..., and 31st Jan, only the beta of 31st Jan will be used. Therefore, I want to know if there is any way to run my groupby code conditionally.

For example, my output is as follows using 252 days window with 200 minimum observation groupby.

stock_key date var(market_return) covar(market_return, stock_return)
A 2012-01-26 9.4212 -4.23452
A 2012-01-27 9.3982 -4.18421
A 2012-01-28 9.1632 -4.33552
A 2012-01-29 9.0456 -4.55831
A 2012-01-30 9.2231 -4.92373
A 2012-01-31 9.0687 -4.04133
...
A 2012-02-27 8.9345 -4.72344
A 2012-02-28 9.0010 -4.82349
...
B 2012-01-26 4.8456 -1.42325
B 2012-01-27 4.8004 -1.18421
B 2012-01-28 4.0983 -1.02842
B 2012-01-29 4.9465 -1.13834
B 2012-01-30 4.7354 -1.63450
B 2012-01-31 4.1945 -1.18234

I want to know is there any way to get result as follows.

stock_key date var(market_return) covar(market_return, stock_return)
A 2012-01-31 9.0687 -4.04133
A 2012-02-28 9.0010 -4.82349
B 2012-01-31 4.1945 -1.18234

Thank you for reading my question.

+ I add my code as follows. Here, PERMNO is the id of stocks.

dtmpPair = dlongPair[['PERMNO','dayMktmRF','eadjret']]
dgrpPair = dtmpPair.groupby(['PERMNO']).rolling(252, min_periods = 200)
dgrpCov = dgrpPair.cov().unstack()

ddenom = dgrpCov['dayMktmRF']['dayMktmRF'].reset_index()
ddenom = ddenom[['PERMNO','dayMktmRF']]
ddenom['date'] = dlongPair['date']
ddenom.columns = ['PERMNO','MktVar','date']

dnumer = dgrpCov['dayMktmRF']['eadjret'].reset_index()
dnumer = dnumer[['PERMNO','eadjret']]
dnumer['date'] = dlongPair['date']
dnumer.columns = ['PERMNO','Cov','date']

ddfBeta = dnumer.merge(ddenom, on = ['PERMNO','date'])
ddfBeta['beta_daily'] = ddfBeta['Cov'] / ddfBeta['MktVar']
ddfBeta = ddfBeta[ddfBeta['beta_daily'].notnull()]
ddfBeta['month'] = ddfBeta['date'].dt.month
ddfBeta['year'] = ddfBeta['date'].dt.year
beta_daily = ddfBeta[['date','PERMNO','year','month','beta_daily']]

Here, dlongPair dataframe consists of data as follows. enter image description here

ts10
  • 77
  • 6
  • Does this answer your question? [Python Pandas Conditional Sum with Groupby](https://stackoverflow.com/questions/17266129/python-pandas-conditional-sum-with-groupby) – ffrosch Dec 07 '22 at 08:13
  • This looks more like a filter than a groupby to me. Am I correct in thinking you also would get the result you want by checking each date for being the last day of any month? – Lewistrick Dec 07 '22 at 08:27
  • I think my question is different from the question cited by @ffrosch. This is because if I follow the suggestion of the question or do filtering, variance and covariance are calculated over monthly observation. That is, beta is not obtained from 252 days window but only 12 days of end-of-the-month observations. – ts10 Dec 07 '22 at 09:57
  • For the same reason I mentioned above, I think it is different from just filtering. I do not want to filter my results after calculating on all day's observations. @Lewistrick. – ts10 Dec 07 '22 at 09:58

1 Answers1

0

Without using groupby we can check if the date in the row is the last day of the month.

df['date']=pd.to_datetime(df['date']) #string to datetime

#Is the date in the row the last day of that month?
dfx=df[df['date']  - pd.offsets.Day() + pd.offsets.MonthEnd(1)==df['date']]

Output:

   stock_key       date  var(market_return)  covar(market_return, stock_return)
5          A 2012-01-31              9.0687                            -4.04133
15         B 2012-01-31              4.1945                            -1.18234

Note: 2012-02's last day is 29.

Bushmaster
  • 4,196
  • 3
  • 8
  • 28
  • Thank you for your answer. However, I think this code seems to filter results 'after' calculating all results. I want to calculate all results because it consumes too much time to compute data I will not use in the future. I want to know if there is any way to put restrictions on groupby operation 'before' calculating results. – ts10 Dec 07 '22 at 10:02
  • I don't know the format of the data you are using to calculate. But I think you can apply the code that checks the last day of the month to the raw data. – Bushmaster Dec 07 '22 at 10:10
  • Sorry there was an typo. 'I do not want to calculate all results because it consumes too much time' is right. Yes the code allows me to identify end of the month obsevation. Thank you. My data is panel data. For each day, for each stock, daily return of the stock and daily return of market are recorded. I'm calculating covariance of each stock using historical 252 days return data of market and stock. Using just groupby allows me to calculate beta for every day for every stock. However, only every end-of-the-month covariance is necessary for me. – ts10 Dec 07 '22 at 10:21
  • well, can you add the code you made the calculation(covariance and groupby) to the question? – Bushmaster Dec 07 '22 at 10:32
  • Sorry for the late comments, @Bushmaster I added my code in my question. Thank you. I want to know is there any way to skip groupby operation for 1st Jan, 2nd Jan, .. 31 Jan, 1 Feb, 2nd Feb, ... 1st Mar, 2nd Mar, 3rd Mar, ..., 1st Dec, 2nd Dec, ... – ts10 Dec 08 '22 at 02:33
  • Before the block of the code implemented in the question, I sorted dlongPair by [PERMNO, date] – ts10 Dec 08 '22 at 02:36