1

I have following df:

          Close_x   Close_y
key_0                         
2017-10-23       NaN       NaN
2017-10-24 -0.147631  0.161791
2017-10-25  0.044194 -0.466305
2017-10-26 -0.069876  0.127095
2017-10-27  0.142261  0.807302
2017-10-30 -0.178176 -0.319247
2017-10-31  0.108544  0.094446
2017-11-01 -0.136536  0.159211
2017-11-02 -0.204280  0.018997
2017-11-03  0.324777  0.309708
2017-11-06  0.049001  0.127125
2017-11-07  0.108391 -0.018910
2017-11-08  0.167624  0.144365
2017-11-09 -0.183357 -0.376189
2017-11-10  0.126741 -0.089764
2017-11-13  0.523946  0.098363
2017-11-14 -0.481367 -0.230961
2017-11-15 -0.148953 -0.552568
2017-11-16 -0.320806  0.819606
2017-11-17  0.172988 -0.262596
2017-11-20 -0.242568  0.127568
2017-11-21 -0.220614  0.654114
2017-11-22 -0.287271 -0.075026
2017-11-24  0.483940  0.205610
2017-11-27  0.177181 -0.038426
2017-11-28 -0.005628  0.984851
2017-11-29  0.537868 -0.036923
2017-11-30  0.112756  0.819095
2017-12-01  0.018372 -0.202453
2017-12-04 -0.362582 -0.105216
...              ...       ...
2018-09-07 -0.176824 -0.221334
2018-09-10 -0.106907  0.189783
2018-09-11 -0.553854  0.373984
2018-09-12 -0.410831  0.035667
2018-09-13 -0.566335  0.528225
2018-09-14 -0.157859  0.027548
2018-09-17 -0.111232 -0.556972
2018-09-18  0.163057  0.536901
2018-09-19 -0.170732  0.125327
2018-09-20  0.241025  0.784059
2018-09-21  0.065865 -0.036853
2018-09-24 -0.321969 -0.351569
2018-09-25  0.489287 -0.130510
2018-09-26  0.197137 -0.328928
2018-09-27  0.535727  0.276329
2018-09-28 -0.151688 -0.000687
2018-10-01 -0.350278  0.364111
2018-10-02 -0.148503 -0.039669
2018-10-03 -0.252355  0.071152
2018-10-04 -0.104687 -0.816948
2018-10-05 -0.022230 -0.552798
2018-10-08 -0.123084 -0.039512
2018-10-09 -0.425363 -0.141790
2018-10-10  0.408815 -3.286423
2018-10-11 -0.629016 -2.057301
2018-10-12 -0.717824  1.420620
2018-10-15 -0.461052 -0.590498
2018-10-16 -0.385450  2.149560
2018-10-17 -1.040515 -0.025266
2018-10-18 -0.496977 -0.524702 

I'm trying to run a rolling(21).cov(close_x,close_y) but keep getting errors. I also want to run a rolling(21).var(close_x) which I have succeeded in getting. The ultimate goal is to create a df (Beta) that equals cov(close_x,close_y)/var(close_x) over a rolling 21-period window. Little help?

Mabel Villalba
  • 2,538
  • 8
  • 19
Surfcat
  • 23
  • 1
  • 7
  • 1
    Does this answer your question? [Python pandas calculate rolling stock beta using rolling apply to groupby object in vectorized fashion](https://stackoverflow.com/questions/34802972/python-pandas-calculate-rolling-stock-beta-using-rolling-apply-to-groupby-object) – feetwet Oct 28 '22 at 23:09

1 Answers1

3

To obtain the covariance between the columns you can do:

 df.rolling(21).cov()

                     Close_x   Close_y
key_0                                 
2017-10-23 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-10-24 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-10-25 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-10-26 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-10-27 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-10-30 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-10-31 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-01 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-02 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-03 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-06 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-07 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-08 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-09 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-10 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-13 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-14 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-15 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-16 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-17 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-20 Close_x       NaN       NaN
           Close_y       NaN       NaN
2017-11-21 Close_x  0.055186  0.004344
           Close_y  0.004344  0.139974
2017-11-22 Close_x  0.057800  0.006661
           Close_y  0.006661  0.140316
2017-11-24 Close_x  0.070428  0.011944
           Close_y  0.011944  0.126975

To extract the rolling covariance betweem Close_x and Close_y just do:

df.rolling(21).cov().unstack()['Close_x']['Close_y']

key_0
2017-10-23         NaN
2017-10-24         NaN
2017-10-25         NaN
2017-10-26         NaN
2017-10-27         NaN
2017-10-30         NaN
2017-10-31         NaN
2017-11-01         NaN
2017-11-02         NaN
2017-11-03         NaN
2017-11-06         NaN
2017-11-07         NaN
2017-11-08         NaN
2017-11-09         NaN
2017-11-10         NaN
2017-11-13         NaN
2017-11-14         NaN
2017-11-15         NaN
2017-11-16         NaN
2017-11-17         NaN
2017-11-20         NaN
2017-11-21    0.004344
2017-11-22    0.006661
2017-11-24    0.011944
2017-11-27    0.011000
2017-11-28    0.005615
2017-11-29   -0.001627
2017-11-30    0.001503
2017-12-01    0.001986
2017-12-04    0.005164
Name: Close_y, dtype: float64

The variance of Close_x can be obtained by:

df['Close_x'].to_frame().rolling(21).var()

             Close_x
key_0               
2017-10-23       NaN
2017-10-24       NaN
2017-10-25       NaN
2017-10-26       NaN
2017-10-27       NaN
2017-10-30       NaN
2017-10-31       NaN
2017-11-01       NaN
2017-11-02       NaN
2017-11-03       NaN
2017-11-06       NaN
2017-11-07       NaN
2017-11-08       NaN
2017-11-09       NaN
2017-11-10       NaN
2017-11-13       NaN
2017-11-14       NaN
2017-11-15       NaN
2017-11-16       NaN
2017-11-17       NaN
2017-11-20       NaN
2017-11-21  0.055186
2017-11-22  0.057800
2017-11-24  0.070428
2017-11-27  0.071921
2017-11-28  0.070846
2017-11-29  0.083070
2017-11-30  0.083106
2017-12-01  0.081725
2017-12-04  0.086686

Then the result you want can be achieved by:

df_cov = df.rolling(21).cov().unstack()['Close_x']['Close_y']
df_var = df['Close_x'].to_frame().rolling(21).var()

(df_cov/(df_var.T)).T

             Close_x
key_0               
2017-10-23       NaN
2017-10-24       NaN
2017-10-25       NaN
2017-10-26       NaN
2017-10-27       NaN
2017-10-30       NaN
2017-10-31       NaN
2017-11-01       NaN
2017-11-02       NaN
2017-11-03       NaN
2017-11-06       NaN
2017-11-07       NaN
2017-11-08       NaN
2017-11-09       NaN
2017-11-10       NaN
2017-11-13       NaN
2017-11-14       NaN
2017-11-15       NaN
2017-11-16       NaN
2017-11-17       NaN
2017-11-20       NaN
2017-11-21  0.078712
2017-11-22  0.115246
2017-11-24  0.169587
2017-11-27  0.152945
2017-11-28  0.079259
2017-11-29 -0.019581
2017-11-30  0.018079
2017-12-01  0.024298
2017-12-04  0.059574
Mabel Villalba
  • 2,538
  • 8
  • 19
  • @MabelVillalba question for you as this is a great solution. I'm a bit confused on the final formula you have `(df_cov/(df_var.T)).T` and specifically the `df_var.T` - why is it that we need to transpose this first? – antonio_zeus Feb 01 '19 at 19:46
  • @MabelVillalba I believe this could also work too `df_cov / df_var.iloc[:,0])` to isolate a column in the dataframe. – antonio_zeus Feb 01 '19 at 19:53
  • It can be done that way also. It was just a shortcut to avoid those conversions. `df_var` is a dataframe (shape `(N, )`) and `df_cov` a pandas series (shape `(N,1)`). If you just do `df_cov/(df_var))`, you'll have a result of shape `(N,N)`, as a multiplication would do. That's why I transposed it. That way each element is multiplied one. But we could have done it using `df_cov / df_var.iloc[:,0])` also, it does not matter. That way would even be safer since we would have two series, with the same name and aligned index. – Mabel Villalba Feb 17 '19 at 11:35
  • Have a look at [`pd.rolling_cov()`](https://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.rolling_cov.html). It may turn out to be a much simpler solution. – Dr_Zaszuś Nov 22 '22 at 15:40
  • This syntaxis has been deprecated though. The new one seems to be `df['Close_X'].rolling().cov(df['Close_y']`, see [here](https://www.statology.org/rolling-correlation-pandas/) – Dr_Zaszuś Nov 22 '22 at 15:44