I have some financial time series data that I would like to calculate the rolling cumulative product with a variable window size.
What I am trying to accomplish is using the following formula but instead of having window fixed at 12, I would like to use the value stored in the last column of the dataframe labeled 'labels_y' which will change over time.
df= (1 + df).rolling(window=12).apply(np.prod, raw=True) - 1
A sample of the data:
Out[102]:
div_yield earn_variab growth ... value volatility labels_y
date ...
2004-02-23 -0.001847 0.003252 -0.001264 ... 0.004368 -0.004490 2.0
2004-02-24 -0.001668 0.007404 0.002108 ... -0.006122 0.008183 2.0
2004-02-25 -0.003272 0.004596 0.001283 ... -0.002057 0.005912 3.0
2004-02-26 0.001818 -0.003397 -0.003190 ... 0.001327 -0.003908 3.0
2004-02-27 -0.002838 0.009879 0.000808 ... 0.000350 0.010557 3.0
[5 rows x 11 columns]
and the final result should look like:
Out[104]:
div_yield earn_variab growth ... value volatility labels_y
date ...
2004-02-23 NaN NaN NaN ... NaN NaN NaN
2004-02-24 -0.003512 0.010680 0.000841 ... -0.001781 0.003656 8.0
2004-02-25 -0.006773 0.015325 0.002125 ... -0.003834 0.009589 35.0
2004-02-26 -0.003126 0.008596 0.000193 ... -0.006851 0.010180 47.0
2004-02-27 -0.004294 0.011075 -0.001104 ... -0.000383 0.012559 63.0
[5 rows x 11 columns]
Rows 1 and 2 are calculated with a 2 day rolling window and rows 3, 4 and 5 use a 3 day window
I have tried using
def get_window(row):
return (1 + row).rolling(window=int(row['labels_y'])).apply(np.prod, raw=True) - 1
df = df.apply(get_window, axis=1)
I realize that calculates the cumupative product in the wrong direction. I am struggling on how to get this to work.
Any help would be hugely appreciated.
Thanks