2

I have some financial time series data that I would like to calculate the rolling cumulative product with a variable window size.

What I am trying to accomplish is using the following formula but instead of having window fixed at 12, I would like to use the value stored in the last column of the dataframe labeled 'labels_y' which will change over time.

df= (1 + df).rolling(window=12).apply(np.prod, raw=True) - 1

A sample of the data:

Out[102]: 
            div_yield  earn_variab    growth  ...     value  volatility  labels_y
date                                          ...                                
2004-02-23  -0.001847     0.003252 -0.001264  ...  0.004368   -0.004490       2.0
2004-02-24  -0.001668     0.007404  0.002108  ... -0.006122    0.008183       2.0
2004-02-25  -0.003272     0.004596  0.001283  ... -0.002057    0.005912       3.0
2004-02-26   0.001818    -0.003397 -0.003190  ...  0.001327   -0.003908       3.0
2004-02-27  -0.002838     0.009879  0.000808  ...  0.000350    0.010557       3.0
[5 rows x 11 columns]

and the final result should look like:

Out[104]: 
            div_yield  earn_variab    growth  ...     value  volatility  labels_y
date                                          ...                                
2004-02-23        NaN          NaN       NaN  ...       NaN         NaN       NaN
2004-02-24  -0.003512     0.010680  0.000841  ... -0.001781    0.003656       8.0
2004-02-25  -0.006773     0.015325  0.002125  ... -0.003834    0.009589      35.0
2004-02-26  -0.003126     0.008596  0.000193  ... -0.006851    0.010180      47.0
2004-02-27  -0.004294     0.011075 -0.001104  ... -0.000383    0.012559      63.0
[5 rows x 11 columns]

Rows 1 and 2 are calculated with a 2 day rolling window and rows 3, 4 and 5 use a 3 day window

I have tried using

def get_window(row):
    return (1 + row).rolling(window=int(row['labels_y'])).apply(np.prod, raw=True) - 1

df = df.apply(get_window, axis=1)

I realize that calculates the cumupative product in the wrong direction. I am struggling on how to get this to work.

Any help would be hugely appreciated.

Thanks

Kilgore
  • 65
  • 9

1 Answers1

2

def get_window(row, df):
    return (1 + df).rolling(window=int(row['labels_y'])).apply(np.prod, raw=True).loc[row.name]-1


result = df1.apply(get_window, axis=1, df=df1)

Does this do the trick? Highly inefficient, but I don't see another way except for tedious for-loops.

Thomas Kimber
  • 10,601
  • 3
  • 25
  • 42
Alex
  • 58
  • 4
  • Thanks Alex, works perfectly for what I need. I just edited your code for any future users that look at this so that all the information is in there (the second argument to the function was missing) – Kilgore Jul 17 '19 at 13:48