I try to calculate for each element in a Dataframe
df1
the cumulative product over x
days (for example 3). Is there an efficient way to do that? So far, I only know to generate a normal cumulative product over the rows, without fixing the lookback period to x
days (=df2
). For example:
- 2022-01-02 ID1: 1.0528 = (ignoring NaN) 0.94 * 1.12
- 2022-01-05 ID1: 1.2002 = 0.94 * 1.12 * 1.14
- 2022-01-09 ID1: 1.4045 = 1.12 * 1.14 * 1.10
Optimally, it would also ignore NaN values and calculate the cumulative return over the remaining numbers. So far I only know how to compute that by creating a copy and shifting it, but I would like to do calculate the cumulative product for many columns over 50 rows, which would not be efficient.
df1:
ID1 ID2
Date
2022-01-02 NaN 0.95
2022-01-05 0.94 0.98
2022-01-09 1.12 NaN
2022-01-10 1.14 1.02
2022-01-11 1.10 1.00
2022-01-12 0.92 0.82
df2:
ID1 ID2
Date
2022-01-02 1.0528 0.9309
2022-01-05 1.2002 0.9996
2022-01-09 1.4045 1.0200
2022-01-10 1.1537 0.8364
2022-01-11 1.0120 0.8200
2022-01-12 0.9200 0.8200
For reproducability:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({
'Date':['2022-01-02', '2022-01-05', '2022-01-09', '2022-01-10', '2022-01-11', '2022-01-12'],
'ID1':[np.nan, 0.94, 1.12, 1.14, 1.1, 0.92],
'ID2':[0.95, 0.98, np.nan, 1.02, 1, 0.82]})
df1 = df1.set_index('Date')
Thanks a lot for you suggestion!