I'm having trouble using .apply or .aggregate in pandas on a rolling basis (assuming, of course, that this is the right way to solve my problem). Let's say I have a dataframe with two columns, A and B. I would like to create a column C that contains the rolling mean of B, computed only over the rows in the window where A equals 1. More generally, I would like to apply a custom function on a rolling basis with conditions involving several columns of the dataframe (e.g. the rolling sum of column A when B > x and/or C == y).
import pandas as pd
import numpy as np
df2 = pd.DataFrame({'A':[1,1,1,0,0,0,1,1,1],'B': [50,40,50,-20,20,10,10,-5,-2]}, index = np.arange(9))
Desired output would be (assuming a rolling window of 3):
df2 = pd.DataFrame({'A':[1,1,1,0,0,0,1,1,1],'B': [50,40,50,-20,20,10,10,-5,-2],
                    'C': [np.nan, np.nan, 46.67, 45, 50, np.nan, 10, 2.50, 1]}, index = np.arange(9))
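To make the rule behind column C explicit, here is a minimal sketch that reproduces the values above without a custom function. It assumes that C is the mean of the B values in each window of 3 rows, restricted to the rows where A equals 1, and that the first two rows stay empty because their window is not yet full. The names window, masked_b and c are only for illustration.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame(
    {"A": [1, 1, 1, 0, 0, 0, 1, 1, 1],
     "B": [50, 40, 50, -20, 20, 10, 10, -5, -2]},
    index=np.arange(9),
)
window = 3  # rolling window size used in the example above

# Replace B with NaN on rows where A != 1, so the rolling mean skips them.
masked_b = df2["B"].where(df2["A"] == 1)

# min_periods=1: a window produces a value as soon as it holds
# at least one row with A == 1; windows with none give NaN.
c = masked_b.rolling(window, min_periods=1).mean()

# Assumption: the first window - 1 rows should be NaN (incomplete window).
c.iloc[: window - 1] = np.nan

df2["C"] = c
print(df2)
```

This only covers conditions that can be expressed as a mask on the column being averaged. It does not answer the more general question of applying an arbitrary function across several columns.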
I have tried to define a function mean_1 as follows:
def mean_1(x):
    return np.where(x['A'] == 1, np.mean(x['B']), np.nan)
df2['C'] = df2.rolling(3).apply(mean_1)
and got the error: 'Series' object has no attribute 'A'
I guess it is related to the raw=False argument mentioned in the documentation.
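As far as I understand, rolling(...).apply calls the function once per column and passes it a one-dimensional window (a Series with raw=False, a NumPy array with raw=True), so the function never sees A and B together. A possible workaround, sketched below, is to roll over B alone with raw=False and use the index of each window to look up the full rows in the original dataframe. The function name mean_b_where_a_is_1 is hypothetical, and the sketch assumes the dataframe index is unique.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame(
    {"A": [1, 1, 1, 0, 0, 0, 1, 1, 1],
     "B": [50, 40, 50, -20, 20, 10, 10, -5, -2]},
    index=np.arange(9),
)

def mean_b_where_a_is_1(window_b):
    # Hypothetical helper: window_b is a Series holding one window of B.
    # Its index labels identify the rows, so the other columns can be
    # fetched from df2 (assumes a unique index).
    rows = df2.loc[window_b.index]
    # Mean of B over the rows of this window where A == 1.
    # An empty selection yields NaN.
    return rows.loc[rows["A"] == 1, "B"].mean()

# raw=False is required so that each window keeps its index.
df2["C"] = df2["B"].rolling(3).apply(mean_b_where_a_is_1, raw=False)
print(df2)
```

The condition inside the helper can be replaced with any expression over the columns of rows. The trade-off is speed: the helper runs as plain Python once per window, so this is likely to be slow on large dataframes.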
Thanks