Using Pandas 1.1.5, I have a test DataFrame like the following:
import numpy as np
import pandas as pd
df = pd.DataFrame({'id': ['a0','a0','a0','a1','a1','a1','a2','a2'],
                   'a': [4,5,6,1,2,3,7,9],
                   'b': [3,4,5,3,2,4,1,3],
                   'c': [7,4,3,8,9,7,4,6],
                   'denom_a': [7,8,9,7,8,9,7,8],
                   'denom_b': [10,11,12,10,11,12,10,11]})
I would like to apply the following custom aggregate function on a rolling window, where the function's calculation depends on the column name, like so:
def custom_func(s, df, colname):
    if 'a' in colname:
        denom = df.loc[s.index, "denom_a"]
        calc = s.sum() / np.max(denom)
    elif 'b' in colname:
        denom = df.loc[s.index, "denom_b"]
        calc = s.sum() / np.max(denom)
    else:
        calc = s.mean()
    return calc
df.groupby('id')\
  .rolling(2, 1)\
  .apply(lambda x: custom_func(x, df, x.name))
This results in TypeError: argument of type 'NoneType' is not iterable, because the windowed subsets of each column do not retain the names of the original df columns. That is, x.name evaluates to None inside the lambda, so None is passed to the function instead of the original column name as a string.
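One possible workaround, given here as a sketch under assumptions rather than a confirmed pandas idiom: roll over one column at a time and pass the column label in explicitly, so nothing depends on x.name. The DENOMS mapping and the names result_apply and result_builtin are hypothetical, introduced only for illustration. The exact-name lookup replaces the substring test because 'a' in 'denom_a' is also true. The sketch still relies on s.index carrying the original row labels, as the function above already does, and the data here is already ordered by id. The second half computes the same ratios from built-in rolling aggregations as a cross-check.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': ['a0','a0','a0','a1','a1','a1','a2','a2'],
                   'a': [4,5,6,1,2,3,7,9],
                   'b': [3,4,5,3,2,4,1,3],
                   'c': [7,4,3,8,9,7,4,6],
                   'denom_a': [7,8,9,7,8,9,7,8],
                   'denom_b': [10,11,12,10,11,12,10,11]})

# Hypothetical mapping (not in the original post): value column -> denominator column.
DENOMS = {'a': 'denom_a', 'b': 'denom_b'}

def custom_func(s, df, colname):
    """Window sum over the max denominator in the window; plain mean otherwise."""
    denom_col = DENOMS.get(colname)
    if denom_col is None:
        return s.mean()
    # Assumes s.index holds the original row labels of df.
    denom = df.loc[s.index, denom_col]
    return s.sum() / denom.max()

grouped = df.groupby('id')

# Option 1: one rolling apply per column, with the label passed via args.
result_apply = pd.DataFrame({
    col: grouped[col].rolling(2, min_periods=1)
                     .apply(custom_func, raw=False, args=(df, col))
    for col in ['a', 'b', 'c']
})

# Option 2: the same quantities from built-in aggregations, no apply needed.
def roll(col):
    return grouped[col].rolling(2, min_periods=1)

result_builtin = pd.DataFrame({
    'a': roll('a').sum() / roll('denom_a').max(),
    'b': roll('b').sum() / roll('denom_b').max(),
    'c': roll('c').mean(),
})

print(result_apply)
```

Both frames are indexed by (id, original row label). The built-in version avoids calling a Python function per window, which may matter on larger data, at the cost of spelling out each column's formula.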
Is there some way of making this approach work (say, retaining the column name being acted on with apply and passing that into the function)? Or are there any suggestions for altering it? I consulted the following reference for having the custom function utilize multiple columns within the same window calculation, among others: