I'm trying to apply a custom function to a rolling window but it gives a KeyError and I'm not sure why or how to fix it. I've looked here and here, but the answers didn't solve the issue.
Here is my code to reproduce the error:
import pandas as pd
import numpy as np
from sklearn.feature_selection import chi2
def return_chi2(df):
    return chi2(df['signal'].to_numpy().reshape(len(df['signal'].index),1),
                df['PnL_binary'].to_numpy().reshape(len(df['PnL_binary'].index),1))[1][0]
df = pd.DataFrame()
df['signal'] = [0,0,0,1,1,1,0,0,0,1]
df['PnL_binary'] = [0,0,0,1,1,1,0,0,0,0]
return_chi2(df)
>>>0.04953461343562649
So far so good: the function works and returns the p-value of the chi-squared test (element [1] of the tuple that chi2 returns). The issue is applying it to a rolling window:
df.rolling(3).apply(return_chi2)
return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
File "pandas\_libs\index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 90, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 135, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index_class_helper.pxi", line 109, in pandas._libs.index.Int64Engine._check_type
KeyError: 'signal'
I believe the error occurs because apply looks up 'signal' as an index label rather than as a column name. I tried passing axis=1:
df.rolling(3).apply(return_chi2, axis=1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: apply() got an unexpected keyword argument 'axis'
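One way to check the guess about the index lookup is to probe what rolling(...).apply actually hands to the function. The snippet below is only a minimal sketch: probe and seen are hypothetical names introduced for illustration, and it assumes a pandas version where apply accepts raw=False. As far as I understand it, each call then receives the window of a single column as a Series, never the whole DataFrame, which would explain why 'signal' is searched for among the index labels.

```python
import pandas as pd

df = pd.DataFrame({
    "signal": [0, 0, 0, 1, 1, 1, 0, 0, 0, 1],
    "PnL_binary": [0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
})

seen = []  # hypothetical log of what each call receives

def probe(window):
    # Record the type and length of the object passed in by rolling.apply
    seen.append((type(window).__name__, len(window)))
    return 0.0  # apply expects a single numeric value back

df.rolling(3).apply(probe, raw=False)

# Every entry should describe a 3-element Series (one column's window)
print(set(seen))
```

If the printed set contains only Series entries, the function never sees both columns together, so a lookup like df['signal'] inside it cannot work.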
I'm not really sure where to go from here. I could loop over the whole df with something like iterrows and slice each window manually, but that doesn't seem very pythonic. Is there a better way to do this? I'd appreciate any help to get this going.
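For reference, the manual slicing fallback mentioned above could look roughly like the sketch below. rolling_frame_apply and agreement are hypothetical names made up for this example, not pandas functions, and agreement is only a stand-in statistic so the snippet runs without scikit-learn. The assumption is that return_chi2 from the code above could be passed in the same way, since each window arrives as a DataFrame with both columns.

```python
import numpy as np
import pandas as pd

def rolling_frame_apply(df, window, func):
    """Hypothetical helper: call func on each full window of rows (all columns)."""
    out = pd.Series(np.nan, index=df.index, dtype=float)
    for end in range(window, len(df) + 1):
        # df.iloc[...] yields a DataFrame, so column lookups work inside func
        out.iloc[end - 1] = func(df.iloc[end - window:end])
    return out

df = pd.DataFrame({
    "signal": [0, 0, 0, 1, 1, 1, 0, 0, 0, 1],
    "PnL_binary": [0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
})

def agreement(window_df):
    # Stand-in statistic: share of rows where the two columns match
    return (window_df["signal"] == window_df["PnL_binary"]).mean()

result = rolling_frame_apply(df, 3, agreement)
print(result)
```

The first window - 1 positions stay NaN, which mirrors what rolling(3) does by default. This is an explicit Python loop, so it will be slow on large frames.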