I'm trying to apply a custom function to a rolling window, but it raises a KeyError and I'm not sure why or how to fix it. I've looked here and here, but the answers didn't solve the issue.

Here is my code to reproduce the error:

import pandas as pd 
import numpy as np
from sklearn.feature_selection import chi2

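def return_chi2(df):
    # chi2 expects X as a 2-D (n_samples, n_features) array and y as a 1-D array
    X = df['signal'].to_numpy().reshape(-1, 1)
    y = df['PnL_binary'].to_numpy()
    # chi2 returns (statistics, p-values); take the p-value of the single feature
    return chi2(X, y)[1][0]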

df = pd.DataFrame()
df['signal'] = [0,0,0,1,1,1,0,0,0,1]
df['PnL_binary'] = [0,0,0,1,1,1,0,0,0,0]

return_chi2(df)
>>>0.04953461343562649

So far so good: the function works on the whole DataFrame and returns the chi-squared p-value. The issue is applying it to a rolling window:

df.rolling(3).apply(return_chi2)

    return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
  File "pandas\_libs\index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 90, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 135, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index_class_helper.pxi", line 109, in pandas._libs.index.Int64Engine._check_type
KeyError: 'signal'

I believe the error is related to the apply function attempting to look for 'signal' as an index value rather than a column. I tried:

df.rolling(3).apply(return_chi2, axis=1) 

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: apply() got an unexpected keyword argument 'axis'
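
One way to check that hypothesis is to record what `apply` actually hands to the function. The snippet below is a minimal sketch, assuming a pandas version where `Rolling.apply` defaults to `raw=False`; `inspect_window` and `seen` are hypothetical names introduced only for this check.

```python
import pandas as pd

df = pd.DataFrame({
    "signal": [0, 0, 0, 1, 1, 1, 0, 0, 0, 1],
    "PnL_binary": [0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
})

seen = []  # (type name, window length) for every call made by apply


def inspect_window(window):
    # Hypothetical helper: record what we were given, then return a dummy
    # scalar, because Rolling.apply expects one number per window.
    seen.append((type(window).__name__, len(window)))
    return 0.0


df.rolling(3).apply(inspect_window)
print(seen[:3])
```

If every entry reports a `Series`, the function is being called once per column per window rather than once per two-column window, so `df['signal']` becomes a label lookup in that Series' integer index, which would explain `KeyError: 'signal'`.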

I'm not sure where to go from here. I could use something like iterrows and step through the whole df, slicing each window manually, but that doesn't seem very Pythonic. Is there a better way to do this? I'd appreciate any help getting this working.
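
One possible direction that avoids iterrows is to roll over a single column only to obtain each window's index labels, then use those labels to slice the full DataFrame. The following is a sketch under stated assumptions, not a confirmed solution: `rolling_frame_apply` is a hypothetical helper, the index is assumed to be unique, and `agreement` is a stand-in two-column statistic used here so the example runs without scikit-learn.

```python
import pandas as pd

df = pd.DataFrame({
    "signal": [0, 0, 0, 1, 1, 1, 0, 0, 0, 1],
    "PnL_binary": [0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
})


def rolling_frame_apply(frame, func, window):
    # Hypothetical helper. With raw=False each window arrives as a Series
    # whose index is the slice of row labels in that window; those labels
    # are used to pull the matching rows (all columns) from the frame.
    # Assumes frame.index has no duplicate labels.
    first_col = frame.iloc[:, 0]
    return first_col.rolling(window).apply(
        lambda s: func(frame.loc[s.index]), raw=False
    )


def agreement(window_df):
    # Stand-in statistic: fraction of rows where the two columns agree.
    return float((window_df["signal"] == window_df["PnL_binary"]).mean())


result = rolling_frame_apply(df, agreement, 3)
print(result)
```

In principle `return_chi2` could be passed in place of `agreement`, since it also takes a DataFrame and returns a scalar. Windows in which `signal` is all zeros may produce NaN from `chi2`, so the output would need checking.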

user3062260
  • `type(df.rolling(3))` → `pandas.core.window.rolling.Rolling`, which expects an aggregation. – Trenton McKinney Jan 05 '21 at 22:38
  • The docs here suggest that I should be able to use a rolling apply method but they don't provide an example: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.window.rolling.Rolling.apply.html – user3062260 Jan 06 '21 at 09:12
