Let's take two datasets:
import pandas as pd
import numpy as np
df = pd.DataFrame([1, 2, 3, 2, 5, 4, 3, 6, 7])
check_df = pd.DataFrame([3, 2, 5, 4, 3, 6, 4, 2, 1])
I want to do the following:
- If any of the numbers in df[0:3] is greater than check_df[0], return 1, and 0 otherwise.
- If any of the numbers in df[1:4] is greater than check_df[1], return 1, and 0 otherwise.
- And so on...
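To make the intended result concrete, the rule above can be spelled out with a plain loop. This is only a reference sketch, assuming a window of 3 and that positions without a full window should stay NaN; the names `window`, `values`, `thresholds` and `expected` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([1, 2, 3, 2, 5, 4, 3, 6, 7])
check_df = pd.DataFrame([3, 2, 5, 4, 3, 6, 4, 2, 1])

window = 3  # assumed window length, matching df[0:3], df[1:4], ...
values = df[0].to_numpy()
thresholds = check_df[0].to_numpy()

# Positions that do not have a full window of 3 values remain NaN.
expected = np.full(len(values), np.nan)
for i in range(len(values) - window + 1):
    # 1.0 if any value in df[i:i + window] exceeds check_df[i], else 0.0
    expected[i] = float((values[i:i + window] > thresholds[i]).any())

# expected holds: 0, 1, 0, 1, 1, 0, 1, NaN, NaN
```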
This can be done with the rolling function and a custom function:
def custom_fun(x: pd.Series):
    # Always compares against the first value of check_df
    return (x > float(check_df.iloc[0, 0])).any()
The custom function is then combined with the apply function:
df.rolling(3, min_periods = 3).apply(custom_fun).shift(-2)
The main problem with my solution is that I always compare with check_df[0], whereas in the i-th rolling window I should compare with check_df[i], but I have no idea how this can be specified in the rolling function. Could you please give me a hand with this problem?
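One possible way around this, offered as a sketch rather than a definitive answer, is to avoid the custom function altogether: "any value in the window is greater than the threshold" is equivalent to "the window maximum is greater than the threshold". The rolling maximum can then be compared row by row with check_df. The names `window`, `window_max` and `result` below are introduced for illustration, and the sketch assumes df and check_df share the same index and column labels.

```python
import pandas as pd

df = pd.DataFrame([1, 2, 3, 2, 5, 4, 3, 6, 7])
check_df = pd.DataFrame([3, 2, 5, 4, 3, 6, 4, 2, 1])

window = 3

# Rolling max, shifted so that row i holds the max of df[i:i + window].
window_max = df.rolling(window, min_periods=window).max().shift(-(window - 1))

# Compare row i of the window maxima with row i of check_df.
# Rows without a full window are kept as NaN instead of becoming 0.
result = (window_max > check_df).astype(float).where(window_max.notna())

# result[0] holds: 0, 1, 0, 1, 1, 0, 1, NaN, NaN
```

This relies only on built-in rolling aggregations, so it should also be faster than apply with a Python function on larger data.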