Consider the following pandas DataFrames:
df = pd.DataFrame([1, 2, 3, 2, 5, 4, 3, 7, 2])
df_top = pd.DataFrame([1, 2, 4, 5, 2, 3, 4, 5, 1])
label_frame = pd.DataFrame([0, 0, 0, 0, 0, 0, 0, 0, 0])
I want to do the following: if any of the values in df.iloc[0:3] is greater than df_top.iloc[0], then assign to the first element of label_frame the minimal index (within that window) for which this is satisfied.
For the first iteration it should look like this. My program checks df.iloc[0] > df_top.iloc[0] (False), df.iloc[1] > df_top.iloc[0] (True), and df.iloc[2] > df_top.iloc[0] (True), so it should replace the first element of label_frame with 1, since that is the minimal index for which the inequality is satisfied.
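The first iteration described above can be checked directly. This is only a minimal sketch of that single step; the names window, threshold, hits and first_hit are illustrative and not part of the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([1, 2, 3, 2, 5, 4, 3, 7, 2])
df_top = pd.DataFrame([1, 2, 4, 5, 2, 3, 4, 5, 1])

# First window: rows 0..2 of df, compared with row 0 of df_top.
window = df.iloc[0:3, 0].to_numpy()   # the values 1, 2, 3
threshold = df_top.iloc[0, 0]         # the value 1
hits = window > threshold             # element-wise comparison

# Position inside the window of the first True value.
first_hit = int(np.flatnonzero(hits)[0])
print(hits.tolist(), first_hit)       # [False, True, True] 1
```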
I want to repeat this procedure over the whole DataFrame df using the .rolling function combined with .apply (so the second step should be df.iloc[1:4] > df_top.iloc[1], and we replace the second element of label_frame).
Do you know how this can be done? I tried a custom function with a lambda, but I have no idea how to get a rolling window of df and return the minimal index for which the inequality is satisfied.
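One possible way to express this with .rolling and .apply is sketched below. It rests on two assumptions that are not in the question: a forward-looking window built with pandas.api.indexers.FixedForwardWindowIndexer, and rolling over a Series of row positions so that the applied function can look up both df and df_top. The names values, thresholds, positions and first_hit are illustrative. Windows with no match, and the incomplete windows at the end, are left as NaN.

```python
import numpy as np
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

df = pd.DataFrame([1, 2, 3, 2, 5, 4, 3, 7, 2])
df_top = pd.DataFrame([1, 2, 4, 5, 2, 3, 4, 5, 1])

WINDOW = 3
values = df[0].to_numpy()
thresholds = df_top[0].to_numpy()

def first_hit(window_positions):
    """Return the first offset in the window whose value exceeds the threshold."""
    rows = window_positions.astype(int)   # row positions covered by this window
    start = rows[0]                       # the window starts at the current row
    hits = np.flatnonzero(values[rows] > thresholds[start])
    return float(hits[0]) if hits.size else np.nan

# Roll over row positions (as floats, since rolling works on numeric data).
positions = pd.Series(np.arange(len(df), dtype=float))
indexer = FixedForwardWindowIndexer(window_size=WINDOW)

# min_periods=WINDOW leaves NaN where fewer than WINDOW rows remain.
labels = positions.rolling(indexer, min_periods=WINDOW).apply(first_hit, raw=True)
print(labels.tolist())
```

Because rolling.apply only passes the values of the object being rolled, the function reaches df and df_top through the enclosing scope; that is a design shortcut for this sketch rather than a general pattern.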
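import numpy as np
import pandas as pd

label_frame = label_frame.astype(float)  # float dtype so that NaN can be stored later
for i in range(len(label_frame) - 3):
    window_hits = df.iloc[i:i+3] > df_top.iloc[i]  # compare the 3-row window with row i of df_top
    if window_hits.any()[0]:
        # smallest position inside the window where the inequality holds
        label_frame.iloc[i] = np.where(window_hits)[0].min()
label_frame.iloc[-2:, 0] = np.nan  # the last two rows have no full 3-row window
label_frame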
     0
0  1.0
1  1.0
2  2.0
3  0.0
4  0.0
5  0.0
6  0.0
7  NaN
8  NaN
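A loop-free variant is also possible. The sketch below is an alternative, not the method used in the loop above: it assumes NumPy 1.20 or later for numpy.lib.stride_tricks.sliding_window_view, and the names windows, hits, first and labels are illustrative. It evaluates every full 3-row window (including the one starting at row 6) and writes NaN where no value in the window exceeds the threshold, so its result differs from the output above at rows 3 and 6.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

df = pd.DataFrame([1, 2, 3, 2, 5, 4, 3, 7, 2])
df_top = pd.DataFrame([1, 2, 4, 5, 2, 3, 4, 5, 1])

WINDOW = 3
values = df[0].to_numpy()
thresholds = df_top[0].to_numpy()

# One row per full window: shape (len(df) - WINDOW + 1, WINDOW).
windows = sliding_window_view(values, WINDOW)

# Compare each window with the threshold at the window's first row.
hits = windows > thresholds[: len(windows), None]

# argmax gives the first True per row; rows without any True become NaN.
first = hits.argmax(axis=1).astype(float)
first[~hits.any(axis=1)] = np.nan

# Pad the trailing rows that have no full window with NaN.
labels = pd.DataFrame(np.concatenate([first, np.full(WINDOW - 1, np.nan)]))
print(labels[0].tolist())
```

Using NaN for "no match" keeps that case distinguishable from a match at offset 0, which both appear as 0.0 in the loop's output.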