Obtaining minimal index with .rolling

Question

Consider the following pandas DataFrames:

import numpy as np
import pandas as pd

df = pd.DataFrame([1, 2, 3, 2, 5, 4, 3, 7, 2])
df_top = pd.DataFrame([1, 2, 4, 5, 2, 3, 4, 5, 1])
label_frame = pd.DataFrame([0, 0, 0, 0, 0, 0, 0, 0, 0])

I want to do the following:

If any of the values in df.iloc[0:3] is greater than df_top.iloc[0], assign to the first element of label_frame the minimal index for which this holds.

For the first iteration it should look like this:

The program checks: df.iloc[0] > df_top.iloc[0] is False, df.iloc[1] > df_top.iloc[0] is True, and df.iloc[2] > df_top.iloc[0] is True, so it should replace the first element of label_frame with 1, since that is the minimal index for which the inequality is satisfied.

I want to repeat this over the whole data frame df using the .rolling function combined with .apply (so the second step compares df[1:4] > df_top[1] and replaces the second element of label_frame).

Do you know how this can be done? I tried a custom function and a lambda, but I have no idea how to get a rolling window over df and return the minimal index for which the inequality is satisfied.

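# one full 3-row window starts at each of the first len - 2 rows
for i in range(len(label_frame) - 2):
    if (df.iloc[i:i+3] > df_top.iloc[i]).any()[0]:
        # position inside the window of the first value above the threshold
        label_frame.iloc[i] = np.where(df.iloc[i:i+3] > df_top.iloc[i])[0].min()
# the last two rows have no full window
label_frame.iloc[-2:, 0] = np.nan
label_frame

    0
0   1.0
1   1.0
2   2.0
3   0.0
4   0.0
5   0.0
6   1.0
7   NaN
8   NaN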
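
The .rolling plus .apply formulation described in the question could be sketched roughly as follows. This is only one possible reading, under a few assumptions that are not in the original post: the window looks forward (via pandas' FixedForwardWindowIndexer), the result is the position inside the window rather than the row label, and a window with no match yields NaN instead of 0. The names WINDOW, thresholds, first_exceeding and labels are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([1, 2, 3, 2, 5, 4, 3, 7, 2])
df_top = pd.DataFrame([1, 2, 4, 5, 2, 3, 4, 5, 1])

WINDOW = 3  # hypothetical name for the window length
# forward-looking windows: row i covers rows i .. i + WINDOW - 1
indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=WINDOW)
thresholds = df_top[0]

def first_exceeding(window):
    # window is a Series (raw=False); its first label identifies the row
    # whose df_top value serves as the threshold for the whole window
    hits = np.flatnonzero(window.to_numpy() > thresholds.loc[window.index[0]])
    # assumption: no match is reported as NaN rather than 0
    return float(hits[0]) if hits.size else np.nan

# min_periods=WINDOW leaves the incomplete trailing windows as NaN
labels = df[0].rolling(indexer, min_periods=WINDOW).apply(first_exceeding, raw=False)
print(labels)
```

Because .apply calls a Python function once per window, this is unlikely to be faster than the explicit loop; it mainly shows how the threshold can be looked up from the window's own index.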

Do you always want to do this with 3 elements, or many more? — mozway, Jul 21 '22 at 14:32
I would always want to go three i.e. `df[0:3] > df_top[0]` then `df[1:4] > df_top[1]`, `df[2:5] > df_top[2]` and so on... — Lucian, Jul 21 '22 at 14:33
Sure! I updated my question with the very primitive code written in an inefficient loop ;)) — Lucian, Jul 21 '22 at 14:47
probably the answer you're looking for can be found in [this post](https://stackoverflow.com/questions/55990574) i made a while back... — Ouyang Ze, Jul 22 '22 at 19:14

score 0 · Answer 1 · answered Jul 21 '22 at 14:42

IIUC, and if you only want to test 3 values, the easiest might be to use a 2D comparison:

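# build the 3-value window as columns (current, next, next-next) and compare each to df_top
a = df.assign(**{'1': df[0].shift(-1), '2': df[0].shift(-2)}).gt(df_top[0], axis=0).to_numpy()
m = a.any(axis=1)
# first matching position in the window, offset by the row index; NaN when nothing matches
label_frame[0] = df.index + np.where(m, a.argmax(axis=1), np.nan)

output:

     0
0  1.0
1  2.0
2  4.0
3  NaN
4  4.0
5  5.0
6  7.0
7  7.0
8  8.0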
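
If the window ever needs to be wider than three values, the same 2D comparison can be built without writing one shift per column. A minimal sketch, assuming NumPy 1.20 or later (for sliding_window_view) and assuming the wanted result is the position inside the window, with NaN for windows without a match and for the incomplete trailing windows; WINDOW and the intermediate names are illustrative only.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

df = pd.DataFrame([1, 2, 3, 2, 5, 4, 3, 7, 2])
df_top = pd.DataFrame([1, 2, 4, 5, 2, 3, 4, 5, 1])

WINDOW = 3  # hypothetical name for the window length
values = df[0].to_numpy()
thresholds = df_top[0].to_numpy()

# one row per full window: shape (len(values) - WINDOW + 1, WINDOW)
windows = sliding_window_view(values, WINDOW)
# compare every window against the df_top value of its first row
hits = windows > thresholds[:len(windows), None]
# argmax returns the first True per row; rows without any True become NaN
first = np.where(hits.any(axis=1), hits.argmax(axis=1), np.nan)

# rows without a full window stay NaN
label_frame = pd.DataFrame(np.full(len(df), np.nan))
label_frame.iloc[:len(first), 0] = first
print(label_frame)
```

Adding the row position (np.arange(len(first))) to first would give the absolute row index instead of the position inside the window.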

mozway


I updated my question by adding an example of the desired output. Could you please check whether we are on the same page? – Lucian Jul 21 '22 at 15:02
