I want to select residual values only where 3 in a row pass the threshold; my threshold is 3. The CSV data is attached at the link. What I currently do is a plain filter, but I also need a time criterion: consecutive data are those that pass the threshold and follow one another in time.

df[df.residual_value >= 3]

Data csv

Data

after residual < 3

519M4
  • Do you need [this](https://stackoverflow.com/questions/20069009/pandas-get-topmost-n-records-within-each-group) ? – jezrael Aug 19 '21 at 10:03
  • @jezrael What about the timing? I think 3 consecutive data points that pass the threshold should also be consecutive in time. – 519M4 Aug 19 '21 at 10:18

1 Answer

IIUC, you want to keep the rows that are greater than or equal to 3, but only where 3 consecutive rows meet that criterion. You can use rolling + min:

processing:

df[df['col'].rolling(window=3).min().shift(-2).ge(3)]
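
To see why this works, the chain can be split into its steps. This is a minimal sketch on a small made-up frame; the values and the names `window_min`, `aligned` and `mask` are only for illustration:

```python
import pandas as pd

# small illustrative frame (hypothetical values)
df = pd.DataFrame({'col': [5, 0, 3, 3, 7, 2, 4]})

# minimum over each row and the 2 rows before it (NaN for the first 2 rows)
window_min = df['col'].rolling(window=3).min()

# shift up by 2 so each row holds the minimum of itself and the 2 rows after it
aligned = window_min.shift(-2)

# True where a row and the next 2 rows are all >= 3 (NaN compares as False)
mask = aligned.ge(3)

result = df[mask]
print(result)
```

Note that the mask marks only the row that starts each qualifying window of 3, not the two rows that follow it.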

example dataset:

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({'col': np.random.randint(0,10,100)})
>>> df.head(15)
    col
0     5
1     0
2     3
3     3
4     7
5     9
6     3
7     5
8     2
9     4
10    7
11    6
12    8
13    8
14    1

output:

    col
2     3
3     3
4     7
5     9
9     4
10    7
11    6
...
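
The output above keeps only the rows that start a window of 3, so rows 6, 7, 12 and 13 are dropped even though they sit inside runs of values >= 3. If every row of such a run should be kept, one possible sketch is to label the runs and measure their length. This is a different technique (run-length grouping) and assumes the rows are already sorted by time; the names `above`, `run_id` and `run_len` are illustrative:

```python
import pandas as pd

# the first 15 rows of the example dataset shown above
df = pd.DataFrame({'col': [5, 0, 3, 3, 7, 9, 3, 5, 2, 4, 7, 6, 8, 8, 1]})

# True where the value passes the threshold
above = df['col'].ge(3)

# a new run id starts every time `above` changes value
run_id = above.ne(above.shift()).cumsum()

# number of True values in each run (0 for runs below the threshold)
run_len = above.groupby(run_id).transform('sum')

# keep rows that pass the threshold and belong to a run of at least 3
result = df[above & run_len.ge(3)]
print(result)
```

On these 15 rows this keeps indices 2 to 7 and 9 to 13, and drops row 0, which passes the threshold but stands alone.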
mozway