In a pandas time series, I am trying to compute a measure that combines a threshold with a duration.
For instance, I want the number of periods longer than 5 minutes during which the column ['pct'] is below 80.
The dataframe looks like this:
timestamp | pct |
---|---|
27-05-2021 10:11 | 95 |
27-05-2021 10:12 | 94 |
27-05-2021 10:13 | 80 |
27-05-2021 10:14 | 94 |
27-05-2021 10:15 | 80 |
27-05-2021 10:16 | 80 |
27-05-2021 10:17 | 80 |
27-05-2021 10:18 | 80 |
27-05-2021 10:19 | 80 |
27-05-2021 10:20 | 91 |
27-05-2021 10:21 | NaN |
27-05-2021 10:22 | 80 |
27-05-2021 10:23 | 80 |
27-05-2021 10:24 | 80 |
27-05-2021 10:25 | 80 |
27-05-2021 10:26 | 94 |
For this sample, it would thus need to identify 1 period (we do not want NaN values to be included in a period).
I've gotten part of the way with the post from Ben B and the answer from Alain T here: How to count consecutive periods in a timeseries above/below threshold?
I've attached an ugly image made in Microsoft Paint to illustrate the problem.
NB: It is quite a big dataframe, so I am not sure that iterating over the dataframe is the best idea, but any help is very much appreciated.
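
To make the goal concrete, here is a minimal vectorized sketch that avoids iterating over rows. It rebuilds the sample table above and labels consecutive runs with a shift/cumsum comparison. Two things in it are assumptions rather than part of the question: "below 80" is read as `<= 80` and "more than 5 minutes" as at least 5 consecutive one-minute rows, because that reading yields the single period expected for the sample. The names `THRESHOLD`, `MIN_ROWS`, `low`, `run_id` and `run_lengths` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the table above: one row per minute, 10:11 to 10:26.
df = pd.DataFrame({
    "timestamp": pd.date_range("2021-05-27 10:11", periods=16, freq="min"),
    "pct": [95, 94, 80, 94, 80, 80, 80, 80, 80, 91, np.nan, 80, 80, 80, 80, 94],
})

THRESHOLD = 80  # assumption: "below 80" is read as <= 80 to match the sample
MIN_ROWS = 5    # assumption: a period needs >= 5 consecutive one-minute rows

# True where pct is at or below the threshold. NaN compares as False,
# so a missing value ends a run instead of extending it.
low = df["pct"].le(THRESHOLD)

# A new run id starts every time `low` changes value.
run_id = low.ne(low.shift()).cumsum()

# Number of rows in each run of low values (runs of non-low rows are dropped).
run_lengths = low[low].groupby(run_id[low]).size()

# Count the runs that are long enough.
n_periods = int((run_lengths >= MIN_ROWS).sum())
print(n_periods)

# Optional: first timestamp, last timestamp and row count of each low run.
periods = df[low].groupby(run_id[low])["timestamp"].agg(["min", "max", "size"])
print(periods[periods["size"] >= MIN_ROWS])
```

If the timestamps were not evenly spaced, the `min` and `max` columns of `periods` could be subtracted to filter on elapsed time instead of row count.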