1

To find the maximum over the last 300 seconds:

import pandas as pd

# 16 to 17 minutes of time-series data.
df = pd.DataFrame(range(10000))
df.index = pd.date_range(1, 1000000000000, 10000)

# maximum over last 300 seconds. (outputs 9999)
df[0].rolling('300s').max().tail(1)

How can I exclude the most recent 30s from the rolling calculation? I went the max between -300s and -30s.

So, instead of 9999 being outputted by the above, I want something like 9700 (thereabouts) to be displayed.

Jase
  • 1,025
  • 1
  • 9
  • 34
  • 1
    Can you please post an [MRE](https://stackoverflow.com/help/minimal-reproducible-example) with sample input and expected output? Also see [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391). – AlexK Sep 09 '22 at 00:08

1 Answers1

1

You can compute the rolling max for the last 271 seconds (271 instead of 270 if you need that 300th second included), then shift the results by 30 seconds, and merge them with the original dataframe. Since in your example the index is at the sub-second level, you will need to utilize merge_asof to find the desired matches (you can use the direction parameter of that function to select non-exact matches).

import pandas as pd

# 16 minutes and 40 seconds of time-series data.
df = pd.DataFrame(range(10_000))
df.index = pd.date_range(1, 1_000_000_000_000, 10_000)

roll_max = df[0].rolling('271s').max().shift(30, freq='s').rename('roll_max')

res = pd.merge_asof(df, roll_max, left_index=True, right_index=True)

print(res.tail(1))
#                         0  roll_max
# 1970-01-01 00:16:40  9999    9699.0
AlexK
  • 2,855
  • 9
  • 16
  • 27