6

I have a Series with values indexed by timestamps. These timestamps are irregularly spaced and I would like to calculate something like the rolling mean (say) over the last N seconds, where N is a constant. Unfortunately, resampling at regular intervals before calculating the rolling quantity is NOT an option - the rolling quantity has to be calculated on the entire dataset.

Is there a good way to do this in pandas?

Jk1
  • 11,233
  • 9
  • 54
  • 64
wfh
  • 1,553
  • 3
  • 12
  • 10
  • possible duplicate of [Pandas: rolling mean by time interval](http://stackoverflow.com/questions/15771472/pandas-rolling-mean-by-time-interval) – John Zwinck Jun 21 '14 at 04:45

1 Answers1

3

You want to reset your index to an integer index and perform the rolling operation on a timestamp column.

# generate some data
data = pd.DataFrame(data={'vals':range(5), 'seed_ts': [np.datetime64('2017-04-13T09:00:00') for x in range(5)]})
data['random_offset'] = [np.timedelta64(randint(0, 5), 's') for x in range(5)]
data['cum_time'] = data['random_offset'].cumsum()
data['ts'] = data['seed_ts'] + data['cum_time']
data.index = data['ts']
data = data[['vals']]

Reset the index:

data = data.reset_index()

Compute the rolling mean over the past 5 seconds:

data['rolling_mean'] = data.rolling('5s', on='ts')['vals'].mean()
Abigail
  • 76
  • 3