IPython / pandas: Is there a canonical way to detect rapid changes in a timeseries?

Question

I'm a novice data analyst looking at gas concentrations in a timeseries of a couple of thousand points (so, small). I graphed it with Matplotlib, and there are some easy-to-see points where the values change rapidly.

What is the canonical / easiest way to home in on those points?

Do you mean comparing values against previous value? `diff()` shows the difference between previous rows if that's any help — EdChum, Feb 19 '15 at 21:52
I am comparing values to earlier values in the time series. Say comparing n with n-10. — Dirk, Feb 20 '15 at 01:06
Yeah, like Ed said. Check out diff(). Maybe filter on the bigger values to slim down what you're looking at. There's also rolling_mean that could help identify more sustained spikes — Bob Haffner, Feb 20 '15 at 03:12
like Bob said, rolling_mean of the diff, and I'd spend some time with the window size for rolling_mean while deciding what I meant by "rapidly". — cphlewis, Feb 20 '15 at 03:24
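
The suggestions in the comments above (difference against the previous row, or against the row ten steps back, then filter on the larger values) could be sketched roughly as follows. The data, the series name `conc`, and the threshold of 2.0 are made up for illustration; "rapid" still has to be defined for the real measurements.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the gas readings: flat at 1.0, then a step up to 5.0.
# (Hypothetical data; substitute your own series.)
conc = pd.Series(np.r_[np.full(50, 1.0), np.full(50, 5.0)], name="conc")

step = conc.diff()              # value at n minus value at n-1
lag10 = conc.diff(periods=10)   # value at n minus value at n-10

# Keep only changes larger than an arbitrary, hand-picked threshold.
threshold = 2.0
jumps = step[step.abs() > threshold]
print(jumps)

# With a lag of 10, the same step is flagged on ten consecutive rows.
sustained = lag10[lag10.abs() > threshold]
print(sustained.index.tolist())
```

A longer lag catches slower drifts but smears a sharp step across several rows, so the lag and the threshold are worth tuning together.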

score 2 · Accepted Answer · edited May 23 '17 at 12:22

import pandas as pd

ff = pd.DataFrame(  # acquire data here
    columns=('Year', 'Recon'))
# Change from the previous row; the first row has no predecessor, so use 0
ff['diff'] = ff['Recon'].diff().fillna(0)
# pd.rolling_mean() has been removed from pandas; use Series.rolling(window).mean()
ff['rolling10'] = ff['diff'].rolling(10).mean()
ff['rolling5'] = ff['diff'].rolling(5).mean()
# Plot both smoothed change series against the year on one set of axes
ff.plot('Year', ['rolling5', 'rolling10'], subplots=False)

But note: my test data was evenly sampled. At the time of writing, it looks like the rolling_* functions don't apply to irregularly sampled time series yet, though there are some workarounds; see "Pandas: rolling mean by time interval".
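
For unevenly sampled data, more recent pandas releases accept a time offset string as the rolling window when the index is a DatetimeIndex. The following is a minimal sketch under that assumption; the dates, values, and the 3-day window are invented for illustration and are not from the original data.

```python
import pandas as pd

# Hypothetical irregularly spaced readings; note the gap between Jan 3 and Jan 10.
idx = pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-03",
                      "2015-01-10", "2015-01-11"])
recon = pd.Series([1.0, 1.0, 1.0, 5.0, 5.0], index=idx)

# Row-to-row change; the first row has no predecessor, so use 0.
change = recon.diff().fillna(0)

# An offset string makes the window span a time interval rather than a row count.
rolling_3d = change.rolling("3D").mean()
print(rolling_3d)
```

The raw `diff()` here ignores how much time passed between samples; dividing it by the gap between timestamps is one option if a rate of change is what matters.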
