1

I'm looking for a way to easily do a Pandas aggregation across a rolling window. For example, 90-day rolling daily, or 12-month rolling monthly.

If I want to calculate a monthly average, I can very easily do something like data.groupby([pd.Grouper(freq='M')]).agg({'value': 'mean', 'count': 'sum'}).

However, in this case, I want to calculate the average over all observations in the preceding year period, reported monthly (or 30- or 90-day period, aggregated daily). I don't want to average the averages (e.g., from a resample), as some aggregations like geometric means would not be calculated correctly.

Is there a slick, idiomatic way of doing this? Everything I've done so far has been a few dozen lines of code duplicating a section of the data frame for each chunk, and I feel like there has to be something cleaner/faster.

Thanks!

terry87
  • 445
  • 1
  • 4
  • 15
  • 1
    Have you looked at `DataFrame.rolling`? https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rolling.html – PMende Sep 08 '18 at 00:07
  • Yes, but I didn't see a way to do it monthly – terry87 Sep 09 '18 at 03:12
  • I believe you simply need to use a window of `'1M'`. – PMende Sep 09 '18 at 21:05
  • 'M' is not supported. It throws `ValueError: is a non-fixed frequency`. – terry87 Sep 10 '18 at 16:27
  • Just to be explicit, `1M` doesn't work either. I'm assuming you're talking about doing something like `print(data.rolling('1M', center=False).agg({'value': gmean}))` – terry87 Sep 10 '18 at 22:23
  • I suppose it makes sense that month is not supported. Try using '30D' or something similar for whatever number of days you'd like to specify. – PMende Sep 10 '18 at 22:57
  • Unfortunately that's not good enough - for this case I need month/quarter/year. :-( – terry87 Sep 12 '18 at 03:27

0 Answers0