I have a pandas DataFrame with a MultiIndex, where the first index level is a group and the second is time. Within each group, I want to resample to daily frequency, taking the average of the intraday observations.

import pandas as pd
import numpy as np

data = pd.concat([pd.DataFrame([['A']*72, list(pd.date_range('1/1/2011', periods=72, freq='H')), list(np.random.rand(72))], index = ['Group', 'Time', 'Value']).T,
                  pd.DataFrame([['B']*72, list(pd.date_range('1/1/2011', periods=72, freq='H')), list(np.random.rand(72))], index = ['Group', 'Time', 'Value']).T,
                  pd.DataFrame([['C']*72, list(pd.date_range('1/1/2011', periods=72, freq='H')), list(np.random.rand(72))], index = ['Group', 'Time', 'Value']).T],
                  axis = 0).set_index(['Group', 'Time'])

This is what I tried so far:

daily_counts = data.groupby(pd.TimeGrouper('D'), level = ['Time']).mean()

But I get the following error:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'MultiIndex'

Any idea how to solve this?

FLab

1 Answer

You first need to cast the column to float (it has object dtype because the frame was built by transposing mixed-type rows) and then use Grouper:

data['Value'] = data['Value'].astype(float)
daily_counts = data.groupby([pd.Grouper(freq='D', level='Time'), 
                             pd.Grouper(level='Group')])['Value'].mean()

print(daily_counts)
Time        Group
2011-01-01  A        0.548358
            B        0.612878
            C        0.544822
2011-01-02  A        0.529880
            B        0.437062
            C        0.388626
2011-01-03  A        0.563854
            B        0.479299
            C        0.557190
Name: Value, dtype: float64
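
For a version that can be re-run and checked, here is a minimal sketch of the same approach. It assumes a seeded stand-in for the question's data, built from a dict so that Value is already float and no cast is needed. The names rng, times, daily and manual are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: 3 groups x 72 hourly rows.
rng = np.random.default_rng(0)
times = pd.date_range("2011-01-01", periods=72, freq=pd.Timedelta(hours=1))
data = pd.concat(
    [pd.DataFrame({"Group": g, "Time": times, "Value": rng.random(72)}) for g in "ABC"]
).set_index(["Group", "Time"])

# One Grouper per index level: daily bins on 'Time', plain grouping on 'Group'.
daily = data.groupby(
    [pd.Grouper(freq="D", level="Time"), pd.Grouper(level="Group")]
)["Value"].mean()

# Cross-check one cell by hand: group A, first 24 hourly rows (1 January).
manual = data.xs("A", level="Group")["Value"].iloc[:24].mean()
print(daily)
```

The result should have one row per (day, group) pair, with the index levels in the order the groupers were passed.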

Another solution:

data = data.reset_index(level='Group')
print (data.groupby('Group').resample('D')['Value'].mean())
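
The two solutions return the same numbers but with the index levels in a different order, (Time, Group) for the first and (Group, Time) for the second. The sketch below compares them on a seeded stand-in for the question's data. The variable names are illustrative assumptions, and recent pandas versions may emit a deprecation warning for the groupby/resample call.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: 3 groups x 72 hourly rows.
rng = np.random.default_rng(0)
times = pd.date_range("2011-01-01", periods=72, freq=pd.Timedelta(hours=1))
data = pd.concat(
    [pd.DataFrame({"Group": g, "Time": times, "Value": rng.random(72)}) for g in "ABC"]
).set_index(["Group", "Time"])

# First approach: one Grouper per index level -> index order (Time, Group).
via_grouper = data.groupby(
    [pd.Grouper(freq="D", level="Time"), pd.Grouper(level="Group")]
)["Value"].mean()

# Second approach: move 'Group' to a column, then resample within each group
# -> index order (Group, Time).
via_resample = (
    data.reset_index(level="Group").groupby("Group").resample("D")["Value"].mean()
)

# Align the level order before comparing the values.
aligned = via_grouper.swaplevel().sort_index()
same = np.allclose(aligned.to_numpy(), via_resample.sort_index().to_numpy())
print(same)
```
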
jezrael
  • Thanks a lot, this solved my problem. So I guess the main takeaway is that when I want to group by on a MultiIndex, I still need to pass all the index levels as groupers. Is that fair? – FLab Jan 05 '17 at 11:58
  • Yes, but the second solution is probably more common, see [here](http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#groupby-syntax-with-window-and-resample-operations). – jezrael Jan 05 '17 at 12:02
  • Regarding your second solution, it is worth highlighting this bug (fixed in pandas 0.19) that might prevent using kwargs in resample: https://github.com/pandas-dev/pandas/issues/13235 – FLab Jan 06 '17 at 12:17
  • Note that `pd.TimeGrouper()` is deprecated in favour of `pd.Grouper()` with the `freq` argument set. – fpersyn May 10 '19 at 11:59