1

Consider a csv file:

customer    consumption     datetime
1           0.970           2013-06-29 19:00:00
1           0.625           2013-06-29 19:30:00
1           0.153           2013-06-29 20:00:00
1           0.484           2013-06-29 20:30:00
1           0.489           2013-06-29 21:00:00
1           0.970           2013-06-30 19:00:00
1           0.625           2013-06-30 19:30:00
1           0.153           2013-06-30 20:00:00
1           0.484           2013-06-30 20:30:00
1           0.489           2013-06-30 21:00:00
2           0.461           2013-06-29 19:00:00
2           0.894           2013-06-29 19:30:00
2           0.848           2013-06-29 20:00:00
2           0.977           2013-06-29 20:30:00
2           0.189           2013-06-29 21:00:00
2           0.461           2013-06-30 19:00:00
2           0.894           2013-06-30 19:30:00
2           0.848           2013-06-30 20:00:00
2           0.977           2013-06-30 20:30:00
2           0.189           2013-06-30 21:00:00

I want to aggregate(mean) consumption for each customer for each day. I can easily aggregate for each day using:

df.resample('D').mean()

But that aggregates data for all customer, instead I want to aggregate consumption for each customer on daily basis. I went through most of the articles posted (here) but they all aggregate based on date only.

JKL
  • 27
  • 7

1 Answers1

3

I think for simultaneously grouping is necessary pass Grouper to groupby:

df['datetime'] = pd.to_datetime(df['datetime'])
df1 = (df.groupby(['customer', pd.Grouper(freq='D', key='datetime')])['consumption']
         .mean()
         .reset_index())
print (df1)
   customer   datetime  consumption
0         1 2013-06-29       0.5442
1         1 2013-06-30       0.5442
2         2 2013-06-29       0.6738
3         2 2013-06-30       0.6738
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252