I have a DataFrame nf as follows:

   StationID             DateTime  Channel  Count
0          1  2017-10-01 00:00:00        1      1
1          1  2017-10-01 00:00:00        1    201
2          1  2017-10-01 00:00:00        1      8
3          1  2017-10-01 00:00:00        1      2
4          1  2017-10-01 00:00:00        1      0
5          1  2017-10-01 00:00:00        1      0
6          1  2017-10-01 00:00:00        1      0
7          1  2017-10-01 00:00:00        1      0

... and so on.

I want to group the values by hour, for each Channel and StationID.

Required output:

StationID  DateTime               Channel    Count
1          2017-10-01 00:00:00    1          232
1          2017-10-01 00:01:00    1          23
2          2017-10-01 00:00:00    1          244...

... and so on

Nik
  • Hi, it is not a duplicate, because the values I need should also be grouped by hour. – Nik Mar 15 '18 at 15:33

2 Answers

I think you need groupby with an aggregating sum. To group the datetimes by hour, add floor, which sets the minutes and seconds to 0:

print (df)
   StationID             DateTime  Channel  Count
0          1  2017-12-01 00:00:00        1      1
1          1  2017-12-01 00:00:00        1    201
2          1  2017-12-01 00:10:00        1      8
3          1  2017-12-01 10:00:00        1      2
4          1  2017-10-01 10:50:00        1      0
5          1  2017-10-01 10:20:00        1      5
6          1  2017-10-01 08:10:00        1      4
7          1  2017-10-01 08:00:00        1      1

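import pandas as pd

# Parse the strings into datetime64 so the .dt accessor is available
df['DateTime'] = pd.to_datetime(df['DateTime'])

# Floor each timestamp to its hour, then sum Count per StationID / hour / Channel
# ('h' is the hourly alias; uppercase 'H' is deprecated in recent pandas)
df1 = (df.groupby(['StationID', df['DateTime'].dt.floor('h'), 'Channel'])['Count']
         .sum()
         .reset_index())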
print (df1)
   StationID            DateTime  Channel  Count
0          1 2017-10-01 08:00:00        1      5
1          1 2017-10-01 10:00:00        1      5
2          1 2017-12-01 00:00:00        1    210
3          1 2017-12-01 10:00:00        1      2

print (df['DateTime'].dt.floor('h'))
0   2017-12-01 00:00:00
1   2017-12-01 00:00:00
2   2017-12-01 00:00:00
3   2017-12-01 10:00:00
4   2017-10-01 10:00:00
5   2017-10-01 10:00:00
6   2017-10-01 08:00:00
7   2017-10-01 08:00:00
Name: DateTime, dtype: datetime64[ns]
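
For anyone who wants to run the floor approach end to end, here is a minimal self-contained sketch. It rebuilds the sample rows shown above by hand; the variable names `hour_start` and `hourly` are only illustrative, and the lowercase `'h'` alias is assumed to be accepted by the installed pandas version.

```python
import pandas as pd

# Sample rows copied from the printed frame above
df = pd.DataFrame({
    "StationID": [1] * 8,
    "DateTime": [
        "2017-12-01 00:00:00", "2017-12-01 00:00:00", "2017-12-01 00:10:00",
        "2017-12-01 10:00:00", "2017-10-01 10:50:00", "2017-10-01 10:20:00",
        "2017-10-01 08:10:00", "2017-10-01 08:00:00",
    ],
    "Channel": [1] * 8,
    "Count": [1, 201, 8, 2, 0, 5, 4, 1],
})
df["DateTime"] = pd.to_datetime(df["DateTime"])

# Truncate every timestamp to the start of its hour
hour_start = df["DateTime"].dt.floor("h")

# One row per (StationID, hour, Channel) with the summed Count
hourly = (
    df.groupby(["StationID", hour_start, "Channel"])["Count"]
      .sum()
      .reset_index()
)
print(hourly)
```

Because the floored Series keeps the name DateTime, the grouped column is still called DateTime after reset_index.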

But if the dates are not important and only the hour of day matters, use hour:

df2 = (df.groupby(['StationID', df['DateTime'].dt.hour, 'Channel'])['Count']
        .sum()
        .reset_index() 
        )
print (df2)
   StationID  DateTime  Channel  Count
0          1         0        1    210
1          1         8        1      5
2          1        10        1      7
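
In the hour-of-day result the key column is still named DateTime even though it now holds integers 0 to 23. A possible tweak, sketched here on four of the sample rows, is to rename the key before grouping; the name `Hour` is my own choice, not part of the answer above.

```python
import pandas as pd

# Four rows taken from the sample frame above
df = pd.DataFrame({
    "StationID": [1, 1, 1, 1],
    "DateTime": pd.to_datetime([
        "2017-12-01 00:10:00", "2017-12-01 10:00:00",
        "2017-10-01 10:20:00", "2017-10-01 08:10:00",
    ]),
    "Channel": [1, 1, 1, 1],
    "Count": [8, 2, 5, 4],
})

# Hour of day (0-23), renamed so the output column is not called DateTime
hour_of_day = df["DateTime"].dt.hour.rename("Hour")

by_hour = (
    df.groupby(["StationID", hour_of_day, "Channel"])["Count"]
      .sum()
      .reset_index()
)
print(by_hour)
```

Here the two 10 o'clock rows from different dates fall into the same group, which is the difference from the floor version.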
jezrael

Or you can use Grouper:

df.groupby([pd.Grouper(key='DateTime', freq='h'), 'Channel', 'StationID'])['Count'].sum()
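
A small runnable sketch of the Grouper variant, assuming DateTime has already been converted to datetime64 and using four rows from the sample data in the other answer. The grouping keys are passed as a single list, and `grouped` is just an illustrative name.

```python
import pandas as pd

# Four rows from the sample data in the other answer (all on 2017-10-01)
df = pd.DataFrame({
    "StationID": [1, 1, 1, 1],
    "DateTime": pd.to_datetime([
        "2017-10-01 10:50:00", "2017-10-01 10:20:00",
        "2017-10-01 08:10:00", "2017-10-01 08:00:00",
    ]),
    "Channel": [1, 1, 1, 1],
    "Count": [0, 5, 4, 1],
})

# pd.Grouper bins the DateTime column into hourly buckets;
# the other keys are plain column names in the same list
grouped = (
    df.groupby([pd.Grouper(key="DateTime", freq="h"), "Channel", "StationID"])["Count"]
      .sum()
      .reset_index()
)
print(grouped)
```

Without reset_index the result is a Series with a three-level index (DateTime, Channel, StationID).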
Jordi