I have a df with a series of points over time, and I want to group them into buckets for each hour of the day (from 00:00:00 to 24:00:00).

Here is one piece of the df, which I call dfH:

     Hora de início Rodada
00:00:00     636
00:00:07    1184
00:00:09     680
00:00:23     651
00:00:30     539
00:01:16    1076
00:01:44     925
00:02:00     229
00:02:48     452
00:03:06    1143
00:03:55     401
00:04:10    1148
00:04:20     677
00:04:26     552
00:05:10    1182
00:05:44     677
00:06:03     657
00:06:23    1172
00:06:34     428
00:06:59     662
00:07:05    1131
00:07:30     675
00:07:53    1175
00:08:06    1121
00:08:33     564
00:08:43     673
00:08:45     670
00:09:06    1014
00:09:17     449
00:09:19    1156
Name: (TOTAL ESTRELAS, TOTAL), dtype: int64

I'm trying:

bins = np.arange(0, 24, 1)

groups = dfH.groupby(pd.cut(dfH,bins)).sum()

but then I get:

(TOTAL ESTRELAS, TOTAL)
(0, 1]      0
(1, 2]      0
(2, 3]      0
(3, 4]      0
(4, 5]      0
(5, 6]      0
(6, 7]      0
(7, 8]      0
(8, 9]      0
(9, 10]     0
(10, 11]    0
(11, 12]    0
(12, 13]    0
(13, 14]    0
(14, 15]    0
(15, 16]    0
(16, 17]    0
(17, 18]    0
(18, 19]    0
(19, 20]    0
(20, 21]    0
(21, 22]    0
(22, 23]    0
Name: (TOTAL ESTRELAS, TOTAL), dtype: int64

Maybe the index is not in an hour format, so I tried:

dfH.index = pd.to_datetime(dfH.index, format = '%H:%M:%S').dtype.hour

But then I got the error:

ValueError: time data 'TOTAL' does not match format '%H:%M:%S' (match)

1 Answer

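Try:

dfH.resample("1h").sum()

This only works if the index is a DatetimeIndex (a TimedeltaIndex or PeriodIndex also works), so convert the index first if it holds plain strings.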
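For illustration, here is a rough sketch of the conversion step. It assumes the index holds 'HH:MM:SS' strings plus a stray 'TOTAL' label, which is what the ValueError in the question suggests. The Series `s` and its values are made up as a stand-in for dfH, and dropping the non-time row is an assumption about what is wanted.

```python
import pandas as pd

# Hypothetical stand-in for dfH: string time labels plus a stray 'TOTAL' row.
s = pd.Series(
    [636, 1184, 680, 229, 452, 3181],
    index=["00:00:00", "00:00:07", "01:15:09", "01:59:59", "23:02:48", "TOTAL"],
)

# Parse the labels; anything that is not a time (e.g. 'TOTAL') becomes NaT.
times = pd.to_datetime(s.index, format="%H:%M:%S", errors="coerce")
is_time = times.notna()

# Keep only the rows whose label parsed as a time.
s = s[is_time]
hours = times[is_time].hour

# Option 1: group by the hour of day, then fill in the hours with no data.
hourly = s.groupby(hours).sum().reindex(range(24), fill_value=0)

# Option 2: give the Series a time-like index so that resample accepts it.
as_timedelta = s.copy()
as_timedelta.index = pd.to_timedelta(as_timedelta.index)
resampled = as_timedelta.resample("1h").sum()

print(hourly)
print(resampled)
```

Note that `.hour` is read from the parsed index itself, not from `.dtype`. Grouping by hour always gives one row per hour of the day after the `reindex`, while `resample` only covers the span between the first and last timestamps.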

Ayoub ZAROU