I have a dataframe,df after converting DateTimeC column to a datetime type such as:
Index DateTimeC eventName
0 2017-08-20 01:11:24.210000 ABC
1 2017-08-20 01:11:30.224000 CDE
2 2017-08-20 02:16:30.210000 CDE
3 2017-08-20 02:27:30.211000 CDE
2 2017-09-10 01:30:40.212000 DEF
3 2017-09-11 01:35:23.122000 CDE
4 2017-09-11 02:22:22.145000 CDE
5 2017-09-16 02:26:11.222000 DEF
I intend to groupby month and hour and count the count of events in eventName within the grouped object.So applying this code:
df2=df.groupby([df['DateTimeC'].dt.month,df['DateTimeC'].dt.hour])['EventName'].count()
I get:
Index EventName
8,1 2
8,2 2
9,1 2
9,2 2
However, i want to use the pivot_table () on the resulting series as month as the index and hour as the column while the value parameter should be the frequency. So the resulting dataframe should be:
Index 0 1 2 3 4 5 6 7 8... 24
8 0 2 2 0 0 0 0 0 0... 0
9 0 2 2 0 0 0 0 0 0... 0
So what will be the corresponding argument for the parameters of the pivot_table () since the date and time are in the same column: DateTimeC
I tried to add the rename_index to rename the column for the frequency/count result so I can pass the new name to the 'value' parameter in the pivot_table () using this code:
df2=df.groupby([df['DateTimeC'].dt.month,df['DateTimeC'].dt.hour])['EventName'].count().reset_index(name='frequency')
but i get this error:
ValueError: cannot insert DateTimeC, already exists
Also, get the mean per hour for each month and transform the number associated with a specific month to its word equivalent
Index averagePerHour
August 0.17
September 0.17