I have a data frame that looks like this:
id | week | day_of_week | time_of_action |
---|---|---|---|
id1 | 1 | 1 | 22/07/21 08:00:00 |
id1 | 1 | 2 | 23/07/21 08:00:00 |
id1 | 1 | 2 | 23/07/21 08:10:00 |
id1 | 1 | 2 | 23/07/21 18:00:00 |
id1 | 2 | 2 | 23/07/21 10:00:00 |
id1 | 2 | 2 | 31/07/21 11:00:00 |
id1 | 2 | 4 | 31/07/21 12:00:00 |
id1 | 2 | 4 | 31/07/21 14:00:00 |
id1 | 2 | 4 | 31/07/21 15:00:00 |
id1 | 3 | 1 | 02/08/21 14:00:00 |
id1 | 3 | 2 | 03/08/21 15:00:00 |
I need to count the number of actions, based of count of time of actions, and group for each ID for each week, then find the day of each week with max actions and count how many actions. If a week has multiple max (like for week 3) so chose one of the max randomly.
The output I need looks like this:
id | week | max_number_actions | max_actions_day |
---|---|---|---|
id1 | 1 | 3 | 2 |
id1 | 2 | 3 | 4 |
id1 | 3 | 1 | 1 |
this is the code I used:
maxmin_com=df1.groupby(['id','week','day_of_week']).agg({'time_of_action':['count']}).reset_index()
max_com=maxmin_com.groupby(['id','week']).agg(**{'max_number_actions': (('time_of_action', 'count'), 'max'),'max_actions_day':(('day_of_week'),'first')})
After checking the output I realize that the use of 'first' is not what I need. I also tried this code:
maxmin_com.loc[maxmin_com.groupby(['id','week'])[('time_of_action', 'count')].idxmax()]
but than I get this error:
other must be a MultiIndex or a list of tuples
Can you please advice?
Thanks!