I have the following dataframe:

       id  day   operation_time
0          142    0       08:30:00
1          142    0       08:50:00
2          142    0       08:51:00
3          142    0       09:00:00
4          142    0       09:10:00
5          142    0       09:20:00
6          142    0       09:21:00
7          142    1       08:41:00
8          142    1       08:50:00
9          142    1       08:51:00
10         142    2       08:41:00 
11         142    2       08:50:00
12         142    2       08:51:00
13         142    3       08:41:00
14         142    3       08:50:00
15         142    3       08:51:00
...

I want to group the dataframe by id and day so that each id has only one row per day, keeping the latest operation time for each id-day group. How can I do this?

I've tried using `iterrows`, but I don't think it's the most practical way to do this.

I want the dataframe to look like this:

           id  day operation_time
1         142    0       09:21:00
2         142    1       08:51:00
3         142    2       08:51:00
4         142    3       08:51:00
5         142    4       08:51:00
6         142    5       09:30:00
...
George Gibbs
Use `df.loc[df.groupby(["id", "day"])["operation_time"].idxmax()]` – jezrael Oct 28 '19 at 07:22

That did it! I had tried `idxmax` too, but I was grouping with `groupby(["id", "day", "operation_time"])` and not using `loc`, to no avail. Thanks a lot, and thanks for pointing me to the other post. I had looked at several posts but never found that one, because I didn't know how to describe what I was trying to do without the example. – George Gibbs Oct 28 '19 at 07:33
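
For anyone landing here later, the `idxmax` suggestion from the comments can be sketched roughly as follows. This is a minimal example, not a definitive implementation. It assumes `operation_time` is stored as zero-padded `"HH:MM:SS"` strings (the question does not state the dtype), so the times are first converted to seconds to make the comparison numeric. The sample data is just the first two days from the question, and the names `seconds`, `latest_idx` and `result` are made up for illustration.

```python
import pandas as pd

# Sample data: days 0 and 1 for id 142, copied from the question.
df = pd.DataFrame({
    "id": [142] * 10,
    "day": [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
    "operation_time": [
        "08:30:00", "08:50:00", "08:51:00", "09:00:00", "09:10:00",
        "09:20:00", "09:21:00", "08:41:00", "08:50:00", "08:51:00",
    ],
})

# Assumption: operation_time holds "HH:MM:SS" strings. Convert to seconds
# so that idxmax compares numbers rather than objects.
seconds = pd.to_timedelta(df["operation_time"]).dt.total_seconds()

# For each (id, day) group, find the row label of the latest time.
latest_idx = seconds.groupby([df["id"], df["day"]]).idxmax()

# Select those rows from the original frame and renumber them.
result = df.loc[latest_idx].reset_index(drop=True)
print(result)
```

Grouping on `["id", "day"]` only (not on `operation_time` as well) is what makes this work: each group then contains all times for that day, and `idxmax` returns the label of the latest one, which `loc` uses to pull the full row. If the column really is zero-padded strings, `df.sort_values("operation_time").drop_duplicates(["id", "day"], keep="last")` should select the same rows, though in a different order.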
