I have a dataframe grouped by bikeid and sorted by time. If type repeats consecutively within a bikeid, I want to keep only the row with the earliest time. In the case below, I want to remove the rows with index 17, 19, 33, 39 and 41.
Once the duplicates are removed, the approach from "subtract value from previous row by group" will get me what I need.
bikeid type time
1 1004 repair_time 2019-04-04 14:07:00
3 1004 red_time 2019-04-19 00:54:56
8 1004 repair_time 2019-04-19 12:47:00
10 1004 red_time 2019-04-19 16:45:18
15 1004 repair_time 2019-04-20 04:42:00
17 1004 repair_time 2019-04-20 05:29:00
19 1004 repair_time 2019-04-28 07:33:00
27 1010 repair_time 2019-04-20 10:05:00
29 1010 red_time 2019-04-22 20:51:21
33 1010 red_time 2019-04-23 11:02:34
37 1010 repair_time 2019-04-24 17:20:00
39 1010 repair_time 2019-04-24 18:30:00
41 1010 repair_time 2019-04-24 18:42:00
The final result should look like this:
bikeid type time
1 1004 repair_time 2019-04-04 14:07:00
3 1004 red_time 2019-04-19 00:54:56
8 1004 repair_time 2019-04-19 12:47:00
10 1004 red_time 2019-04-19 16:45:18
15 1004 repair_time 2019-04-20 04:42:00
27 1010 repair_time 2019-04-20 10:05:00
29 1010 red_time 2019-04-22 20:51:21
37 1010 repair_time 2019-04-24 17:20:00
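
One possible way to get this result, as a minimal sketch: compare each row's type with the previous row's type inside the same bikeid, and keep only the rows where it changes. The names df, prev_type, keep and result are just illustrative, and the sketch assumes time is (or can be parsed as) a datetime column.

```python
import pandas as pd

# Rebuild the sample data from the question, keeping the original index labels.
df = pd.DataFrame(
    {
        "bikeid": [1004] * 7 + [1010] * 6,
        "type": [
            "repair_time", "red_time", "repair_time", "red_time",
            "repair_time", "repair_time", "repair_time",
            "repair_time", "red_time", "red_time",
            "repair_time", "repair_time", "repair_time",
        ],
        "time": pd.to_datetime([
            "2019-04-04 14:07:00", "2019-04-19 00:54:56", "2019-04-19 12:47:00",
            "2019-04-19 16:45:18", "2019-04-20 04:42:00", "2019-04-20 05:29:00",
            "2019-04-28 07:33:00", "2019-04-20 10:05:00", "2019-04-22 20:51:21",
            "2019-04-23 11:02:34", "2019-04-24 17:20:00", "2019-04-24 18:30:00",
            "2019-04-24 18:42:00",
        ]),
    },
    index=[1, 3, 8, 10, 15, 17, 19, 27, 29, 33, 37, 39, 41],
)

# Make sure rows are ordered by time inside each bikeid.
df = df.sort_values(["bikeid", "time"])

# Type of the previous row within the same bikeid (missing for the first row of each group).
prev_type = df.groupby("bikeid")["type"].shift()

# Keep the first row of each group and every row whose type differs from the previous one.
keep = prev_type.isna() | df["type"].ne(prev_type)
result = df[keep]

print(result)
```

Because the frame is sorted by time first, the row kept from each run of repeated types is the earliest one. The shift is done per bikeid so that the first row of one bike is never compared with the last row of the previous bike.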