Imagine a snippet of the follow data frame:
ID ActivityName Time Type Shape
1 1 Request 0.000 Type_1 767
2 1 Request 600.000 Type_1 767
3 1 Start 600.000 Type_1 1376
4 1 Start 600.000 Type_1 1376
5 1 Delay 2962.295 Type_1 1022
6 1 Schedule Activities 600.000 Type_1 15
7 1 Schedule Activities 2062.295 Type_1 15
What I'm trying to do is to create two new columns based on the repeating entries in ActivityName.
Specifically, I want to combine two subsequent rows for the same activity into one row with a start and complete timestamp (from Time, in seconds).
Given that not all entries in ActivityName have a matching second entry (however, max two consecutive entries are identical), I would also like to delete such rows, as Delay in this case.
It should then look something like:
ID ActivityName StartTime EndTime Type Shape
1 1 Request 0.000 600.000 Type_1 767
2 1 Start 600.000 600.000 Type_1 1375
3 1 Schedule Activities 600.000 2062.295 Type_1 15
All categories in ActivityName occur many times in that column. I hope not to compare their associated Time not across the whole column, only those who have two consecutive identical occurrence.
Any ideas for how to go about this will be highly appreciated.