I have an input dataframe with 3 columns: Time, Name, and Flag.
I would like to collapse consecutive rows that have the same Name and Flag
into a single row with Start and End columns.
Input data frame
Time | Name | Flag |
---|---|---|
5/1/2023 1:01 | Peter | 1 |
5/1/2023 1:02 | Peter | 1 |
5/1/2023 1:03 | Peter | 1 |
5/1/2023 1:04 | Peter | 0 |
5/1/2023 1:05 | Peter | 0 |
5/1/2023 1:06 | Peter | 1 |
5/1/2023 1:07 | Peter | 1 |
5/1/2023 1:08 | Peter | 1 |
5/1/2023 1:01 | John | 1 |
5/1/2023 1:02 | John | 0 |
5/1/2023 1:03 | John | 0 |
5/1/2023 1:04 | John | 0 |
5/1/2023 1:05 | John | 0 |
5/1/2023 1:06 | John | 0 |
5/1/2023 1:07 | John | 1 |
5/1/2023 1:08 | John | 1 |
5/2/2023 1:10 | Peter | 1 |
5/2/2023 1:11 | Peter | 1 |
5/2/2023 1:20 | John | 0 |
5/2/2023 1:21 | John | 0 |
5/2/2023 1:22 | John | 0 |
Output data frame
Start | End | Name | Flag |
---|---|---|---|
5/1/2023 1:01 | 5/1/2023 1:03 | Peter | 1 |
5/1/2023 1:04 | 5/1/2023 1:05 | Peter | 0 |
5/1/2023 1:06 | 5/1/2023 1:08 | Peter | 1 |
5/2/2023 1:10 | 5/2/2023 1:11 | Peter | 1 |
5/1/2023 1:01 | 5/1/2023 1:01 | John | 1 |
5/1/2023 1:02 | 5/1/2023 1:06 | John | 0 |
5/1/2023 1:07 | 5/1/2023 1:08 | John | 1 |
5/2/2023 1:20 | 5/2/2023 1:22 | John | 0 |
In this case, "consecutive rows" means consecutive in time.
For example, 1:08 and 1:10 are not combined because there is a gap (1:09 is missing) between those two rows.
Could you please tell me how I can do that?
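
For reference, here is a minimal sketch of the rule described above. It assumes the data is a pandas DataFrame, that Time is in month/day/year format, and that "consecutive" means exactly one minute apart. The function name `collapse_runs` and its `step` parameter are hypothetical names chosen for illustration. A new run starts whenever Name changes, Flag changes, or the gap to the previous row is not exactly `step`.

```python
import pandas as pd

# Input data from the table above.
rows = [
    ("5/1/2023 1:01", "Peter", 1), ("5/1/2023 1:02", "Peter", 1),
    ("5/1/2023 1:03", "Peter", 1), ("5/1/2023 1:04", "Peter", 0),
    ("5/1/2023 1:05", "Peter", 0), ("5/1/2023 1:06", "Peter", 1),
    ("5/1/2023 1:07", "Peter", 1), ("5/1/2023 1:08", "Peter", 1),
    ("5/1/2023 1:01", "John", 1), ("5/1/2023 1:02", "John", 0),
    ("5/1/2023 1:03", "John", 0), ("5/1/2023 1:04", "John", 0),
    ("5/1/2023 1:05", "John", 0), ("5/1/2023 1:06", "John", 0),
    ("5/1/2023 1:07", "John", 1), ("5/1/2023 1:08", "John", 1),
    ("5/2/2023 1:10", "Peter", 1), ("5/2/2023 1:11", "Peter", 1),
    ("5/2/2023 1:20", "John", 0), ("5/2/2023 1:21", "John", 0),
    ("5/2/2023 1:22", "John", 0),
]
df = pd.DataFrame(rows, columns=["Time", "Name", "Flag"])


def collapse_runs(df, step=pd.Timedelta(minutes=1)):
    """Collapse time-consecutive rows with equal Name and Flag into Start/End rows.

    Assumption: rows are "consecutive" only if they are exactly `step` apart.
    """
    out = df.copy()
    # Assumption: Time strings are month/day/year hour:minute.
    out["Time"] = pd.to_datetime(out["Time"], format="%m/%d/%Y %H:%M")
    out = out.sort_values(["Name", "Time"]).reset_index(drop=True)

    # A row opens a new run if Name or Flag differs from the previous row,
    # or if the time difference to the previous row is not exactly `step`.
    new_run = (
        (out["Name"] != out["Name"].shift())
        | (out["Flag"] != out["Flag"].shift())
        | (out["Time"].diff() != step)
    )
    out["run"] = new_run.cumsum()  # running counter gives each run an id

    return (
        out.groupby("run")
        .agg(
            Start=("Time", "first"),
            End=("Time", "last"),
            Name=("Name", "first"),
            Flag=("Flag", "first"),
        )
        .reset_index(drop=True)
    )


result = collapse_runs(df)
print(result)
```

Because the sketch sorts by Name and then Time, John's rows come before Peter's in the result, so the row order differs from the output table above even though the eight runs are the same. If the sampling interval is not a fixed one minute, the `step` comparison would need to be replaced with whatever gap rule fits the data.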