I have a dataset that looks like the following:
ID Cond Time1 Time2
1 2 Start Stop1
1 3 Start abc
1 1 abc Stop2
1 2 Start abc
1 2 abc Stop1
2 2 Start abc
2 4 abc jkl
2 3 abc jkl
2 2 abc jkl
2 3 abc Stop2
3 2 Start abc
3 3 abc Stop2
3 2 Start Stop1
3 3 Start Stop1
3 3 Start abc
3 2 abc jkl
3 4 baba Stop1
4 2 Start Stop2
4 1 Start asd
4 2 abc Stop2
I need to filter the data based on a couple of criteria. A run of rows starts wherever Cond = 2 and Time1 = Start, and I need to keep every row from that point through the first stopping point (Time2 is either Stop1 or Stop2). Essentially, the result should look like this:
ID Cond Time1 Time2
1 2 Start Stop1
1 2 Start abc
1 2 abc Stop1
2 2 Start abc
2 4 abc jkl
2 3 abc jkl
2 2 abc jkl
2 3 abc Stop2
3 2 Start abc
3 3 abc Stop2
3 2 Start Stop1
4 2 Start Stop2
Also, the real dataset has over 140,000 observations, so efficiency is key. I was thinking about using the dplyr package, but I'm not sure how to go about this problem.
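To make the rule concrete, here is a minimal sketch of one way it might be expressed in dplyr. It rests on assumptions not stated above: that runs never cross from one ID into the next, that a stop only counts when it appears in Time2, and that a new Cond = 2 / Start row inside an open run simply restarts the run. The names df, run, and result are made up for illustration.

```r
library(dplyr)

# Sample data from the question (df is an illustrative name)
df <- read.table(text = "ID Cond Time1 Time2
1 2 Start Stop1
1 3 Start abc
1 1 abc Stop2
1 2 Start abc
1 2 abc Stop1
2 2 Start abc
2 4 abc jkl
2 3 abc jkl
2 2 abc jkl
2 3 abc Stop2
3 2 Start abc
3 3 abc Stop2
3 2 Start Stop1
3 3 Start Stop1
3 3 Start abc
3 2 abc jkl
3 4 baba Stop1
4 2 Start Stop2
4 1 Start asd
4 2 abc Stop2", header = TRUE, stringsAsFactors = FALSE)

result <- df %>%
  group_by(ID) %>%
  # Assumption: runs are tracked separately per ID.
  # run counts the start rows seen so far; 0 means no start yet.
  mutate(run = cumsum(Cond == 2 & Time1 == "Start")) %>%
  group_by(ID, run) %>%
  # Within a run, keep rows until a stop has been seen on an earlier row,
  # so the first stopping row itself is still kept.
  filter(
    run > 0,
    lag(cumsum(Time2 %in% c("Stop1", "Stop2")), default = 0L) == 0L
  ) %>%
  ungroup() %>%
  select(-run)

print(result)
```

On the sample data this keeps 12 of the 20 rows, in their original order. Everything is done with grouped cumulative sums rather than a row-by-row loop, which is the usual reason to expect it to scale to larger inputs, though I have not timed it on 140,000 rows.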