I have the following data.table:

   id time event
1:  1    1     1
2:  1    2     0
3:  2    1     0
4:  2    2     1

# Generate data;
library(data.table)

data <- data.table(
    id = c(1,1,2,2),
    time = c(1,2,1,2),
    event = c(1,0,0,1)
)

Within each id, I want to retain all rows up to and including the first row where event == 1, such that the data looks like:

   id time event
1:  1    1     1
2:  2    1     0
3:  2    2     1

I'm looking for a data.table solution.

r2evans
Serkan

1 Answer

data[, .SD[cumsum(shift(event, type = "lag", fill = 0) > 0) < 1,], by = id]
#       id  time event
#    <num> <num> <num>
# 1:     1     1     1
# 2:     2     1     0
# 3:     2     2     1

The use of cumsum(.) < 1 is effectively "keep rows until the condition is first true": each TRUE adds 1 to the cumulative sum, so the sum stays at 0 while everything is FALSE and is at least 1 from the first TRUE onward. However, since we need to keep the current row when we first see event == 1, we test the lagged value, shift(event, type = "lag"), rather than event itself. By default, shift fills the first row of each group with NA, which would make the cumulative sum NA for the whole group, so we use fill = 0 to mimic a preceding row with no event.
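To make the reasoning above concrete, here is a small sketch that breaks the one-liner into its intermediate steps on the question's data. The column names prev_event, seen_before, and keep are illustrative only and are not part of the original answer.

```r
library(data.table)

data <- data.table(
    id = c(1, 1, 2, 2),
    time = c(1, 2, 1, 2),
    event = c(1, 0, 0, 1)
)

# Work on a copy so the original table is left untouched
steps <- copy(data)

# Lagged event per id; the first row of each group is filled with 0
steps[, prev_event := shift(event, type = "lag", fill = 0), by = id]   # 0 1 0 0

# Number of events seen strictly before the current row
steps[, seen_before := cumsum(prev_event > 0), by = id]               # 0 1 0 0

# Keep a row as long as no earlier row in the group had an event
steps[, keep := seen_before < 1]                                      # TRUE FALSE TRUE TRUE

result <- steps[keep == TRUE, .(id, time, event)]
```

Here result contains the same three rows as the output of the one-liner above.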

r2evans
Or using `which.max` and `.I`: `data[data[ , .I[seq.int(which.max(event == 1))], by = id]$V1]`; see e.g. [Subset all rows before negative value in group](https://stackoverflow.com/questions/35575022/subset-all-rows-before-negative-value-in-group/35575307#35575307) – Henrik May 29 '22 at 11:37
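
One thing to be aware of with the `which.max` variant: for a group in which event is never 1, `which.max(event == 1)` returns 1, so only the first row of that group would be kept, whereas the `cumsum` approach keeps the whole group. The following is a minimal sketch of an index-based variant that keeps all rows in that case; the extra id 3 is a hypothetical group added purely for illustration, and whether such groups should be kept in full is an assumption about the desired behaviour.

```r
library(data.table)

data <- data.table(
    id = c(1, 1, 2, 2, 3, 3),
    time = c(1, 2, 1, 2, 1, 2),
    event = c(1, 0, 0, 1, 0, 0)   # id 3 is hypothetical: it never has an event
)

# Row indices to keep per id: up to the first event, or all rows if there is none
idx <- data[, {
    first <- which(event == 1)[1]              # NA when the group has no event
    n_keep <- if (is.na(first)) .N else first  # fall back to the full group
    .I[seq_len(n_keep)]
}, by = id]$V1

result <- data[idx]
```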