I have a data.table and I'm trying to subset it so I get the lowest TEST_RESULT by ID and LAB_DT:
DT <- data.table::data.table(ID=c("1","1","1","2","2","3","3","3"),
LAB_DT=lubridate::as_date(c("1992-11-01",
"1992-11-01",
"1992-11-02",
"1992-11-04",
"1992-11-04",
"1992-11-06",
"1992-11-06",
"1992-11-08")),
TEST_RESULT=c(5.4,5.8,5.2,5.6,6,6,7,8))
setkeyv(DT,c("ID","LAB_DT","TEST_RESULT"))
ID LAB_DT TEST_RESULT
1: 1 1992-11-01 5.4
2: 1 1992-11-01 5.8
3: 1 1992-11-02 5.2
4: 2 1992-11-04 5.6
5: 2 1992-11-04 6.0
6: 3 1992-11-06 6.0
7: 3 1992-11-06 7.0
8: 3 1992-11-08 8.0
I have already successfully done this via my own method:
DT[,FIRST.LAB_DT:=0]
DT[, FIRST.LAB_DT := c(1L, FIRST.LAB_DT[-1]), by = .(ID,LAB_DT)]
ID LAB_DT TEST_RESULT FIRST.LAB_DT
1: 1 1992-11-01 5.4 1
2: 1 1992-11-01 5.8 0
3: 1 1992-11-02 5.2 1
4: 2 1992-11-04 5.6 1
5: 2 1992-11-04 6.0 0
6: 3 1992-11-06 6.0 1
7: 3 1992-11-06 7.0 0
8: 3 1992-11-08 8.0 1
DT[FIRST.LAB_DT==1,]
ID LAB_DT TEST_RESULT FIRST.LAB_DT
1: 1 1992-11-01 5.4 1
2: 1 1992-11-02 5.2 1
3: 2 1992-11-04 5.6 1
4: 3 1992-11-06 6.0 1
5: 3 1992-11-08 8.0 1
However, the actual data.table I'm working with has ~8e6 rows and I recently found that you can subset data.table using the keys in a much faster manner.
Would anyone know how to produce the same final output using keys?