I'm trying to write a function which takes a data.table
, a list of columns and a list of values and selects rows such that each column is filtered by the respective value.
So, given the following data.table
:
> set.seed(1)
> dt = data.table(sample(1:5, 10, replace = TRUE),
sample(1:5, 10, replace = TRUE),
sample(1:5, 10, replace = TRUE))
> dt
V1 V2 V3
1: 1 5 5
2: 4 5 2
3: 1 2 2
4: 2 2 1
5: 5 1 4
6: 3 5 1
7: 2 5 4
8: 3 1 3
9: 3 1 2
10: 1 5 2
A call to filterDT(dt, c(V1, V3), c(1, 2))
would select the rows where V1 = 1
and V3 = 2
(rows 3 and 10 above).
My best thought was to use .SD
and .SDcols
to stand in for the desired columns and then do a comparison within i
(from dt[i,j,by]
):
> filterDT <- function(dt, columns, values) {
dt[.SD == values, , .SDcols = columns]
}
> filterDT(dt, c("V1", "V3"), c(1, 2))
Empty data.table (0 rows and 3 cols): V1,V2,V3
Unfortunately, this doesn't work, even if only filtering by one column.
I've noticed all examples of .SD
I've found online use it in j
, which tells me I'm probably doing something very wrong.
Any suggestions?