I have a data.table like this:
library(data.table)
ffDummy_dt = data.table(
  Annotation = c("chr10:10..20,-", "chr10:25..30,-", "chr10:35..100,-",
                 "chr10:106..205,-", "chr10:223..250,-", "chr10:269..478,-",
                 "chr10:699..1001,-", "chr10:2000..2210,-", "chr10:2300..2500,-",
                 "chr10:2678..5678,-"),
  tpmOne   = c(0, 0, 0.213, 1, 1.2, 0.5, 0.7, 0.9, 0.8, 0.86),
  tpmTwo   = c(100, 1000, 1001, 1500, 900, 877, 1212, 1232, 1312, 0),
  tpmThree = c(0.2138595, 0, 0, 0, 0, 0, 0.6415786, 0, 0, 0)
)
I want to pass a query (a vector, or even a data.table if need be) like:
test_v = c(0,0,0.86)
I want to find out which row is the best match, i.e. the row that contains the most elements of test_v.
In my real use case, test_v is about 20 elements long and nrow(ffDummy_dt) is >>20 (but there will likely be only one perfect match per 20-element vector).
Currently,
which.max(apply(as.matrix(ffDummy_dt[, 2:ncol(ffDummy_dt), with = FALSE]), 1,
                function(k) sum(test_v %in% k)))
seems to work (gives the correct output in this case, which is 10), but this is not a data.table solution.
I've had a look here but can't quite figure out how to use %in% k above with data.table.
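To make the target concrete, here is a rough sketch of the same per-row count written inside data.table's j, without converting to a matrix. It assumes "best match" means exactly what the apply() call computes: for each row, how many elements of test_v occur anywhere among the tpm columns (duplicates in test_v are counted each time, and values are compared exactly). The names value_cols and n_match are made up for this illustration, and this is only one possible way to write it, not necessarily the fastest.

```r
library(data.table)

ffDummy_dt <- data.table(
  Annotation = c("chr10:10..20,-", "chr10:25..30,-", "chr10:35..100,-",
                 "chr10:106..205,-", "chr10:223..250,-", "chr10:269..478,-",
                 "chr10:699..1001,-", "chr10:2000..2210,-", "chr10:2300..2500,-",
                 "chr10:2678..5678,-"),
  tpmOne   = c(0, 0, 0.213, 1, 1.2, 0.5, 0.7, 0.9, 0.8, 0.86),
  tpmTwo   = c(100, 1000, 1001, 1500, 900, 877, 1212, 1232, 1312, 0),
  tpmThree = c(0.2138595, 0, 0, 0, 0, 0, 0.6415786, 0, 0, 0)
)
test_v <- c(0, 0, 0.86)

# Hypothetical helper name: every column except the Annotation label.
value_cols <- setdiff(names(ffDummy_dt), "Annotation")

# For each query value v, build a logical vector (one entry per row) saying
# whether any value column equals v; then add those vectors up over all v.
# The result is the same count as sum(test_v %in% k) for each row k.
n_match <- ffDummy_dt[
  ,
  Reduce(`+`, lapply(test_v, function(v) {
    Reduce(`|`, lapply(.SD, function(col) col == v))
  })),
  .SDcols = value_cols
]

# Index of the row with the highest count (first one in case of ties).
best_row <- which.max(n_match)
print(best_row)
```

On the example data this picks row 10, the same row as the apply() version. The comparison is exact (==), just like %in%, so real-valued tpm data that has been rounded or recomputed may need a tolerance instead.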