
Is there a way to filter a data.table on the combination of more than one variable? To be more specific:

library(dplyr)
library(data.table)

# length.out = nrow(iris) gives exactly one a/b/c label per row
data <- iris %>% cbind(group = rep(c("a", "b", "c"), length.out = nrow(iris))) %>% as.data.table()

   Sepal.Length Sepal.Width Petal.Length Petal.Width Species group
1:          5.1         3.5          1.4         0.2  setosa     a
2:          4.9         3.0          1.4         0.2  setosa     b
3:          4.7         3.2          1.3         0.2  setosa     c
4:          4.6         3.1          1.5         0.2  setosa     a
5:          5.0         3.6          1.4         0.2  setosa     b
6:          5.4         3.9          1.7         0.4  setosa     c

and I want to filter it based on the following data.table:

filter <- data.table(Species = c("setosa", "versicolor", "setosa"), group = c("a", "b", "c"))

      Species group
1:     setosa     a
2: versicolor     b
3:     setosa     c

I can do it this way, by pasting the two columns together into a helper column filter1:

data[paste(Species, group) %in% filter[, filter1 := paste(Species, group)]$filter1]

However, I would like to know whether there is a more efficient, faster, or easier way to do it, perhaps something like:

data[.(Species, group) %in% filter] # does not work
George Sotiropoulos
  • @Jaap I guess the link is for a more complicated filtering operation, like `on=.(x = x, y != y)`. Here, I think `data[filter, on=names(filter), nomatch=0]` is probably the target, or maybe https://stackoverflow.com/questions/18969420/perform-a-semi-join-with-data-table – Frank Oct 11 '17 at 17:25
  • Yes, indeed, @Frank's comment answers my question and is exactly what I was looking for: as I stated, I was searching for a more elegant and easier way to do it. Frank's suggestion is sufficient; if you write it as an answer, I can accept it. – George Sotiropoulos Oct 12 '17 at 10:13

1 Answer

In this case, you can do

data[filter, on=names(filter), nomatch=0]
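
For readers who want to try this in isolation, here is a minimal self-contained sketch of that join. It rebuilds the question's tables (using `length.out` so that each iris row gets exactly one group label, which is an assumption about the intended data). Note that `on = names(filter)` assumes `filter` contains only the `Species` and `group` columns; if the helper column `filter1` from the question has already been added by reference, naming the join columns explicitly should be the safer form.

```r
library(data.table)

# Rebuild the question's data: iris plus a cycling a/b/c group column
data <- as.data.table(iris)
data[, group := rep(c("a", "b", "c"), length.out = .N)]

# Lookup table holding the (Species, group) pairs to keep
filter <- data.table(Species = c("setosa", "versicolor", "setosa"),
                     group   = c("a", "b", "c"))

# Join on every column of `filter`; nomatch = 0 drops pairs
# that have no matching rows in `data`
res <- data[filter, on = names(filter), nomatch = 0]

# Naming the columns explicitly selects the same rows and keeps
# working even if `filter` later gains extra columns
res2 <- data[filter, on = .(Species, group), nomatch = 0]
```

The rows come back in the order of `filter` (all `setosa`/`a` rows first, then `versicolor`/`b`, then `setosa`/`c`), not in the original order of `data`.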

See "Perform a semi-join with data.table" (https://stackoverflow.com/questions/18969420/perform-a-semi-join-with-data-table) for similar filtering joins.
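
As a rough illustration of what such a filtering join can look like, the sketch below shows one common semi-join idiom, not necessarily the one recommended in the linked question. It uses `which = TRUE` to get matching row numbers, so the original row order of `data` is kept and duplicate rows in `filter` cannot duplicate output rows. The names `idx`, `semi`, and `anti` are made up for this example.

```r
library(data.table)

# Same tables as in the question (one group label per iris row)
data <- as.data.table(iris)
data[, group := rep(c("a", "b", "c"), length.out = .N)]
filter <- data.table(Species = c("setosa", "versicolor", "setosa"),
                     group   = c("a", "b", "c"))

# which = TRUE returns the matching row numbers of `data` instead of the rows;
# unique() guards against repeated pairs in `filter`
idx <- data[unique(filter), on = .(Species, group), which = TRUE, nomatch = 0]

# Sorting the indices restores the original row order of `data`
semi <- data[sort(idx)]

# Anti-join for comparison: rows of `data` whose pair is NOT in `filter`
anti <- data[!filter, on = .(Species, group)]
```

Every row of `data` lands in exactly one of `semi` or `anti`, which makes for a quick sanity check on the filter.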

Frank