
This same question was asked here and marked as a duplicate. However, it is not a duplicate and received no answers. I'm asking again.

I have

df = data.frame(A=1:10, B=sample(c('TT', 'TG', 'GG'), 10, replace=TRUE))
# df
#    A  B
#1   1 TG
#2   2 TG
#3   3 GG
#4   4 TT
#5   5 TT
#6   6 TT
#7   7 GG
#8   8 TT
#9   9 TG
#10 10 TT

If I specify the column I can use a dynamic list of values like:

> vals=c('TT', 'GG')
> df%>% filter(B %in% !!vals)
   A  B
1  3 GG
2  4 TT
3  5 TT
4  6 TT
5  7 GG
6  8 TT
7 10 TT

Now I want to set col='B' and supply the column name dynamically as well, with something like:

df%>% filter(!!col %in% !!vals)
[1] A B
<0 rows> (or 0-length row.names)

I can build the condition as a string:

> paste(col, "==", sapply(vals, function(x){paste0("'", x, "'")}), collapse=" | ")
[1] "B == 'TT' | B == 'GG'"

The following monstrosity does work:

> df %>% filter_(paste(col, "==", sapply(vals, function(x){paste0("'", x, "'")}), collapse=" | "))
   A  B
1  3 GG
2  4 TT
3  5 TT
4  6 TT
5  7 GG
6  8 TT
7 10 TT

I'm really hoping there is a simple, dplyr-esque syntax for this.

abalter

1 Answer


Following the tidy evaluation syntax, use:

df %>% filter(!!sym(col) %in% !!vals)

sym() (from rlang) converts your string to a symbol, and !! unquotes it, so filter() evaluates it as the column B rather than as the literal string "B".

Also, df %>% filter(!!as.name(col) %in% !!vals) works, as @A.Suliman points out.
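
For comparison, here is a small self-contained sketch of why the original attempt returns zero rows and how the symbol-based version differs. The data frame is a fixed six-row stand-in (not the random one from the question), and the names none, by_sym and by_pronoun are only illustrative. The last variant uses the .data pronoun, which is another tidy evaluation idiom and is not taken from the answer above.

```r
library(dplyr)
library(rlang)  # provides sym()

# Fixed example data (an assumption; the question's data is randomly sampled)
df <- data.frame(
  A = 1:6,
  B = c("TG", "GG", "TT", "TT", "GG", "TG"),
  stringsAsFactors = FALSE
)
col  <- "B"
vals <- c("TT", "GG")

# !!col unquotes to the string "B", so the test is "B" %in% vals,
# a single FALSE that is recycled over every row
none <- df %>% filter(!!col %in% !!vals)

# sym() turns "B" into the symbol B, which filter() looks up as a column
by_sym <- df %>% filter(!!sym(col) %in% !!vals)

# Alternative: subset the .data pronoun by column name
by_pronoun <- df %>% filter(.data[[col]] %in% !!vals)
```

Both by_sym and by_pronoun should keep only the rows whose B is TT or GG, while none stays empty.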

Mikko