
I'm using the mtcars dataset to illustrate my question.

For example, I want to subset the data to 4-cylinder cars. I can do:

mtcars %>% filter(cyl == 4)

In my work, I need to pass a string variable as my column name. For example:

var <- 'cyl'
mtcars %>% filter(var == 4)

I also did:

mtcars %>% filter(!!var == 4)

In both cases, I get an empty data frame.

zx8754
zesla
  • Does this answer your question? [Filter data frame by character column name (in dplyr)](https://stackoverflow.com/questions/27197617/filter-data-frame-by-character-column-name-in-dplyr) – camille Dec 21 '19 at 04:36

4 Answers


`!!` or `UQ` evaluates the variable, so `mtcars %>% filter(!!var == 4)` is the same as `mtcars %>% filter('cyl' == 4)`, where the condition always evaluates to false. You can verify this by printing `!!var` inside the filter call:

mtcars %>% filter({ print(!!var); (!!var) == 4 })
# [1] "cyl"
#  [1] mpg  cyl  disp hp   drat wt   qsec vs   am   gear carb
# <0 rows> (or 0-length row.names)
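
The reason for the empty result can be checked without any unquoting. This is a minimal sketch, assuming dplyr is attached: the string is compared with the number, which gives a single FALSE, and filter() recycles that value across every row.

```r
library(dplyr)

var <- "cyl"

# The number 4 is coerced to "4", and "cyl" == "4" is a single FALSE
string_test <- var == 4
string_test

# filter() recycles the single FALSE over all rows, so no row is kept
empty <- filter(mtcars, var == 4)
nrow(empty)
```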

To make `var` refer to the cyl column, you need to convert the string to the symbol `cyl` first, then unquote that symbol so it is evaluated as a column:

Using rlang:

library(rlang)
var <- 'cyl'
mtcars %>% filter((!!sym(var)) == 4)

#    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#1  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
#2  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
#3  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
# ...

Or use `as.symbol`/`as.name` from base R:

mtcars %>% filter((!!as.symbol(var)) == 4)

mtcars %>% filter((!!as.name(var)) == 4)
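
Since the usual reason for holding a column name in a string is to pass it to a function, the same pattern can be wrapped up as below. This is only a sketch: `filter_by_name()` is a hypothetical helper introduced for illustration, not part of dplyr or rlang, and it assumes the data has no column called `value`.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: keep the rows where the column named by `col` equals `value`
filter_by_name <- function(data, col, value) {
  # sym() turns the string into a symbol; !! unquotes it so it is looked up as a column
  data %>% filter((!!sym(col)) == value)
}

four_cyl <- filter_by_name(mtcars, "cyl", 4)
nrow(four_cyl)
```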
Psidom
  • How can I do the same thing, but for a list of strings (that eventually correspond to a list of column names)? – ifreak Jul 04 '18 at 07:34
  • @Psidom None of those methods works on my installation (using R 3.5), using the example you gave. The "sym" method returns a matrix of the same size as mtcars, but with all zeros. The "as.symbol" and "as.name" methods both return "invalid argument type" errors. Any idea what's going on? – DangerousDave Sep 10 '18 at 00:20
  • This is crazy if you need another package to do this in dplyr, what's wrong with `mtcars %>% filter(get(var) == 4)`? – s_baldur Oct 08 '18 at 08:40
  • Not sure if anything is wrong with using `get()`, but dplyr does already import rlang, so it will already be installed. – r_alanb Dec 21 '18 at 22:47
  • `get()` is simpler – tef2128 Apr 25 '19 at 14:42
  • @ifreak - For a list of strings, I don't know the tidyverse solution, but you can use a loop: `for(i in 1:length(x$colWithVarNames)) x$result[i] <- x[[x$colWithVarNames[i]]][i]` – user3799203 Feb 25 '21 at 14:39

I think @snoram's answer is elegant and is dependent solely on dplyr.

var <- c('cyl')
mtcars %>% filter(get(var) == 4)

You can also use this with a vector of column names. As a simple example, you can count the cars that pass each filter and collect the counts in a new data frame.

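library(dplyr)
library(tibble)  # provides rownames_to_column()

# add the car name as a regular column
mtcars <- rownames_to_column(mtcars, "car_name")

# column names to filter on
vector <- c("vs", "am", "carb")

# collect one row of counts per column
df2 <- data.frame()
for (variable in vector) {
  # count distinct cars whose value in the current column is 1
  df1 <- mtcars %>%
    filter(get(variable) == 1) %>%
    summarise(variable = n_distinct(car_name)) %>%
    data.frame()

  df2 <- rbind(df2, df1)
}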
daszlosek
  • For anyone who gets `Error in rownames_to_column(mtcars, "car_name")`: load `library(tibble)` or `library(tidyverse)`. `rownames_to_column()` is a function from tibble. – rubengavidia0x Feb 07 '22 at 22:28

It is now recommended to use the `.data` pronoun:

library(dplyr)

mtcars %>% filter(.data[[var]] == 4)

#                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#Datsun 710     22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
#Merc 240D      24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
#Merc 230       22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
#Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
#Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
#Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
#Toyota Corona  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
#Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
#Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
#Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
#Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
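
The `.data` pronoun handles one column name at a time. For several names held in a character vector, one possible approach is `if_all()` with `all_of()`. This is a sketch that assumes dplyr 1.0.4 or later, and the column names in `vars` are chosen only for illustration.

```r
library(dplyr)

# Illustrative column names; every one of them must equal 1 for a row to be kept
vars <- c("vs", "am")

# if_all() applies the test to each selected column and combines the results with AND
both <- mtcars %>% filter(if_all(all_of(vars), ~ .x == 1))

# The same condition written out by hand with the .data pronoun
both_manual <- mtcars %>% filter(.data[["vs"]] == 1, .data[["am"]] == 1)

nrow(both)
nrow(both_manual)
```

Replacing `if_all()` with `if_any()` would instead keep the rows where at least one of the named columns matches.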
Ronak Shah

You can use `eval(parse(text = ...))` to evaluate strings as variables:

mtcars %>% filter(eval(parse(text='cyl')) == 4)

Cybernetic
  • I loooove this solution! I don't understand why it had to be so hard to specify variable names as variables (having dplyr inside a function). This is an amazing and simple to understand solution. – Angelo Oct 14 '19 at 11:54
  • @Angelo because R is a poorly designed language where the specification of something as a name or as a value is up to the callee, not the caller. Hence you end up with a constantly ambiguous situation that needs to be resolved on a case-by-case basis – Stefano Borini Feb 17 '22 at 14:11