Using variables in `dplyr` filter

Question

I have built a database of consumers in which each column is a condition that they may meet. I want to select a subset containing only the consumers who meet each condition, so I wrote a for loop, but the filter function does not seem to work: every subset has 0 rows, even though I know I should get some results:

database <- data.frame(ID = 1:10, Con1 = c(1,1,0,1,0,0,0,1,0,1), Con2 = c(1,0,0,0,0,0,0,0,0,0))
varibles <- names(database)[2:3]

for(i in 1:length(varibles) ){
  tmp <- database %>%  
    filter_(varibles[i] == 1) 
}

I read that I should use filter with "_", but it does not work (Use variable names in functions of `dplyr`)

I solved the problem without using dplyr:

  tmp <- database  
  tmp <- tmp[tmp[, varibles[i]] == 1, ]

akrun · Accepted Answer · 2017-09-01T08:41:39.133

Perhaps we don't need a loop; we can use filter_at.

If we need to filter rows where either of the 'Con' columns is 1, we use any_vars to indicate that the predicate expression should hold for at least one of the selected columns (here they are selected by index; to select by name instead, wrap the selection as vars(matches("Con"))).

database %>%
     filter_at(2:3, any_vars(.==1))

If we need 1 in both columns, use all_vars instead

database %>%
     filter_at(2:3, all_vars(.==1))
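
For readers on a newer dplyr than the 0.7.2 used here: to my knowledge, from dplyr 1.0.4 onward the same two filters can be written with if_any() and if_all() inside a plain filter(). The following is only a sketch under that version assumption; the names either and both are illustrative and not part of the original code.

```r
library(dplyr)

database <- data.frame(ID = 1:10,
                       Con1 = c(1,1,0,1,0,0,0,1,0,1),
                       Con2 = c(1,0,0,0,0,0,0,0,0,0))

# Assumes dplyr >= 1.0.4, where if_any() and if_all() are available.
# Rows where at least one 'Con' column equals 1
either <- database %>%
  filter(if_any(starts_with("Con"), ~ .x == 1))

# Rows where every 'Con' column equals 1
both <- database %>%
  filter(if_all(starts_with("Con"), ~ .x == 1))
```

Selecting with starts_with("Con") avoids hard-coding the column positions 2:3.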

For multiple datasets, initialize a list and store the output of each iteration in it

tmp <- setNames(vector("list", length(varibles)), varibles)
for(i in seq_along(varibles)){
  tmp[[i]] <- database %>%  
              filter_at(vars(varibles[i]), all_vars(. == 1)) 
}

Or with sym from rlang (UQ() unquotes the symbol; later rlang releases write the same thing as !!)

tmp <- setNames(vector("list", length(varibles)), varibles)
for(i in seq_along(varibles)){
  tmp[[i]] <- database %>%  
             filter(UQ(rlang::sym(varibles[i])) == 1) 
}

tmp
#$Con1
#  ID Con1 Con2
#1  1    1    1
#2  2    1    0
#3  4    1    0
#4  8    1    0
#5 10    1    0

#$Con2
#  ID Con1 Con2
#1  1    1    1

The approaches above were run with R 3.4.1 and dplyr_0.7.2. Because the OP mentioned difficulties in updating R to a newer version, we also tried the get approach with R 3.1.3 and dplyr_0.4.3

tmp <- setNames(vector("list", length(varibles)), varibles)
for(i in seq_along(varibles)){ 
     tmp[[i]] <- database %>% 
                    filter(get(varibles[i], envir = as.environment(.))==1)
 }

tmp
#$Con1
#  ID Con1 Con2
#1  1    1    1
#2  2    1    0
#3  4    1    0
#4  8    1    0
#5 10    1    0

#$Con2
#  ID Con1 Con2
#1  1    1    1
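
If installing or updating packages is not possible at all, the OP's own base R workaround can be extended to produce the same named list. This is a minimal sketch that assumes only base R; the argument name v is illustrative.

```r
database <- data.frame(ID = 1:10,
                       Con1 = c(1,1,0,1,0,0,0,1,0,1),
                       Con2 = c(1,0,0,0,0,0,0,0,0,0))
varibles <- names(database)[2:3]

# One subset per condition column, using base subsetting only.
# drop = FALSE keeps the result a data frame in every case.
tmp <- lapply(setNames(varibles, varibles), function(v) {
  database[database[[v]] == 1, , drop = FALSE]
})
```

Unlike the dplyr versions, base subsetting keeps the original row names (1, 2, 4, 8, 10 for Con1) instead of renumbering the rows.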

No, I need to get separate data frames for each condition, as I work on them later and upload them to the server — AAAA, Sep 01 '17 at 07:18
I tried using your code but it does not work. The filter_at method produces the error " Error in function_list[[k]](value) : could not find function "filter_at" ", and it also appears when I copy your code without changes. I also cannot use " filter(UQ(rlang::sym(varibles[i])) == 1) ", as I have an old R and package ‘rlang’ is not available (for R version 3.3.1) — AAAA, Sep 01 '17 at 07:52
@MariuszSiatka I am using R 3.4.1. You could update R to a newer version — akrun, Sep 01 '17 at 07:55
Is that also why "filter_at" does not work? Updating R is not that easy, as at work I have strict rules regarding program installation, and getting all the approvals will take a lot of time and effort — AAAA, Sep 01 '17 at 08:02
@MariuszSiatka I tried with `3.1.3`; using the approach from the linked post, `filter(get(varibles[i], envir = as.environment(database))==1)` works for me. The full code is `tmp <- setNames(vector("list", length(varibles)), varibles); for(i in seq_along(varibles)){ tmp[[i]] <- database %>% filter(get(varibles[i], envir = as.environment(database))==1) }` — akrun, Sep 01 '17 at 08:37
