
I came across this Stack Overflow Q&A: https://stackoverflow.com/a/55479243/11799491

I want to know how to select all rows that do not match the str_detect pattern from the accepted answer. I tried putting a ! in front of str_detect, but it did not work.

Dataframe %>% filter_at(.vars = vars(names, Jobs),
                    .vars_predicate = any_vars(!str_detect(. , paste0("^(", paste(Filter_list, collapse = "|"), ")"))))

Thank you in advance for your help!

Gabriella

1 Answer


In dplyr 1.0.4 and later, we can use if_any within filter:

library(dplyr)
library(stringr)
Dataframe %>% 
  filter(!if_any(c(names, Jobs),
     ~ str_detect(., str_c("^(", str_c(Filter_list, collapse="|"), ")"))))
#    names  Jobs
#1  Mark Nojob

"Nojob" is not matched because the pattern is anchored with ^, so it only matches strings that start with "Jo". The "jo" in "Nojob" is in the middle of the string, and its case is different as well.
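
As a quick check of that anchoring behaviour, here is a minimal sketch in base R. It uses grepl rather than str_detect only to avoid a package dependency, and the variable names are just for illustration:

```r
# Same pattern the answer builds from Filter_list <- c('Jo')
pattern <- "^(Jo)"

starts_nojob  <- grepl(pattern, "Nojob")    # FALSE: "Nojob" does not start with "Jo"
starts_jockey <- grepl(pattern, "Jockey")   # TRUE: starts with "Jo"

# Without the anchor, the match is still case sensitive
contains_Jo <- grepl("Jo", "Nojob")         # FALSE: only lowercase "jo" is present
contains_jo <- grepl("jo", "Nojob")         # TRUE: unanchored, lowercase
```

So "Nojob" fails the pattern on two counts: the position of the match and its case.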


In older versions, we can negate (!) inside all_vars:

Dataframe %>%
   filter_at(.vars = vars(names, Jobs),
                   .vars_predicate = all_vars(!str_detect(. , paste0("^(", paste(Filter_list, collapse = "|"), ")"))))
#   names  Jobs
#1  Mark Nojob

The reason any_vars with ! didn't work is that it keeps a row if any one of the columns fails to match the string. So if one column in a row doesn't match while the other does, that row is still returned. With all_vars and the negation, a row is returned only when none of the columns specified in vars match.
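
To make the row-wise logic concrete, here is a rough sketch in base R that mimics what the two predicates compute. It does not use dplyr internals; hits, any_not, all_not and not_any are names introduced only for this illustration:

```r
Dataframe <- data.frame(names = c("John", "Jill", "Joe", "Mark"),
                        Jobs  = c("Mailman", "Jockey", "Jobhunter", "Nojob"))
Filter_list <- c("Jo")
pattern <- paste0("^(", paste(Filter_list, collapse = "|"), ")")

# Logical matrix: one column per variable, TRUE where the value starts with "Jo"
hits <- sapply(Dataframe[c("names", "Jobs")], grepl, pattern = pattern)

any_not <- apply(!hits, 1, any)   # like any_vars(!str_detect(...))
all_not <- apply(!hits, 1, all)   # like all_vars(!str_detect(...))
not_any <- !apply(hits, 1, any)   # like !if_any(..., ~ str_detect(...))

as.character(Dataframe$names[any_not])   # "John" "Jill" "Mark"
as.character(Dataframe$names[all_not])   # "Mark"
```

all_not and not_any agree on every row, which is why negating inside all_vars gives the same result as negating if_any.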

In older versions, we cannot put the negation (!) in front of any_vars itself, because any_vars does not return a logical vector; it only wraps the predicate for filter_at to evaluate. if_any, in contrast, returns a logical vector that is passed directly to filter, so it can be negated.

NOTE: In the current version, the function that corresponds to all_vars is if_all.
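
For completeness, a sketch of what the if_all form could look like, assuming dplyr 1.0.4 or later and stringr are installed. Here the negation sits inside the lambda, mirroring the all_vars call; pattern and result are names added only for this example:

```r
library(dplyr)
library(stringr)

Dataframe <- data.frame(names = c("John", "Jill", "Joe", "Mark"),
                        Jobs  = c("Mailman", "Jockey", "Jobhunter", "Nojob"))
Filter_list <- c("Jo")
pattern <- str_c("^(", str_c(Filter_list, collapse = "|"), ")")

# Keep rows where every selected column fails to match the pattern
result <- Dataframe %>%
  filter(if_all(c(names, Jobs), ~ !str_detect(., pattern)))
result
#   names  Jobs
# 1  Mark Nojob
```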

data

Dataframe <- data.frame("names" = c('John','Jill','Joe','Mark'), "Jobs" = c('Mailman','Jockey','Jobhunter',"Nojob"))

Filter_list <- c('Jo')
akrun