4

I'm using str_detect from the tidyverse to keep the rows of a data frame in which a column starts with any string from a list. Currently I join one str_detect call per column with | inside my filter statement. Is there a way to apply str_detect across multiple columns without writing the | by hand? The code below works, but does not scale.


library(dplyr)
library(stringr)

Dataframe <- data.frame("names" = c('John','Jill','Joe','Mark'), "Jobs" = c('Mailman','Jockey','Jobhunter',"Nojob"))

Filter_list <- c('Jo')

Dataframe %>% filter(str_detect(names, paste0("^(", paste(Filter_list, collapse = "|"), ")")) |
                     str_detect(Jobs, paste0("^(", paste(Filter_list, collapse = "|"), ")")))

  names      Jobs
1  John   Mailman
2  Jill    Jockey
3   Joe Jobhunter
Jenks

2 Answers

9

You can use filter_at:

Dataframe %>% filter_at(.vars = vars(names, Jobs),
                    .vars_predicate = any_vars(str_detect(. , paste0("^(", paste(Filter_list, collapse = "|"), ")"))))

If you want to apply the filter to all variables, you can use filter_all instead.
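In newer dplyr releases the scoped verbs (filter_at, filter_all) are superseded, and the same row filter can be written with if_any(). The following is only a sketch, assuming dplyr >= 1.0.4; the names Pattern and Result are introduced here for illustration and are not part of the question.

```r
library(dplyr)    # if_any() assumes dplyr >= 1.0.4
library(stringr)

Dataframe <- data.frame(
  names = c("John", "Jill", "Joe", "Mark"),
  Jobs  = c("Mailman", "Jockey", "Jobhunter", "Nojob"),
  stringsAsFactors = FALSE
)
Filter_list <- c("Jo")

# Build the anchored alternation once, e.g. "^(Jo)"
# (Pattern is a hypothetical helper name)
Pattern <- paste0("^(", paste(Filter_list, collapse = "|"), ")")

# Keep rows where at least one selected column starts with a listed prefix
Result <- Dataframe %>%
  filter(if_any(c(names, Jobs), ~ str_detect(.x, Pattern)))
```

Replacing c(names, Jobs) with where(is.character) should extend the check to every character column, which is roughly what filter_all does for an all-character data frame.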

domaeg
0

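I would convert this to long data first (gather is from tidyr) and then use str_detect():

# Anchor the pattern so only values that start with a listed string match
Pattern <- paste0("^(", paste(Filter_list, collapse = "|"), ")")
DF <- Dataframe %>% mutate(ID = row_number())
Index <- DF %>% gather(key, value, -ID) %>% filter(str_detect(value, Pattern))
DF %>% filter(ID %in% unique(Index$ID))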
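
The long-format idea can also be sketched with pivot_longer(), which tidyr recommends over gather() since version 1.0.0. This is a hedged variant rather than the answer's own code: it assumes tidyr >= 1.0.0 and that every column being searched is character, and the names Pattern, Hits and Result are made up for the example.

```r
library(dplyr)
library(tidyr)    # pivot_longer() assumes tidyr >= 1.0.0
library(stringr)

Dataframe <- data.frame(
  names = c("John", "Jill", "Joe", "Mark"),
  Jobs  = c("Mailman", "Jockey", "Jobhunter", "Nojob"),
  stringsAsFactors = FALSE
)
Filter_list <- c("Jo")
Pattern <- paste0("^(", paste(Filter_list, collapse = "|"), ")")

# Tag each row so matches found in long format can be traced back
DF <- Dataframe %>% mutate(ID = row_number())

# One row per (ID, column) pair; keep the IDs with at least one match
Hits <- DF %>%
  pivot_longer(-ID, names_to = "key", values_to = "value") %>%
  filter(str_detect(value, Pattern)) %>%
  distinct(ID)

# Keep the original rows whose ID matched, then drop the helper column
Result <- DF %>%
  semi_join(Hits, by = "ID") %>%
  select(-ID)
```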
akash87