Selecting multiple rows based on data containing string in list

Question

I have a dataframe df with a column of text strings and a separate list of values:

c1 <- c("Jim Mackinnon","Jane Smit","Sunday 9-10","Wednesday 14-15","Friday 19-20")
c2 <- c("1123","4923","6924","4301","5023")
df <- as.data.frame(c2,c1)
df
           c1     c2
Jim Mackinnon   1123
Jane Smit       4923
Sunday 9-10     6924
Wednesday 14-15 4301
Friday 19-20    5023

list_values <- c("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday")

The aim is to select only those rows whose value in c1 contains one of the strings in list_values. In the example, this would mean selecting only rows 3-5 and discarding the rest. Is there a way of doing this without iteration?

score 5 · Accepted Answer · answered Aug 19 '20 at 05:55

You can paste all of list_values into a single string, separated by "|" (regex alternation), and use grepl to find the matching rows:

subset(df,grepl(paste0(list_values, collapse = "|"), rownames(df)))
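To see what is going on, here is a minimal sketch using the vectors from the question. The names pattern and keep are only for illustration; they are not part of the original code.

```r
c1 <- c("Jim Mackinnon","Jane Smit","Sunday 9-10","Wednesday 14-15","Friday 19-20")
list_values <- c("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday")

# Collapse the vector into one regular expression; "|" means "or"
pattern <- paste0(list_values, collapse = "|")
pattern
# "Sunday|Monday|Tuesday|Wednesday|Thursday|Friday|Saturday"

# grepl is vectorised: it returns one TRUE/FALSE per element, no loop needed
keep <- grepl(pattern, c1)
keep
# FALSE FALSE  TRUE  TRUE  TRUE
```

subset then keeps only the rows where this logical vector is TRUE.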

Note that you used as.data.frame, which made c1 the row names rather than a column. If what you actually meant was data.frame, then you can do:

df <- data.frame(c2,c1)

subset(df,grepl(paste0(list_values, collapse = "|"), c1))

#    c2              c1
#3 6924     Sunday 9-10
#4 4301 Wednesday 14-15
#5 5023    Friday 19-20
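One caveat: grepl matches substrings, so a value such as "Sundays 9-10" would also be selected. If that matters for your data, a possible variation (a sketch, not something the question asked for) is to wrap the alternation in word boundaries. The name pattern_whole is hypothetical, and perl = TRUE is used here only to make the \\b syntax unambiguous.

```r
c1 <- c("Jim Mackinnon","Jane Smit","Sunday 9-10","Wednesday 14-15","Friday 19-20")
c2 <- c("1123","4923","6924","4301","5023")
df <- data.frame(c2, c1)
list_values <- c("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday")

# \\b marks a word boundary, so only whole words from list_values match
pattern_whole <- paste0("\\b(", paste0(list_values, collapse = "|"), ")\\b")

res <- subset(df, grepl(pattern_whole, c1, perl = TRUE))

# A partial-word value is no longer matched
grepl(pattern_whole, "Sundays 9-10", perl = TRUE)
# FALSE
```

On the example data this selects the same three rows as the plain pattern.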

You can also do this with tidyverse functions:

library(dplyr)
library(stringr)

df %>% filter(str_detect(c1, str_c(list_values, collapse = "|")))
