For the data frame below, how can I search every field in each row for the given keywords? If any column in a row matches a keyword, a new column should contain "yes" for that row, and NA otherwise. Thanks!

df <- data.frame(a = c("hi", "are", "you", "okay"),
                 b = c("I", "am", "okay", "thanks"),
                 c = c("how", "are", "you", "okay"))


keywords <- c("are", "you")

Desired result:

df <- data.frame(a = c("hi", "are", "you", "okay"),
                 b = c("I", "am", "okay", "thanks"),
                 c = c("how", "are", "you", "okay"),
                 match = c(NA, "yes", "yes", NA))
Xin

1 Answer

Here is a base R solution that uses apply() to walk over the rows and grepl(), the logical version of grep().

df <- data.frame(a = c("hi", "are", "you", "okay"),
                 b = c("I", "am", "okay", "thanks"),
                 c = c("how", "are", "you", "okay"))
keywords <- c("are", "you")

# Collapse the keyword vector into one regular expression, joined by the OR operator "|"
searchstring <- paste0(keywords, collapse = "|")

# Search row by row; the result is TRUE if any cell in the row matches, FALSE otherwise
df$match <- apply(df, 1, function(row) any(grepl(searchstring, row)))
df
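
The code above fills match with TRUE/FALSE, while the question asks for "yes" and NA. Note too that the regular expression matches substrings, so a cell such as "share" would also count as a hit for "are". A minimal sketch of one way to get the requested output follows, assuming a match means a cell is exactly equal to a keyword. The variable name hit is just illustrative.

```r
df <- data.frame(a = c("hi", "are", "you", "okay"),
                 b = c("I", "am", "okay", "thanks"),
                 c = c("how", "are", "you", "okay"))
keywords <- c("are", "you")

# TRUE for rows where at least one cell is exactly equal to a keyword
hit <- apply(df, 1, function(row) any(row %in% keywords))

# Map TRUE to "yes" and FALSE to NA, as in the desired result
df$match <- ifelse(hit, "yes", NA_character_)
df
```

If substring matching is what you actually want, keep the grepl() call from above and apply only the ifelse() step to its result.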
Dave2e