Here are two tidyverse ways. I added an extra entry to your vector to check that all keywords are tested, not just the first one. Since you said this is df$a, I made a tibble df, of which a is the only column, to fit better with dplyr operations, which are generally data frame based.
library(tidyverse)
a <- c("hj**a**jk", "fgfg", "re", "rec")
df <- tibble(a = a)
keywords <- c("a", "b", "c")
The more dplyr way is to start with the data frame and then pipe it into your filtering operation. The catch is that stringr::str_detect is vectorised over both the string and the pattern: given the whole column a and the keywords vector, it pairs them up element by element instead of testing every keyword against every value. Adding rowwise makes the expression run once per row, so each single value of a is checked against all of the keywords, and filter keeps only the rows of df where the value in a matches any of them.
df %>%
  rowwise() %>%
  filter(str_detect(a, keywords) %>% any())
#> Source: local data frame [2 x 1]
#> Groups: <by row>
#>
#> # A tibble: 2 x 1
#> a
#> <chr>
#> 1 hj**a**jk
#> 2 rec
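To make the per-row check in the rowwise version concrete, here is a small sketch of what str_detect returns for a single value, followed by one possible alternative that skips rowwise by joining the keywords into a single regular expression. The names pattern and result are mine, and the alternative assumes the keywords contain no regex metacharacters; treat it as an illustration rather than a replacement for the code above.

```r
library(dplyr)
library(stringr)

df <- tibble(a = c("hj**a**jk", "fgfg", "re", "rec"))
keywords <- c("a", "b", "c")

# One value against all keywords: one logical per keyword.
str_detect("rec", keywords)
#> [1] FALSE FALSE  TRUE

# any() collapses that to a single TRUE/FALSE for the row.
any(str_detect("rec", keywords))
#> [1] TRUE

# Possible alternative (assumes plain-text keywords with no regex
# metacharacters): build the pattern "a|b|c" and filter without rowwise().
pattern <- str_c(keywords, collapse = "|")

result <- df %>%
  filter(str_detect(a, pattern))
```

Because the pattern is a single string, str_detect is applied down the whole column at once, so no row-by-row grouping is needed.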
This second way was more intuitive for me, but fits less well with the dplyr way. I mapped over a (the standalone character vector, not the column in df) to check each value for any matches, then used the resulting logical vector as my filtering criterion. Normally dplyr operations are set up so that the value you're piping in is the first argument of the function, generally a data frame. But because I was actually piping in the second argument to filter, not the first, I specified df for the first argument and used the shorthand . for the second.
a %>%
  map_lgl(~str_detect(., keywords) %>% any()) %>%
  filter(df, .)
#> # A tibble: 2 x 1
#> a
#> <chr>
#> 1 hj**a**jk
#> 2 rec
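If you like the mapping idea but would rather keep the data frame as the thing being piped, one option is to call map_lgl inside filter on the column itself. This is only a sketch combining the two approaches above, and the name result is mine.

```r
library(dplyr)
library(purrr)
library(stringr)

df <- tibble(a = c("hj**a**jk", "fgfg", "re", "rec"))
keywords <- c("a", "b", "c")

# map_lgl() visits each value of column a in turn; for each one,
# str_detect() tests it against every keyword and any() reduces
# the result to a single TRUE/FALSE that filter() can use.
result <- df %>%
  filter(map_lgl(a, ~ any(str_detect(.x, keywords))))
```

This keeps df as the first argument of filter, so the . placeholder is not needed.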
Created on 2018-06-04 by the reprex package (v0.2.0).