How to use str_detect to find specific values in a dataframe

Question

I have a dataset that I want to search in full for any occurrence of 'ACE' (upper case specifically).

My dataset also contains columns with words like 'placed', so my search is picking up rows that do not actually contain 'ACE' exactly.

My dataset looks like:

Study ID   Title                       Drug         ...
1          Study of placement          Gene1-drug
2          Study of ACE                Gene1-drug
3          Study of placed treatment   ACE-drug

I want to pull out rows 2 and 3 in this case, since each of them has 'ACE' exactly in at least one column.

So far I've been trying answers from similar questions but haven't got far.

For example I've tried:

test <- df %>% filter(str_detect('ACE', "[[:upper:]]"))

`df %>% filter_all(any_vars(str_detect(., '\\bACE\\b')))` – Ronak Shah, Nov 02 '20 at 13:02
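
The `\\b` in the comment above is a word boundary, so the pattern matches 'ACE' only as a whole word. Here is a minimal sketch of the same idea using `if_any()`, which is available in newer dplyr releases (1.0.4 and later), as an alternative to `filter_all()`. The sample data is rebuilt from the table in the question, and the name `result` is only illustrative.

```r
library(dplyr)
library(stringr)

# Sample data reconstructed from the question (an assumption about the real df)
df <- tibble(
  `Study ID` = c(1, 2, 3),
  Title = c("Study of placement", "Study of ACE", "Study of placed treatment"),
  Drug = c("Gene1-drug", "Gene1-drug", "ACE-drug")
)

# Keep rows where at least one column contains "ACE" as a whole word.
# as.character() lets the numeric `Study ID` column be searched too.
result <- df %>%
  filter(if_any(everything(), ~ str_detect(as.character(.x), "\\bACE\\b")))

print(result)
```

With this sample data, the filter should keep the rows with Study ID 2 and 3.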

score 0 · Answer 1 · answered Nov 02 '20 at 13:05

One way to do this is to paste each row into a single string and search that string for the pattern:

library(stringr)

df <- data.frame(
  v1 = c("a", "b", "c", "d"),
  v2 = c("c", "x", "y", "z")
)

# with `stringr`:
df[str_detect(apply(df, 1, function(x) paste(x, collapse = " ")), "c"), ]
# or in base R:
df[grepl("c", apply(df, 1, function(x) paste(x, collapse = " "))), ]
#   v1 v2
# 1  a  c
# 3  c  y

score 0 · Answer 2 · answered Nov 02 '20 at 13:06

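Using base R, if you want to try it (`any()` keeps a row when at least one column matches, including rows where 'ACE' appears in more than one column):

> df[apply(df, 1, function(x) any(grepl('ACE', x))), ]
# A tibble: 2 x 3
  `Study ID` Title                     Drug      
       <dbl> <chr>                     <chr>     
1          2 Study of ACE              Gene1-drug
2          3 Study of placed treatment ACE-drug  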
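
A plain 'ACE' pattern is case-sensitive, so it skips 'placed', but it would still match an upper-case word that merely contains those letters. The sketch below compares the substring pattern with a word-boundary pattern in base R. The data frame is hypothetical (the 'PLACED' row is invented to show the difference), and `has_ace_word`, `whole_word` and `substring_hits` are illustrative names.

```r
# Hypothetical data: the first row was invented to contain "ACE" only inside "PLACED"
df <- data.frame(
  Title = c("Study of PLACED trial", "Study of ACE", "Study of placed treatment"),
  Drug = c("Gene1-drug", "Gene1-drug", "ACE-drug"),
  stringsAsFactors = FALSE
)

# TRUE when any cell in the row contains "ACE" as a whole word
has_ace_word <- function(x) any(grepl("\\bACE\\b", x, perl = TRUE))

# Whole-word match: ignores "PLACED"
whole_word <- df[apply(df, 1, has_ace_word), ]

# Substring match: also picks up "PLACED"
substring_hits <- df[apply(df, 1, function(x) any(grepl("ACE", x))), ]

print(whole_word)
print(substring_hits)
```

On this data the whole-word version should return the second and third rows, while the substring version returns all three.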
