77

I would like to exclude rows containing the string "REVERSE". The values in my rows do not match the word exactly; they only contain it.

My input data frame:

   Value   Name 
    55     REVERSE223   
    22     GENJJS
    33     REVERSE456
    44     GENJKI

My expected output:

   Value   Name 
    22     GENJJS
    44     GENJKI
Tunaki
user3091668

7 Answers

123

This should do the trick:

df[- grep("REVERSE", df$Name),]

Or a safer version would be:

df[!grepl("REVERSE", df$Name),]
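To illustrate why the second form is described as safer, here is a minimal sketch using the data from the question. The pattern "NOMATCH" is a hypothetical string assumed to appear in no row; it is not part of the original question.

```r
# Sample data from the question
df <- data.frame(
  Value = c(55, 22, 33, 44),
  Name  = c("REVERSE223", "GENJJS", "REVERSE456", "GENJKI"),
  stringsAsFactors = FALSE
)

# "NOMATCH" is a hypothetical pattern that matches no row
idx <- grep("NOMATCH", df$Name)   # integer(0): no matching positions

# Negating an empty index is still an empty index, so no rows are selected
unsafe <- df[-idx, ]

# grepl() returns one logical per row; negated, it is TRUE for every row here
safe <- df[!grepl("NOMATCH", df$Name), ]

nrow(unsafe)  # every row was dropped
nrow(safe)    # every row was kept
```

When the pattern does match at least one row, both forms return the same rows; they only differ in the no-match case.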
Rich Scriven
Pop
  • 6
    What do you mean by "safer"? – Jason Melo Hall Jan 24 '17 at 21:19
  • What if I want to delete the rows containing a "(". The following does not seem to work: df[!grepl("(", df$Name),] – nemja Aug 11 '17 at 11:56
  • 2
    @nemja The `grepl` function uses regular expressions for the match, which have a syntax where `(` is meaningful. If you set the named parameter `fixed = TRUE` then `grepl` will perform a literal match without using regular expressions, which should work for your use case. – cuttlefish Oct 11 '17 at 20:27
  • 3
    @JasonMeloHall The minus (-) operator uses negative indexing, whereas the negation (!) operator uses logical indexing, so negation is safer than minus. – shantanu pathak Aug 30 '18 at 11:58
  • How could you modify this to also delete the row above the row that contains the matching string? – KNN Feb 26 '19 at 03:08
  • Thanks a lot - googled for about 1h before I found your answer. Works brilliant. – Simone Aug 20 '19 at 10:30
  • Hi guys, I tried this code while making my contingency table. Although I saved the filtered data frame as a new object, the filtered-out rows are still kept as levels with "0" entries. How do I delete them completely from the new data frame? – ML33M Mar 11 '20 at 16:24
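For the literal "(" case raised in the comments, a short sketch of the `fixed = TRUE` suggestion, alongside the alternative of escaping the parenthesis. The names in this data frame are hypothetical and chosen only so that some of them contain a parenthesis.

```r
# Hypothetical data: two names contain a literal "("
df <- data.frame(
  Value = c(1, 2, 3),
  Name  = c("GEN(1)", "GENJJS", "REV(2)"),
  stringsAsFactors = FALSE
)

# grepl("(", df$Name) fails, because "(" opens a group in a regular expression.

# Option 1: treat the pattern as a literal string
kept_fixed <- df[!grepl("(", df$Name, fixed = TRUE), ]

# Option 2: keep regular expressions, but escape the parenthesis
kept_escaped <- df[!grepl("\\(", df$Name), ]

kept_fixed
```

Both options keep only the row whose name has no parenthesis. `fixed = TRUE` is the simpler choice when no regex features are needed.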
31

You could use dplyr::filter() and negate a grepl() match:

library(dplyr)

df %>% 
  filter(!grepl('REVERSE', Name))

Or with dplyr::filter() and negating a stringr::str_detect() match:

library(stringr)

df %>% 
  filter(!str_detect(Name, 'REVERSE'))
sbha
  • 2
    This question asks for many strings. So what happens if you want to remove multiple strings, i.e. `remove.list <- c("REVERSE", "FOO", "BAR", "JJ")`? – micstr Aug 14 '18 at 14:26
  • 1
    Sure, you can build the pattern like this: `remove.list <- paste(c("REVERSE", "FOO", "BAR", "JJ"), collapse = '|')` and then filter with either `df %>% filter(!grepl(remove.list, Name))` or `df %>% filter(!str_detect(Name, remove.list))` – sbha Aug 14 '18 at 14:42
  • How can I drop rows based on strings present in two columns? I tried `filter(!grepl('REVERSE', Name & 'BAR', Name))` and got the following error: `! operations are possible only for numeric, logical or complex types` – CelloRibeiro Jul 04 '22 at 15:00
21

Actually I would use:

df[ grep("REVERSE", df$Name, invert = TRUE) , ]

This will avoid deleting all of the records if the desired search word is not contained in any of the rows.
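A small sketch of what `invert = TRUE` returns, using the data from the question. "NOMATCH" is a hypothetical pattern assumed to occur in no row.

```r
# Sample data from the question
df <- data.frame(
  Value = c(55, 22, 33, 44),
  Name  = c("REVERSE223", "GENJJS", "REVERSE456", "GENJKI"),
  stringsAsFactors = FALSE
)

# With invert = TRUE, grep() returns the positions that do NOT match
keep_some <- grep("REVERSE", df$Name, invert = TRUE)

# With a pattern that matches nothing, every position is returned,
# so indexing with it keeps the whole data frame
keep_all <- grep("NOMATCH", df$Name, invert = TRUE)
result   <- df[keep_all, ]

keep_some
nrow(result)
```

Because the index is a list of rows to keep rather than rows to drop, an empty match cannot empty the data frame.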

BobD59
6

You can use the stri_detect_fixed function from the stringi package:

stri_detect_fixed(c("REVERSE223","GENJJS"),"REVERSE")
[1]  TRUE FALSE
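The call above only shows detection. As a sketch of how it might be applied to the data frame from the question (assuming the stringi package is installed), the logical vector can be negated and used as a row index:

```r
library(stringi)

# Sample data from the question
df <- data.frame(
  Value = c(55, 22, 33, 44),
  Name  = c("REVERSE223", "GENJJS", "REVERSE456", "GENJKI"),
  stringsAsFactors = FALSE
)

# stri_detect_fixed() matches the pattern literally (no regular expressions);
# negating the result keeps the rows that do not contain "REVERSE"
filtered <- df[!stri_detect_fixed(df$Name, "REVERSE"), ]

filtered
```

Since the match is literal, characters such as "(" or "." in the search string need no escaping.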
bartektartanus
4

If you need to exclude multiple strings, separate them with | in the pattern:

df[!grepl("REVERSE|GENJJS", df$Name),]

3

You can filter the same data frame (df) using the previously provided code

df[!grepl("REVERSE", df$Name),]

or you might assign the result to a data frame with a different name using this code

df1<-df[!grepl("REVERSE", df$Name),]
Mohamed Rahouma
1

A late answer building on BobD59's and hidden-layer's responses.

This removes multiple specific strings, whilst avoiding deleting all of the records if the desired search word is not contained in any of the rows.

df1 <-
   df[grep("REVERSE|GENJJS", df$Name, invert = TRUE), ]
sbooth