I would like to delete the rows that contain an opening bracket "(" from my data frame.

I tried the following:

df[!grepl("(", df$Name),] 

But this does not find the ( character.

Sotos
nemja
  • The `(` is interpreted by grepl as part of the regex syntax and not as a literal character. Try escaping the opening bracket, `\\(`, and see if that works. You can find more details here: https://stackoverflow.com/questions/27721008/how-do-i-deal-with-special-characters-like-in-my-regex – Deena Aug 11 '17 at 12:01
  • Like this? `df[!grepl(\\(, df$Name),]` – nemja Aug 11 '17 at 12:02

1 Answer

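In a regular expression, ( opens a group, so a literal bracket has to be escaped as \(. Inside an R string the backslash itself must be escaped as well, which is why you have to double-escape: "\\(".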

x <- c("asdf", "asdf", "df", "(as")

x[!grepl("\\(", x)]
# [1] "asdf" "asdf" "df"  

Just apply this to your df like df[!grepl("\\(", df$Name), ]
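
To make the data frame case concrete, here is a minimal sketch. The data frame contents are made up for illustration; only the column name `Name` comes from the question.

```r
# Hypothetical data frame; only the column name `Name` is taken from the question.
df <- data.frame(
  Name  = c("alpha", "beta (old)", "gamma", "(delta"),
  Value = 1:4,
  stringsAsFactors = FALSE
)

# Keep only the rows whose Name does not contain a literal "(".
df_clean <- df[!grepl("\\(", df$Name), ]

df_clean$Name
# [1] "alpha" "gamma"
```

The trailing comma inside the square brackets matters: it selects rows and keeps all columns.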

You could also drop every element that contains any punctuation character by matching the POSIX class [[:punct:]]:

x[!grepl("[[:punct:]]", x)]
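
Be aware that this is a much broader filter than the bracket check. A small sketch with a made-up vector `y` (not from the question) shows that hyphens, dots and underscores are caught as well:

```r
# Hypothetical input: only "asdf" is free of punctuation.
y <- c("asdf", "(as", "a-b", "x.y", "a_b")

# Elements containing any punctuation character are dropped, not just "(".
no_punct <- y[!grepl("[[:punct:]]", y)]

no_punct
# [1] "asdf"
```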

As pointed out by @CSquare in the comments, here is a great summary about special characters in R regex


Additional input from the comments:
@Sotos: With pattern = '(' and fixed = TRUE the pattern is matched as a literal string, so no escaping is needed and the regex engine is bypassed, which is more efficient:

x[!grepl('(', x, fixed = TRUE)]
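
As a quick sanity check, the following sketch compares the escaped-regex form, the `fixed = TRUE` form, and the `grep(..., invert = TRUE, value = TRUE)` variant suggested in the comments. The variable names `by_regex`, `by_fixed` and `by_grep` are only illustrative.

```r
x <- c("asdf", "asdf", "df", "(as")

# Three ways to drop the elements that contain a literal "(".
by_regex <- x[!grepl("\\(", x)]                                       # escaped regex
by_fixed <- x[!grepl("(", x, fixed = TRUE)]                           # literal match
by_grep  <- grep("(", x, fixed = TRUE, invert = TRUE, value = TRUE)   # grep variant

# All three should agree on this input.
identical(by_regex, by_fixed)
identical(by_fixed, by_grep)
```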
loki
  • I would suggest using `fixed = TRUE` instead of escaping the parenthesis, i.e. `x[!grepl('(', x, fixed = TRUE)]`, which is more efficient because it bypasses the regex engine. – Sotos Aug 11 '17 at 12:19
  • `grep("(", x, fixed=TRUE, invert=TRUE, value=TRUE)` – jogo Aug 11 '17 at 12:24
  • Thanks for the hints, added them to the answer. – loki Aug 11 '17 at 13:11