In an R script, I need to build a regex from strings that may contain special characters. So I should first escape those strings and then use them in the regex object.

pattern <- regex(paste('\\W', str, '\\W', sep = ''));

In this example, str should be matched literally. So I need a function that returns the escaped form of its input, for example 'c++' -> 'c\\+\\+'.

frogatto
  • sounds interesting, what did you come up with? – rawr Dec 11 '15 at 23:44
  • Would this not be solved by setting the `fixed` argument to `TRUE` in many of the base R regex functions? E.g. `sub("c++", "REPLACED", "c++", fixed = TRUE) == sub("c\\+\\+", "REPLACED", "c++")`. – nrussell Dec 12 '15 at 01:28

1 Answer

I think you have to escape only 12 characters (\ ^ $ . | ? * + ( ) [ and {), so a regular expression that alternates over them should do the trick, for example:

> gsub('(\\\\|\\^|\\$|\\.|\\||\\?|\\*|\\+|\\(|\\)|\\[|\\{)', '\\\\\\1', 'C++')
[1] "C\\+\\+"

Or you could build that regular expression from the list of special chars if you do not like the plethora of manual backslashes above -- such as:

> paste0('(', paste0('\\', strsplit('\\^$.|?*+()[{', '')[[1]], collapse = '|'), ')')
[1] "(\\\\|\\^|\\$|\\.|\\||\\?|\\*|\\+|\\(|\\)|\\[|\\{)"
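Putting the two snippets together, a minimal sketch of the function the question asks for could look like this. The name escape_regex is hypothetical (it is not a base R function), and the list of special characters is simply the 12 used above, assuming R's default regex engine:

```r
# Hypothetical helper, not part of base R: escape the 12 special
# characters listed above so the input is matched literally.
escape_regex <- function(x) {
  # Split the list of special characters into single characters
  specials <- strsplit('\\^$.|?*+()[{', '')[[1]]
  # Build the alternation "(\\\\|\\^|\\$|...)" programmatically
  pattern <- paste0('(', paste0('\\', specials, collapse = '|'), ')')
  # Prefix every match with a backslash
  gsub(pattern, '\\\\\\1', x)
}

escape_regex('c++')
# [1] "c\\+\\+"

# Usage in the spirit of the question: look for 'c++' between non-word characters
grepl(paste0('\\W', escape_regex('c++'), '\\W'), 'I like c++ a lot')
# [1] TRUE
```

If the string only needs to be matched on its own, rather than embedded in a larger pattern, fixed = TRUE (as suggested in the comments) avoids the escaping altogether.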
daroczig