
Escape characters cause a lot of trouble in R, as evidenced by previous questions:

  1. Change the values in a column
  2. Can R paste() output "\"?
  3. Replacing escaped double quotes by double quotes in R
  4. How to gsub('%', '\%', ... in R?

Many of these previous questions could be simplified to special cases of "How can I get \ out of my way?"

Is there a simple way to do this?

For example, I can find no arguments to gsub that will remove all escapes from the following:

 test <- c('\01', '\\001')
David LeBauer
  • `\0` is the nul string. R hasn't allowed those in strings for a few versions. Which version of R are you using? – Joshua Ulrich Apr 09 '12 at 16:46
  • @Josh: 2.14. Is there no way to extract "0" from "\0"? – David LeBauer Apr 09 '12 at 16:48
  • Just to be clear: you want to remove all instances of "\?" where "\" is taken literally and "?" means any single character? Or do you just want to strip all instances of "\" except when it's "\\"? I'd still go with the regex construction ` [\\]{1,}` . Edit: heck, even the parser for edits here messes up escapes :-( – Carl Witthoft Apr 09 '12 at 16:58
  • @CarlWitthoft I'd like to strip all of the instances of escapes, even with "\\". regex is fine but I am not proficient. – David LeBauer Apr 09 '12 at 17:00
  • No, because `\01` isn't what you think it is; try `cat(test,'\n',sep=" "); print(test)`. You're confusing the actual string (the output from `cat`) with the printed representation of the string (the output from `print`). – Joshua Ulrich Apr 09 '12 at 17:03

2 Answers


The difficulty here is that "\1", although it's printed with two glyphs, is actually, in R's view, a single character. In fact, it's the very same character as "\001" and "\01":

nchar("\1")
# [1] 1
nchar("\001")
# [1] 1
identical("\1", "\001")
# [1] TRUE

So, you can in general remove all backslashes with something like this:

(test <- c("\\hi\\", "\n", "\t", "\\1", "\1", "\01", "\001"))
# [1] "\\hi\\" "\n"     "\t"      "\\1"    "\001"   "\001"   "\001"  
eval(parse(text=gsub("\\", "", deparse(test), fixed=TRUE)))
# [1] "hi"  "n"   "t"   "1"   "001" "001" "001"

But, as you can see, "\1", "\01", and "\001" will all be rendered as 001 (since to R they are all just different names for "\001").
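
If evaluating parsed text feels heavy-handed, the same idea can probably be sketched with base R's `encodeString()`, which returns each string in the escaped form that `print()` displays. This is only a minimal sketch and not the method used above; the name `stripped` is made up for illustration.

```r
# Same test vector as above: "\1", "\01" and "\001" are all the one character \001
test <- c("\\hi\\", "\n", "\t", "\\1", "\1", "\01", "\001")

# encodeString() turns every escape into ordinary text containing a literal
# backslash, so a fixed-string gsub() can then delete those backslashes
stripped <- gsub("\\", "", encodeString(test), fixed = TRUE)
print(stripped)
```

This should give the same vector as the `eval(parse(...))` version, with the same caveat that the three spellings of "\001" cannot be told apart.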


EDIT: For more on the use of "\" in escape sequences, and on the great variety of characters that can be represented using them (including the disallowed nul string mentioned by Joshua Ulrich in a comment above), see this section of the R language definition.
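
As a small illustration of that variety, here is a sketch showing that octal, hexadecimal and Unicode escapes are just different spellings of one character, and that a literal backslash behaves differently. The name `a_forms` is hypothetical and chosen only for this example.

```r
# Octal, hexadecimal and Unicode escapes that all denote the letter "A" (code 65)
a_forms <- c("\101", "\x41", "\u0041")
print(a_forms == "A")
print(utf8ToInt("\101"))

# An escape sequence is one character; a literal backslash adds a character
print(nchar("\t"))   # tab: one character
print(nchar("\\t"))  # backslash followed by "t": two characters
```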

Josh O'Brien

I just faced the same issue. I don't know how to handle the general case of any \x, where x is a character (I wish I did), but to remove one specific escape sequence, say \n, you can do:

new <- gsub("\n", "", old, fixed = TRUE)

In my case, I only had \n.
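
For the more general case, one possible approach (a sketch, assuming the unwanted escapes are all control characters such as newline, tab or "\001") is the POSIX character class `[[:cntrl:]]`. The vector `old` below is made-up sample data.

```r
# Hypothetical sample data containing a newline, a tab and the \001 character
old <- c("line one\nline two", "tab\there", "bell\001")

# [[:cntrl:]] matches control characters, so each one is deleted outright
new <- gsub("[[:cntrl:]]", "", old)
print(new)
```

Note that this deletes the whole character rather than leaving the "n" or "t" behind, and it does not touch literal backslashes.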

user1617979