
I'm doing some batch string cleanup and a lot of the entries look like this:

"ABC\Company Co."

Which causes weird errors, and I can't seem to remove the backslash.

For example, try entering this into your console:

gsub("BLAH", "", "BLAH\WHAT")

and you get:

Error: '\W' is an unrecognized escape in character string starting ""BLAH\W"

I know that it's thinking \W is a command. I'm actually surprised that gsub is 'interpreting' x, since x is just the string I want to sub out. I don't get why gsub cares what's actually in x, just that it should replace "BLAH" with "" within "BLAH\WHAT".

The obvious solution would be to remove the \ from the string ahead of time.

gsub("\\", "", "BLAH\WHAT")

But then you get the exact same error message!

Thoughts? Thanks!

Zini
wizard_draziw
  • It's not `gsub`. `x <- "BLAH\WHAT"` won't work either. – GSee Oct 06 '14 at 21:17
  • You need to escape it because when specifying a string "\" is the escape character. So your last argument should be "BLAH\\WHAT" – konvas Oct 06 '14 at 21:17
  • No. It's thinking "\W" is ctrl-W which is not an accepted character. See `?Quotes`. This would do what you expected: `gsub("BLAH", "", "BLAH\\WHAT") [1] "\\WHAT"` – IRTFM Oct 06 '14 at 21:17
  • I'm receiving the data from an outside source. So unfortunately what you're suggesting to do is exactly what I'm asking HOW to do. The string is already "BLAH\WHAT", and I need to convert it programmatically to either "BLAH\\WHAT" or just "BLAHWHAT" – wizard_draziw Oct 06 '14 at 21:20
  • If you made a reproducible example this would be easier. You would find that a file containing \W would look like it was read in as "\\W" – IRTFM Oct 06 '14 at 21:22
  • Could you clean up the data outside of R? – blakeoft Oct 06 '14 at 21:23
  • The above scenario is exactly the same as what I'm dealing with in R Studio, so I don't know how I can make it more reproducible? If I can make gsub remove the backslash to a list of strings (or even a single string) then I'm golden. – wizard_draziw Oct 06 '14 at 21:23
  • No I cannot modify the outside source, it's from our core database – wizard_draziw Oct 06 '14 at 21:23
  • Post what the file looks like when viewed in a text editor. Then post the code you used to read it in. How could that be any simpler? – IRTFM Oct 06 '14 at 21:24
  • When I try to read in a file that has a string with a backslash, R converts the backslash to a double backslash. So `ABC\Company Co.` becomes `ABC\\Company Co.` – blakeoft Oct 06 '14 at 21:30
  • Well, RODBC must be keeping it. Anyway, how I'm getting the data is irrelevant. The point is that the string managed to keep a backslash, so how do I get rid of it? – wizard_draziw Oct 06 '14 at 21:46
  • I'm not reading from a CSV; it's from an Access database and I can't post that – wizard_draziw Oct 06 '14 at 21:47
  • I hate to say this, but it might be a good idea to post a small screenshot (!!). It's hard to see how it's possible even to generate a string that R would print as `"ABC\Company Co."`. Not saying it's impossible, but it seems *extremely* odd ... – Ben Bolker Oct 06 '14 at 21:51
  • It sounds like you're able to get the data into R in some capacity. Does `x` contain the string and if so, can you post the output of `dput(x)` – GSee Oct 06 '14 at 22:11
  • Sorry for the delay. I did a dput on the string and it turns out it *is* as others have mentioned: there really are two backslashes, so really the string is like "BLAH\\WHAT" – wizard_draziw Oct 07 '14 at 00:54
  • That being said, if I go: gsub("\", "", "BLAH\\WHAT"), or "\\" or even "\\\" it still doesn't let me get rid of the backslash – wizard_draziw Oct 07 '14 at 00:54

1 Answer


Use

gsub("\\\\", "", "BLAH\\WHAT")

which gives

[1] "BLAHWHAT"

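There are two levels of escaping here. In an R string literal, a backslash must itself be escaped with a \, so "BLAH\\WHAT" actually contains a single backslash character; the doubled form is only how R writes and prints it. The pattern is also a regular expression, where a backslash again needs escaping. Thus the literal "\\\\" is the two-character string \\, which the regex engine reads as one literal backslash, and that matches the one inside "BLAH\\WHAT".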
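As a rough sketch of the point above, the following checks what the string really contains and shows two ways of removing the backslash. The variable names (`x`, `via_regex`, `via_fixed`) are made up for illustration, and `fixed = TRUE` is an alternative to the regex pattern rather than something the original answer used.

```r
# Hypothetical input: one literal backslash between BLAH and WHAT.
x <- "BLAH\\WHAT"

# The two typed backslashes are a single character in the string.
n_chars <- nchar(x)
print(n_chars)

# writeLines() shows the raw contents, without R's print-time escaping.
writeLines(x)

# Regex pattern: the four typed backslashes become the regex \\,
# which matches one literal backslash.
via_regex <- gsub("\\\\", "", x)

# fixed = TRUE turns off regex interpretation, so only the
# string-literal level of escaping is needed in the pattern.
via_fixed <- gsub("\\", "", x, fixed = TRUE)

print(via_regex)
print(via_fixed)
```

If the data arrives in a character vector (for example a column read from a database), the same `gsub` calls apply to the whole vector at once.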

See these related questions:

How to escape a backslash in R?

How to escape backslashes in R string

Community
Alex