How can I remove all backslashes from this string?

t1 <- "1\2\3\4\5"

Output:

"1\002\003\004\005"

Desired output:

"1002003004005"

Thank you!

Pascal Schmidt
  • 2
    Those are not backslashes, those are unicode letters. – r2evans May 04 '20 at 02:33
  • 1
    @Pascal you can verify this with `writeLines(t1)`. I can't show the output here because those characters are non-printable. Try it yourself. – nikn8 May 04 '20 at 02:36
  • 2
    Really, getting that output from that input is not easy or advisable on a few levels. You could perhaps start with `sub("^[^]]*\\]\\s*", "", capture.output(charToRaw(t1)))`, which yields `"31 02 03 04 05"`, but that's obviously flawed due to the `31` (which is the raw bit encoding for the `"1"` character). It's easy enough string-wise to convert spaces and such, but ... to me, it sounds like there might be wrong assumptions about that data if you want `"1002"` from `"1\2"`. – r2evans May 04 '20 at 02:47

2 Answers

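Escape the non-printable characters with stringi (U+0002 becomes `\u0002`, and so on), then remove the `\u0` prefix from each escape: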

stringr::str_remove_all(stringi::stri_escape_unicode(t1), "\\\\u0")

which gives:

[1] "1002003004005"
nikn8

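This one is tricky, because the string doesn't actually contain any backslashes. R reads `\2` as an octal escape, so t1 is the character "1" followed by four control characters (codes 2 through 5); "1\002\003\004\005" is just how R prints them. To see this: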

> writeLines(t1)
1
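
A quick check in base R makes this concrete. This is only a sketch for inspection; the variable names `n_chars`, `code_points`, and `has_backslash` are illustrative and not part of the original answer.

```r
t1 <- "1\2\3\4\5"

# Number of characters actually stored in the string
n_chars <- nchar(t1)
n_chars
# [1] 5

# Integer code point of each character: "1" is 49, the rest are control characters
code_points <- utf8ToInt(t1)
code_points
# [1] 49  2  3  4  5

# Search for a literal backslash (fixed = TRUE disables regex interpretation)
has_backslash <- grepl("\\", t1, fixed = TRUE)
has_backslash
# [1] FALSE
```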

However, we can first deparse it, which gives us that printed representation as a string containing literal backslashes.

t2 <- deparse(t1)
> t2
[1] "\"1\\002\\003\\004\\005\""

Then use gsub to remove the backslashes, and the quotes that deparse added as a side effect.

t3 = gsub('\\', '', t2, fixed = TRUE)
t3 = gsub('\"', '', t3)

More concisely, a single character class removes both at once. Inside [...] parentheses are literal characters, so none are needed for grouping:

t3 = gsub('[\\\\"]', '', t2)
> t3
[1] "1002003004005"

Edit: As a one-liner:

gsub('[\\\\"]', '', deparse(t1))
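
As an alternative that avoids deparse, a minimal base-R sketch can build the same result from the code points directly. It assumes that every character with a code below 32 should be rendered as three octal digits, which mirrors how R prints the characters in this particular string; the variable names are illustrative.

```r
t1 <- "1\2\3\4\5"

# Integer code point of each character
codes <- utf8ToInt(t1)

# One string per character, so the vector lines up with `codes`
chars <- intToUtf8(codes, multiple = TRUE)

# Assumption: control characters (code < 32) become 3-digit octal,
# all other characters are kept as they are
pieces <- ifelse(codes < 32, sprintf("%03o", codes), chars)

result <- paste(pieces, collapse = "")
result
# [1] "1002003004005"
```

Unlike the gsub approach, this leaves any real backslashes or quotes in the input untouched, since it never goes through the deparsed representation.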

For more details on escaping special characters in gsub patterns, see:

How do I deal with special characters like \^$.?*|+()[{ in my regex?

https://rstudio.com/wp-content/uploads/2016/09/RegExCheatsheet.pdf

rj-nirbhay
Branson Fox