
I need to apply regex replacement to a string in order to escape certain characters of a UNIX file path. In the original string I want to match a range of characters, and then replace it with the same character prepended with a backslash.

I'm trying to do this in the following way:

re.sub(r'([ \'\[\]])', r'\\\1', "./file to'escape.txt")

which, according to regex rules, should return ./file\ to\'escape.txt, but instead it returns ./file\\ to\\'escape.txt

Other variants of the replacement string that I've tried don't work either:

  • r'\\1' -> ./file\\1to\\1escape.txt
  • '\\\1' -> ./file\\\x01to\\\x01escape.txt
  • '\\\\1' -> ./file\\1to\\1escape.txt
  • '\\\\\1' -> ./file\\\x01to\\\x01escape.txt

Is it possible at all to have an escaped backslash followed by a special sequence (such as a backreference) in a Python regex replacement string?
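A minimal sketch of the behaviour in question, assuming the same pattern and input as above; the variable names (`path`, `pattern`, `escaped`) are illustrative and not part of the original post. It suggests that `r'\\\1'` already inserts a single backslash, and that the doubled backslashes only show up in the `repr()` form that the interactive prompt echoes.

```python
import re

path = "./file to'escape.txt"
pattern = r"([ '\[\]])"  # same character class as in the question

# r"\\\1" is read by the regex engine as: literal backslash, then group 1
escaped = re.sub(pattern, r"\\\1", path)

print(repr(escaped))  # "./file\\ to\\'escape.txt"  (representation)
print(escaped)        # ./file\ to\'escape.txt      (actual content)

# Two characters matched (the space and the apostrophe),
# so the result is exactly two characters longer.
assert len(escaped) == len(path) + 2
assert escaped.count("\\") == 2

# Equivalent spellings that avoid any ambiguity around "\1":
alt = re.sub(pattern, r"\\\g<1>", path)
via_function = re.sub(pattern, lambda m: "\\" + m.group(1), path)
assert alt == escaped
assert via_function == escaped
```

The `\g<1>` form and the function form are offered only as alternatives that some readers find easier to follow; the original replacement string produces the same result.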

BartoNaz
  • It [doesn't](https://repl.it/Crrx) – Thomas Ayoub Aug 25 '16 at 18:24
  • See http://ideone.com/pZoXIB, your code works. – Wiktor Stribiżew Aug 25 '16 at 18:24
  • Since it's a raw string literal, the engine gets \\ plus \1. Then the engine interpolates escapes to form the formatted output. It's better to use `$n` for capture buffer vars if python supports that. Try the dollar method \\$1 and see what you get. What version python is it you're using? –  Aug 25 '16 at 18:33
  • @WiktorStribiżew, it works because of the print command, which escapes the double backslashes once again. But the string returned by the regex itself is as I stated in the question. – BartoNaz Aug 25 '16 at 18:36
  • @sln, I'm using python 2.7.10, and `\\$1` inserts it literally, with the dollar sign. – BartoNaz Aug 25 '16 at 18:38
  • Ok, I see from **[here](http://stackoverflow.com/questions/3519487/python-regex-re-sub-replacing-multiple-parts-of-pattern)** Python is antiquated and uses backreference notation to specify a capture group in a replacement string. Pretty strange, when they discovered it overlaps with the octal construct, everybody else switched to `$` notation (which has its own problems as well). I'd switch to using the newer _regex_ module and come into the modern world. –  Aug 25 '16 at 18:52
  • You're confusing string content and representation. The content will be `./file\ to\'escape.txt`, while the representation is `"./file\\ to\\'escape.txt"`. `print` doesn't escape anything. – mata Aug 25 '16 at 18:54
  • @mata, ok, but then I need to use this string in the `with open(replaced_string) as input_file:` call and apparently it uses content, and doesn't find the file. – BartoNaz Aug 25 '16 at 19:02
  • So your file name has literal backslashes in it? That seems odd. Where do you get that name from? Are you sure your file isn't really named `./file to'escape.txt`? – mata Aug 25 '16 at 19:15
  • @mata, you're right. I thought that I have to provide an escaped path like in Terminal, but this is not the case. Thanks! – BartoNaz Aug 25 '16 at 19:21
  • I believe this will help you: http://stackoverflow.com/questions/6866696/string-replace-with-backslashes-in-python http://stackoverflow.com/questions/5186839/python-replace-with – Mateus Padua Mar 30 '17 at 19:14
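The conclusion reached in the comments can be checked with a short sketch: `open()` takes the file name as-is, so no backslash escaping is needed, and the escaped name refers to a different (here nonexistent) file. The temporary directory and the variable names are assumptions made for illustration only.

```python
import os
import re
import shutil
import tempfile

tmp_dir = tempfile.mkdtemp()  # scratch directory, illustrative only
file_name = "file to'escape.txt"
plain_path = os.path.join(tmp_dir, file_name)

# Create the file using the plain, unescaped name.
with open(plain_path, "w") as out_file:
    out_file.write("hello")

# Opening it again needs no escaping: open() is not a shell.
with open(plain_path) as input_file:
    content = input_file.read()

# The shell-style escaped name contains literal backslashes,
# so it names a different file, which does not exist here.
escaped_name = re.sub(r"([ '\[\]])", r"\\\1", file_name)
escaped_exists = os.path.exists(os.path.join(tmp_dir, escaped_name))

shutil.rmtree(tmp_dir)  # clean up the scratch directory
```

Escaping of this kind only matters when the path is passed through a shell command line; for that case the standard library offers `shlex.quote` (`pipes.quote` on Python 2.7), which may be a safer choice than a hand-written regex.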

0 Answers