I am trying to write a regex in Python that replaces escape characters such as \n and \t with \\n, \\t, and so on.

I have tried this to escape just newlines and tabs:

re.sub(r'\t',r'\\t',re.sub(r'\n',r'\\n',text))

For example:

>>> print(re.sub(r'\t',r'\\t',re.sub(r'\n',r'\\n','ads;lfkjaldsf\ndsklajflad\tkjhklajf\n')))
ads;lfkjaldsf\ndsklajflad\tkjhklajf\n

Suppose I have text such as "\a\b\c\d\n\g\h\t"; it should not add double backslashes to the non-escape characters.

So I don't need to double every backslash, only the backslashes that belong to special escape characters.

Any help is appreciated.

Akshay Hazari
  • _"Will doing simply this work?"_ Did it work when you tried it? – Kevin Jan 07 '16 at 12:34
  • You can list all of them like in this answer: http://stackoverflow.com/a/18935765/650405 – Karoly Horvath Jan 07 '16 at 12:51
  • You cannot "replace the escape characters" in the text because it **does not contain them**; escape sequences are **only** relevant to *literal strings in the source code*. By the time you call `re.sub` and pass it a text like `'\n'`, it is too late to look for backslashes followed by lowercase n - they aren't there to be found. Either you want to replace **the newline** with an actual, **single** backslash followed by n; or you want to have that actual text in the first place. – Karl Knechtel Aug 08 '22 at 03:10
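
The distinction drawn in the last comment can be checked directly. This is a minimal sketch; the variable names `newline`, `raw`, and `escaped` are illustrative only and do not come from the question.

```python
# A normal string literal: the escape sequence is resolved by the parser,
# so the string holds one character, an actual line feed.
newline = '\n'

# A raw string literal: nothing is resolved, so the string holds two
# characters, a backslash followed by the letter n.
raw = r'\n'

print(len(newline), len(raw))   # 1 2
print('\\' in newline)          # False: there is no backslash to find

# Turning the real newline into the two-character text is a plain replacement.
escaped = newline.replace('\n', '\\n')
print(escaped == raw)           # True
```

So a pattern that looks for a backslash followed by `n` finds nothing in `newline`; the replacement has to target the line feed character itself.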

1 Answer

I found re.escape, as pointed out by Karoly Horvath. This is how it works:

>>> re.escape('ads;lfkjaldsf\ndsklajflad\tkjhklajf\n')
'ads\\;lfkjaldsf\\\ndsklajflad\\\tkjhklajf\\\n'

Update:

However, re.escape escapes far too much: spaces, semicolons, and many other characters that don't need to be escaped in my case. Replacing only the characters I care about works better:

>>> re.sub(r'(\n|\t|\"|\')',lambda m:{'\n':'\\n','\t':'\\t','\'':'\\\'','\"':'\\\"'}[m.group()], "hello hi  \n \'GM\' \t TC  \n \"Bye\" \t")
'hello hi  \\n \\\'GM\\\' \\t TC  \\n \\"Bye\\" \\t'

This is what I came up with, and it did what I needed.
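
The one-liner above could be kept easier to extend by building the pattern from the lookup table, so the two cannot drift apart. The following is a sketch under that assumption; `ESCAPES`, `PATTERN`, and `escape_specials` are hypothetical names, and the `'\r'` entry is an illustrative addition that was not in the original call.

```python
import re

# Hypothetical lookup table: each special character maps to the visible
# text that should replace it (a single backslash plus a letter or quote).
ESCAPES = {
    '\n': '\\n',
    '\t': '\\t',
    '\r': '\\r',   # illustrative extra entry
    "'": "\\'",
    '"': '\\"',
}

# Build the alternation from the table keys. re.escape is used here for its
# intended purpose: making each key safe to embed in a regex pattern.
PATTERN = re.compile('|'.join(re.escape(ch) for ch in ESCAPES))

def escape_specials(text):
    """Replace only the characters listed in ESCAPES; leave all others alone."""
    return PATTERN.sub(lambda m: ESCAPES[m.group()], text)

print(escape_specials("hello hi  \n 'GM' \t"))
```

Backslashes already present in the text are not keys of the table, so they pass through unchanged.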

Akshay Hazari
  • This is the **wrong tool for the job**. `re.escape` is used for escaping **the regex text**, according to *the rules that the regex engine will use to interpret the string*. What you actually want is either: 1) create a **search text** that actually contains the backslashes etc. in the first place; or 2) replace **a newline** in the text with a **single** backslash followed by `n`, etc. I have added duplicate links for both of these possibilities, since it is not clear what you actually want the code to do and you seem to be generally confused about the input. – Karl Knechtel Aug 08 '22 at 03:12
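
Option 2 from the comment above does not need a regex at all. As a rough sketch, assuming only newlines and tabs matter, `str.translate` can do the replacement; `TABLE` and `show_controls` are hypothetical names chosen for this example.

```python
# Map each control character to a single backslash followed by a letter.
# str.maketrans accepts a dict whose keys are single characters and whose
# values may be strings of any length.
TABLE = str.maketrans({'\n': '\\n', '\t': '\\t'})

def show_controls(text):
    """Return text with newlines and tabs shown as \\n and \\t."""
    return text.translate(TABLE)

print(show_controls('ads;lfkjaldsf\ndsklajflad\tkjhklajf\n'))
```

Characters that are not in the table, including any backslashes already in the text, are left as they are.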