This explanation is from the Python documentation:

Both string and bytes literals may optionally be prefixed with a letter 'r' or 'R'; such strings are called raw strings and treat backslashes as literal characters. As a result, in string literals, '\U' and '\u' escapes in raw strings are not treated specially. Given that Python 2.x’s raw unicode literals behave differently than Python 3.x’s the 'ur' syntax is not supported.

If raw strings treat backslashes as literal characters, why does the backslash need to be escaped in the expression:

re.compile(r"'\\'")

Instead of just being able to write:

re.compile(r"'\'")

To capture a single backslash when using the re module?

Victor Brunell
  • Cannot duplicate. Provide a complete failure case. – Ignacio Vazquez-Abrams Jan 15 '16 at 22:06
  • Ask yourself this question: How do you differentiate between `\d` for capturing digits and `\d` as a backslash followed by `d`? – Iron Fist Jan 15 '16 at 22:06
  • First escape is for Python string literal representation (not needed when `r` prefix is used), second escape is for `re` module itself. – Łukasz Rogalski Jan 15 '16 at 22:08
  • The engine sees this `'\'` as a single quote followed by an escaped single quote. Since an escaped quote has no special meaning to the regex engine, it will match two single quotes. However, if you passed in a single escape by itself (or if it is at the end of the regex), the regex engine would throw an error, something like `Unterminated escape sequence`. That is because in regex land, a single escape cannot exist by itself. –  Jan 15 '16 at 22:14
  • Unfairly marked Duplicate as the regex question is quite different. In Python raw strings such as r"raw \ string", you cannot end the string in a single backslash, but embedded backslashes are OK unless followed by a quote. When followed by a quote, the backslash escapes the quote, but the backslash remains part of the string. Also, you cannot truly escape a backslash in a raw string: r'\\' is a string of length 2, whereas r'\' is a syntax error :-( This is really quite broken - Python's implementation of raw strings is braindead. – joeking Oct 30 '17 at 23:25
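
The points raised in the comments above can be checked directly. This is only a minimal sketch, assuming CPython 3 and the standard `re` module; the helper `is_valid_literal` and the variable names are made up for illustration and are not part of any library.

```python
import re

# Layer 1: the raw-string literal. Backslashes are kept as characters.
two_backslashes = r"\\"       # two characters: backslash, backslash
question_pattern = r"'\'"     # three characters: quote, backslash, quote


def is_valid_literal(source):
    """Hypothetical helper: True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Build the source text of the literals without tripping over the rule itself.
ends_in_one_backslash = "r'" + "\\" + "'"      # the source text r'\'
ends_in_two_backslashes = "r'" + "\\\\" + "'"  # the source text r'\\'

print(len(two_backslashes))                       # 2
print(len(question_pattern))                      # 3
print(is_valid_literal(ends_in_one_backslash))    # False
print(is_valid_literal(ends_in_two_backslashes))  # True

# Layer 2: the regex engine reads \' as an escaped (literal) quote,
# so the three-character pattern matches two consecutive single quotes.
print(re.fullmatch(question_pattern, "''") is not None)            # True
# It does not match a quoted backslash, which is what the question wanted.
print(re.search(question_pattern, question_pattern) is not None)   # False
```

So the raw prefix only controls what Python puts into the string; what the regex engine then does with a backslash in that string is a separate matter.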

1 Answer

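Because '\' has a special meaning in re: in the language you use to define a regular expression, it escapes the character after it. So if you want to match '+' as a literal character, your pattern will be '\+'. By the same rule, matching a literal backslash takes '\\' in the pattern, even when the pattern is written as a raw string.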
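
To make this concrete, here is a short sketch using only the standard `re` module. The sample text and the names `plus`, `quoted_backslash`, `same_pattern`, and `lone_backslash_error` are assumptions chosen for illustration.

```python
import re

# Sample text containing a plus sign and a backslash between single quotes.
text = r"a+b '\' c"

# In the regex language, \+ means "a literal plus sign".
plus = re.compile(r"\+")

# Likewise, \\ means "a literal backslash". The raw string delivers both
# backslashes to the regex engine unchanged.
quoted_backslash = re.compile(r"'\\'")

# Without the r prefix, each backslash must also be escaped for Python,
# giving four backslashes in the source for the same pattern.
same_pattern = re.compile("'\\\\'")


def lone_backslash_error():
    """Return the error message for a pattern that is one bare backslash."""
    try:
        re.compile("\\")  # a one-character pattern: a single backslash
    except re.error as exc:
        return str(exc)
    return None


print(plus.findall(text))                                # ['+']
print(len(quoted_backslash.search(text).group()))        # 3
print(quoted_backslash.pattern == same_pattern.pattern)  # True
print(lone_backslash_error() is not None)                # True
```

The raw string removes only the Python-level escaping; the regex-level escaping remains, which is why two backslashes are still needed in the pattern.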

m7mdbadawy