11

I am confused here: a raw string keeps every \ as a literal backslash (the repr shows it as \\), but when a \ appears at the very end of the string it raises an error.

>>> r'so\m\e \te\xt'
'so\\m\\e \\te\\xt'

>>> r'so\m\e \te\xt\'
SyntaxError: EOL while scanning string literal

Update:

This is now covered in Python FAQs as well: Why can’t raw strings (r-strings) end with a backslash?

Ashwini Chaudhary

4 Answers

10

You still need \ to escape ' or " in raw strings, since otherwise the Python interpreter wouldn't know where the string stops. In your example, you're escaping the closing '.

Otherwise:

r'it wouldn\'t be possible to store this string'
r'since it'd produce a syntax error without the escape'

Look at the syntax highlighting to see what I mean.
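A small sketch of the same point, comparing a raw and a non-raw literal (the variable names `raw` and `cooked` are only illustrative). In both cases the backslash stops the quote from ending the literal, but only the raw string keeps the backslash in its value:

```python
# The backslash before the quote keeps the literal open in both cases.
raw = r'it wouldn\'t'     # raw: the backslash stays in the result
cooked = 'it wouldn\'t'   # non-raw: the backslash is consumed as an escape

print(raw)                    # it wouldn\'t
print(cooked)                 # it wouldn't
print(len(raw), len(cooked))  # 12 11
```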

Eric
9

Raw strings can't end in single backslashes because of how the parser works (there is no actual escaping going on, though). The workaround is to add the backslash as a non-raw string literal afterwards:

>>> print(r'foo\')
  File "<stdin>", line 1
    print(r'foo\')
                 ^
SyntaxError: EOL while scanning string literal
>>> print(r'foo''\\')
foo\

Not pretty, but it works. You can add a plus sign to make it clearer what is happening, but it's not necessary:

>>> print(r'foo' + '\\')
foo\
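
For comparison, here is a short sketch that puts the two workarounds above next to a third option, which is an addition and not part of the answer: write the raw literal with a trailing space and slice it off. The names `a`, `b` and `c` are placeholders.

```python
# Three ways to build the 4-character string foo\ (one trailing backslash).
a = r'foo' '\\'     # adjacent literals are joined at compile time
b = r'foo' + '\\'   # explicit concatenation at run time
c = r'foo\ '[:-1]   # raw literal ending in a space, then drop the space

print(a, b, c)
print(a == b == c)  # True
```

All three yield the same value; which one reads best is a matter of taste.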
Lennart Regebro
5

Python strings are processed in two steps:

  1. First the tokenizer looks for the closing quote. It recognizes backslashes when it does this, but doesn't interpret them: it just looks for a sequence of string elements followed by the closing quote mark. A "string element" is either a single character that is not a backslash, a closing quote, or a newline (newlines are allowed inside triple-quoted strings), or a backslash followed by any single character.

  2. Then the contents of the string are interpreted (backslash escapes are processed) depending on what kind of string it is. The r flag before a string literal only affects this step.
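
Step 1 can be modelled roughly with a regular expression. This is only a sketch under simplifying assumptions, not CPython's actual tokenizer: it handles single-quoted, single-line literals only, ignores the r prefix (which does not affect this step), and does not allow a backslash followed by a newline. The names `STRING_BODY` and `finds_closing_quote` are made up for the example.

```python
import re

# Opening quote, any number of "string elements", closing quote.
# A string element is either one ordinary character (not a backslash,
# quote, or newline) or a backslash followed by any single character.
STRING_BODY = re.compile(r"""'(?:[^\\'\n]|\\.)*'""")

def finds_closing_quote(source):
    """Return True if the whole text scans as one complete literal."""
    return STRING_BODY.fullmatch(source) is not None

print(finds_closing_quote("'foo\\\\'"))  # source text 'foo\\' -> True
print(finds_closing_quote("'foo\\'"))    # source text 'foo\'  -> False
```

In the second case the backslash and the quote are consumed together as one string element, so the scanner runs off the end of the line without ever seeing a closing quote.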

Karl Knechtel
  • It seems the Python scanner stores the 'r' as a token, then goes on to scan the string using the *default* string processing rules, instead of rules where a backslash is treated as an ordinary character. This issue is discussed at http://stackoverflow.com/q/30283082/3259619. – Logic Knight May 17 '15 at 04:36
3

Quote from https://docs.python.org/3.4/reference/lexical_analysis.html#literals:

Even in a raw literal, quotes can be escaped with a backslash, but the backslash remains in the result; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw literal cannot end in a single backslash (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the literal, not as a line continuation.

So in a raw string, backslashes are not treated specially, except when they precede " or '. Therefore, neither r'\' nor r"\" is a valid string literal, because the closing quote is escaped and the literal is never terminated. In this case it makes no difference whether the r prefix is present: r'\' fails for the same reason as '\', and r"\" fails for the same reason as "\".
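
One way to check this without tripping over the escaping yourself is to hand the literal's source text to the built-in compile(). The helper name `is_valid_literal` and the `BS` constant below are illustrative only; the source texts are assembled with chr(92) (a backslash) so the example stays readable.

```python
def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

BS = chr(92)  # a single backslash character
candidates = [
    "r'" + BS + "'",       # r'\'
    "'" + BS + "'",        # '\'
    'r"' + BS + '""',      # r"\""
    "r'" + BS + BS + "'",  # r'\\'
]
for text in candidates:
    print(text, is_valid_literal(text))
```

The first two candidates are rejected and the last two are accepted, which matches the quoted rule about an odd number of trailing backslashes.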

laike9m