Python seems to be automatically converting strings (not just input) into raw strings. Can somebody explain what is happening here?

Python 3.7.1 (v3.7.1:260ec2c36a, Oct 20 2018, 14:57:15) [MSC v.1915 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> s = '\stest'
>>> s
'\\stest'
# looks like a raw string
>>> print(s)
\stest
>>> s = '\ntest'
>>> s
'\ntest'
# this one doesn't
>>> s = '\n test'
>>> s
'\n test'
>>> s = r'\n test'
>>> s
'\\n test'
>>> print(s)
\n test

The question marked as a duplicate for this one seems to be useful, but then I do not understand why

>>> s = '\n test'
>>> s
'\n test'
>>> repr(s)
"'\\n test'"

shows a single backslash when `s` is evaluated at the prompt, but two backslashes when repr() is called on it.

mariogarcc
  • From my understanding, Python seems to check whether the character following the backslash forms a meaningful escape sequence with it. If it does not, Python escapes the backslash; if it does, it leaves it be. – mariogarcc Dec 21 '18 at 00:05
  • The proposed duplicate didn't actually explain why the terminal output is different for `s` vs `repr(s)`, so reopened. – jpp Dec 21 '18 at 00:30

1 Answer


\n is a valid escape sequence, so '\n' is a length 1 string (the newline character). In contrast, \s is an invalid escape sequence, so Python assumes that what you wanted there was a two-character string: a backslash character plus an s character.

>>> len('\s')
2

What you saw in the terminal output was just the usual representation of such a length 2 string. Note that the correct way to write the string that Python gave you back here would have been r'\s' or '\\s'.

>>> r'\s' == '\\s' == '\s'
True
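As a rough illustration of the point above, the following sketch lists the code points actually stored in each string. The helper name `code_points` is made up for this example, and the invalid form '\s' is deliberately left out of the source so the snippet itself does not trigger the warning.

```python
def code_points(text):
    """Return the hex code point of every character in text."""
    return [hex(ord(ch)) for ch in text]

# '\n' is one character (newline); the other two spellings
# both produce a backslash followed by the letter s.
for literal in ('\n', '\\s', r'\s'):
    print(len(literal), code_points(literal))
# prints:
# 1 ['0xa']
# 2 ['0x5c', '0x73']
# 2 ['0x5c', '0x73']
```

Neither of the two-character strings is "raw" once created; `r'...'` only changes how the literal is read from source code.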

This behavior is deprecated (since Python 3.6). In a future version of Python, your code will eventually be a syntax error.

Since you're using v3.7.1, you could enable warnings if you want to be informed about such uses of deprecated features:

$ python -Wall
>>> '\s'
<stdin>:1: DeprecationWarning: invalid escape sequence \s
'\\s'
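The same check can probably be done from a script rather than with the -Wall flag. The sketch below compiles the source text of a literal and collects whatever the compiler reports. The helper name `invalid_escape_warnings` is hypothetical, and the warning category is an assumption that depends on the interpreter: DeprecationWarning on versions like 3.7, SyntaxWarning on newer ones, so the code does not rely on it.

```python
import warnings

def invalid_escape_warnings(literal_source):
    """Compile the source text of a literal and return any messages raised."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones that are hidden by default.
        warnings.simplefilter("always")
        try:
            compile(literal_source, "<example>", "eval")
        except SyntaxError as exc:
            # Assumed behavior of a future Python that rejects the literal outright.
            return [str(exc)]
    return [str(w.message) for w in caught]

# Raw strings keep the backslash so the compiler sees the literal as typed.
print(invalid_escape_warnings(r"'\s'"))  # one message about the escape sequence
print(invalid_escape_warnings(r"'\n'"))  # no messages
```

The exact wording of the message differs between Python versions, so it is safer to test for its presence than for its text.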

As for your subsequent question after the edit:

>>> s = '\n test'
>>> s  # this prints the repr(s)
'\n test'
>>> repr(s)  # this prints the repr(repr(s))
"'\\n test'"
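A minimal sketch of the same idea outside the interactive prompt, where nothing is echoed automatically. The variable names are only illustrative; the assumption is that the prompt displays `repr(value)` for whatever expression you type.

```python
s = '\n test'

shown_for_s = repr(s)            # what the prompt displays for `s`
shown_for_repr = repr(repr(s))   # what the prompt displays for `repr(s)`

print(len(s))                      # 6: newline, space, t, e, s, t
print(shown_for_s.count('\\'))     # 1: the newline is displayed as \n
print(shown_for_repr.count('\\'))  # 2: that one backslash is escaped again
print(shown_for_s)                 # '\n test'
print(shown_for_repr)              # "'\\n test'"
```

Each extra layer of repr() escapes the backslashes produced by the previous layer, while the original string `s` never contains a backslash at all.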
wim
  • That last addition was what made me understand the issue. So the deprecated behaviour is the introduction of a backslash followed by a "non-meaningful" character, like `\s`, and the correct way to try to introduce said strings from now on will be by escaping any backslash that is not followed by a "meaningful" character, like `n`, `t`, [etc.](https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals) – mariogarcc Dec 21 '18 at 01:12