I would like to turn this string:

a = '\\a'

into this one

b = '\a'

It doesn't seem like there is an obvious way to do this with replace?

To be more precise, I want to change the escaping of the backslash to escaping the character a.

john-hen
elelias
  • `\\ ` is just a way to put a backslash into the string. `\a` means that you are escaping `a`. To avoid that, you need to escape the backslash special meaning by putting a second backslash before it. Python prints it as `\\a`, but in reality it's just two characters: the backslash, and `a`. – Maciej Gol Nov 06 '16 at 18:30
  • try `print(a)`. – inspectorG4dget Nov 06 '16 at 18:31
  • Are you using Python 2 or Python 3? – PM 2Ring Nov 06 '16 at 18:38
  • @Maciej, I want to escape `a`, that's the point. I want to move from a string where the backslash is escaped to a string where "a" is escaped. – elelias Nov 06 '16 at 18:39
  • @PM2Ring: python 2, how would you do it in python3? – elelias Nov 06 '16 at 18:41
  • So just to get this perfectly clear, you want to convert the 2 char string `r'\a'` into the single char string `'\a'`. Is that correct? – PM 2Ring Nov 06 '16 at 18:43
  • My actual example does not contain 'a'; it's actually "\x2D", which prints as a minus sign. That's how it should be. The issue is that the string I get has an extra backslash added, so when printed it does not show the minus sign but "\x2D". I want to be able to transform one into the other. – elelias Nov 06 '16 at 18:57
  • There isn't such a thing as "escaping `a`". "Escaping a symbol" means putting a backslash in front of it *so that it will be treated as the actual symbol, instead of* some other special meaning. But `a` has no special meaning, it's just a letter. On the other hand, the sequence of a backslash followed by `a` **does** have a special meaning. – Karl Knechtel Jan 09 '23 at 05:25
  • Based on the comments, it seems that the goal is to go from a string **that actually contains** a backslash, a lowercase x, a digit 2, and an uppercase D (as if one had written `"\\x2d"` in code) to a string **that actually contains** a minus sign (**as if** one had written `"\x2d"` in code). I have closed it as a duplicate accordingly. – Karl Knechtel Jan 09 '23 at 05:27

3 Answers

The character '\a' is the ASCII BEL character, chr(7).
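
To make the difference between the two strings concrete, here is a small Python 3 sketch. The variable names `a` and `b` follow the question; nothing else is assumed.

```python
# a holds two characters: a backslash followed by the letter a.
a = '\\a'
# b holds one character: the BEL control character.
b = '\a'

print(len(a), len(b))   # 2 1
print(a == r'\a')       # True: a raw literal spells the same two characters
print(b == chr(7))      # True: '\a' is code point 7
```

So the task is not a textual replacement inside one string but an interpretation of the two-character sequence as an escape.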

To do the conversion in Python 2:

from __future__ import print_function
a = '\\a'
c = a.decode('string-escape')
print(repr(a), repr(c))

output

'\\a' '\x07'

And for future reference, in Python 3:

a = '\\a'
b = bytes(a, encoding='ascii')
c = b.decode('unicode-escape')
print(repr(a), repr(c))

This gives identical output to the above snippet.
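
As a sketch of the same Python 3 approach applied to the `\x2D` case described in the comments (the names `escaped` and `unescaped` are only illustrative):

```python
# Four characters: backslash, x, 2, D.
escaped = '\\x2D'

# Encode to ASCII bytes, then let the unicode-escape codec
# interpret the escape sequence.
unescaped = bytes(escaped, encoding='ascii').decode('unicode-escape')

print(repr(escaped), repr(unescaped))
print(unescaped)  # prints a minus sign
```

This assumes the input is pure ASCII, since `bytes(..., encoding='ascii')` raises `UnicodeEncodeError` otherwise.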

In Python 3, if you were working with bytes objects you'd do something like this:

a = b'\\a'
c = bytes(a.decode('unicode-escape'), 'ascii')
print(repr(a), repr(c))

output

b'\\a' b'\x07'

As Antti Haapala mentions, this simple strategy for Python 3 won't work if the source string contains non-ASCII Unicode characters too. In that case, please see his answer for a more robust solution.

PM 2Ring

On Python 2 you can use

>>> '\\a'.decode('string_escape')
'\x07'

Note how \a is repr'd as \x07.

If the string is a unicode string that also contains extended characters, you need to encode it to a bytestring explicitly first (for example as UTF-8); otherwise the default encoding (ascii!) is used to convert the unicode object to a bytestring, and that conversion fails on non-ASCII characters.


However, this codec doesn't exist in Python 3, and things get considerably more complicated. You can decode with the unicode_escape codec, but the result is garbled if the source string also contains non-ASCII characters:

>>> '\aäầ'.encode().decode('unicode_escape')
'\x07Ã¤áº§'

The resulting string doesn't consist of the original Unicode characters but of their UTF-8 bytes decoded as latin-1. The solution is to re-encode to latin-1 and then decode as UTF-8 again:

>>> '\\aäầ\u1234'.encode().decode('unicode_escape').encode('latin1').decode()
'\x07äầሴ'
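
The round trip above could be wrapped in a small helper, sketched here for Python 3. The function name `unescape` is hypothetical, and the sketch assumes the escape sequences in the input only produce ASCII characters.

```python
def unescape(text):
    """Interpret backslash escapes in text, keeping literal non-ASCII characters."""
    # UTF-8 bytes of the input, escapes still spelled out.
    raw = text.encode('utf-8')
    # unicode_escape interprets the escapes, but maps every other byte
    # to the latin-1 character with the same value.
    mangled = raw.decode('unicode_escape')
    # Undo the latin-1 mapping: recover the bytes, then decode as UTF-8.
    return mangled.encode('latin-1').decode('utf-8')


print(repr(unescape('\\aäầ\u1234')))
print(unescape('\\x2D'))
```

Under that assumption the helper is enough for the question's case. It is not a general solution: an escape that yields a character above U+00FF (such as a spelled-out `\u1234`) makes the `latin-1` step raise `UnicodeEncodeError`, and an escape such as `\xe4` yields a byte that is not valid UTF-8 on its own.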

Searching for "unescape string" is how I found this:

>>> a = r'\a'
>>> a.encode().decode('unicode-escape')
'\x07'
>>> '\a'
'\x07'

That's the way to do it with unicode. Since you're on Python 2 and may not be using unicode, you may actually want this one:

>>> a.decode('string-escape')
'\x07'
Trey Hunner