
I receive a string like this from a third-party service:

>>> s
'\\u0e4f\\u032f\\u0361\\u0e4f'

I know that this string actually contains literal sequences of a single backslash, a lowercase u, and four hex digits. How can I convert the string so that each '\\u0e4f' is replaced by '\u0e4f' (i.e. '๏'), and so on? The result for this example input should be '๏̯͡๏'.

Karl Knechtel
Soid

3 Answers


In Python 2.x:

>>> u'\\u0e4f\\u032f\\u0361\\u0e4f'.decode('unicode-escape')
u'\u0e4f\u032f\u0361\u0e4f'
>>> print u'\\u0e4f\\u032f\\u0361\\u0e4f'.decode('unicode-escape')
๏̯͡๏
Karl Knechtel
Ignacio Vazquez-Abrams
  • 1
    `AttributeError: 'str' object has no attribute 'decode'` – Kid_Learning_C May 14 '19 at 21:58
  • 1
    @Kid_Learning_C You are probably on Python 3 then; this is an old answer which applies to Python 2. See e.g. https://stackoverflow.com/questions/48908131/decodeunicode-escape-in-python-3-a-string – tripleee Jan 29 '20 at 18:11

The Python documentation has an interesting list of the encodings supported by the .encode() and .decode() methods. The Python-specific ones in the second table include unicode_escape.
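To make the role of the unicode_escape codec concrete, here is a minimal Python 3 sketch of a round trip through it. The variable names are only illustrative; the sample text is the one from the question.

```python
# Round trip through the unicode_escape codec (Python 3).
text = "\u0e4f\u032f\u0361\u0e4f"

# Encoding yields ASCII bytes that contain literal backslash-u sequences.
escaped = text.encode("unicode_escape")
print(escaped)  # b'\\u0e4f\\u032f\\u0361\\u0e4f'

# Decoding those bytes restores the original characters.
restored = escaped.decode("unicode_escape")
print(restored == text)  # True
```

The string received from the third-party service corresponds to the `escaped` form here, except that it arrives as a str rather than as bytes.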

Tuttle

In Python 3:

bytes("\\u0e4f\\u032f\\u0361\\u0e4f", "ascii").decode("unicode-escape")
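
The same conversion can be wrapped in a small helper, sketched below. The name `unescape_unicode` is hypothetical, and the sketch assumes the input is pure ASCII, as in the question.

```python
def unescape_unicode(s: str) -> str:
    """Replace literal backslash-u escapes in s with the characters they name."""
    # unicode_escape decodes bytes, so the str has to be encoded first.
    # "ascii" raises UnicodeEncodeError if s already holds non-ASCII text,
    # which is safer than silently producing mojibake.
    return s.encode("ascii").decode("unicode_escape")


s = "\\u0e4f\\u032f\\u0361\\u0e4f"
result = unescape_unicode(s)
print(len(s), len(result))  # 24 4
```

Note that unicode_escape also interprets other backslash escapes such as \n and \x41, so this approach fits input where every backslash is meant to start an escape.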
Ilya Kharlamov