1

I have a string like "válido". At the Python prompt, its repr shows the underlying bytes as hex escapes: 'v\xc3\x83\xc2\xa1lido'

But I want \u escapes for the Unicode code points instead, so the output should look like "v\u00c2\u00a1lido".

So basically, the input is "válido" and the output should be "v\u00c2\u00a1lido".

Adobri
  • possible duplicate of [What's the preferred way to include unicode in python source files?](http://stackoverflow.com/questions/23062544/whats-the-preferred-way-to-include-unicode-in-python-source-files) – wim Apr 23 '14 at 19:54
  • I am not sure how this is the duplicate. If you look at the example above, some characters in the inputs strings are normal ascii characters and so is the case in the output too. – Adobri Apr 23 '14 at 19:57
  • I didn't really mean it's a duplicate, but you will certainly find your answers in the post – wim Apr 23 '14 at 20:14

2 Answers

1

The \u escape only works in Unicode string literals; in Python 2, start your string literal with u:

u"v\u00c2\u00a1lido"

Demo:

>>> u"v\u00c2\u00a1lido"
u'v\xc2\xa1lido'
>>> print u"v\u00c2\u00a1lido"
v¡lido
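
To build the escaped form from an existing string rather than typing it as a literal, something like the following sketch might work. It assumes Python 3, where every str is already Unicode, and `to_u_escapes` is a hypothetical helper name, not a standard-library function. ASCII characters pass through unchanged, other characters up to U+FFFF become \uXXXX, and anything above that uses the eight-digit \UXXXXXXXX form.

```python
def to_u_escapes(text):
    """Return text with every non-ASCII character replaced by a \\u escape.

    Hypothetical helper for illustration; assumes a Python 3 str as input.
    """
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # Plain ASCII is kept as-is (including any literal backslash).
            parts.append(ch)
        elif cp <= 0xFFFF:
            # Basic Multilingual Plane: four hex digits.
            parts.append("\\u%04x" % cp)
        else:
            # Beyond U+FFFF: eight hex digits with a capital U.
            parts.append("\\U%08x" % cp)
    return "".join(parts)


# The input is written with escapes here so the example does not depend
# on the source file's encoding; it is the string "válido".
print(to_u_escapes("v\u00c3\u00a1lido"))  # prints v\u00c3\u00a1lido
```

Because literal backslashes in the input are left alone, the result cannot always be decoded back unambiguously; escape them as well if a round trip is needed.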
Martijn Pieters
0

I think json.dumps is what you need:

>>> import json
>>> s = "válido"
>>> s
'v\xc3\x83\xc2\xa1lido'
>>> json.dumps(s)
'"v\\u00c3\\u00a1lido"'
>>> print json.dumps(s)
"v\u00c3\u00a1lido"
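
The session above is Python 2, where s is a byte string that json.dumps decodes as UTF-8. A rough Python 3 equivalent, as a sketch only, relies on the default `ensure_ascii=True` behaviour of `json.dumps`; the variable names `quoted` and `escaped` are chosen here for illustration.

```python
import json

# The string "válido", written with escapes so the example does not
# depend on the source file's encoding.
s = "v\u00c3\u00a1lido"

# ensure_ascii=True is the default, so non-ASCII characters are escaped.
quoted = json.dumps(s)
print(quoted)   # prints "v\u00c3\u00a1lido" (including the double quotes)

# Drop the surrounding JSON quotes if only the escaped text is wanted.
escaped = quoted[1:-1]
print(escaped)  # prints v\u00c3\u00a1lido
```

Note that json.dumps also escapes double quotes and backslashes, and writes characters above U+FFFF as surrogate pairs, so the output follows JSON rules rather than Python string-literal rules.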

It may be too late for the OP, but I hope it helps others trying to solve the same problem.

WKPlus