How to convert from u'\\u795d\\u798f' to u'\u795d\u798f'?

I'm quite confused.

u'\u795d\u798f' is 祝福 in Chinese. Thanks~

UPDATE: I'm sorry, I didn't know how to express it at the beginning. My problem is: I got u'\\u795d\\u798f' and I want it to be u'\u795d\u798f'.

Kinka
  • What are you trying to achieve? What problem are you seeing? – Anders Lindahl Jun 07 '12 at 08:14
  • Are you trying to convert something to itself? Use `=` and you will get it. – Mayli Jun 07 '12 at 08:34
  • There are multiple potential formats that use `\u` escapes, and each has slightly different rules. If it's a JSON string, you should be using a JSON decoder instead of trying to convert the string by hand. How are other escaped characters handled in there? I.e., do you have `\n` or `\\n`? – bobince Jun 08 '12 at 14:14
  • I just passed a string into whoosh.qparser.QueryParser and parsed it, and then I got a Query instance consisting of Terms. I tried to print them out in the form `term[1]`, but failed. That's the problem~ – Kinka Jun 10 '12 at 11:08

1 Answer

From your title (but not the question text) it looks like the problem is that the backslashes in the string are escaped (i.e., you have `\\u795d` and you want `\u795d`). There are several questions on this issue (like "Process escape sequences in a string in Python").

In Python 2, you can do:

>>> u'\\u795d\\u798f'.decode('unicode_escape')
u'\u795d\u798f'

Passing the result to the print statement should display the Chinese characters, provided your terminal's encoding supports them.

The Python 3 equivalent is:

>>> bytes('\\u795d\\u798f','utf-8').decode('unicode_escape')
'祝福'
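
One caveat with the Python 3 one-liner: `unicode_escape` reads the bytes as Latin-1, so if the input already contains real non-ASCII characters next to the escaped ones, encoding to UTF-8 first will garble them. Below is a rough sketch of a variant that should cope with mixed input. The helper name `unescape_unicode` is made up for illustration, and the JSON line only applies if your text really is the body of a JSON string (as bobince suggests in the comments), which is an assumption about your data.

```python
import json


def unescape_unicode(text: str) -> str:
    # Hypothetical helper, not part of any library.
    # Characters that fit in Latin-1 become single bytes; anything else is
    # rewritten as a backslash escape, which unicode_escape then decodes
    # together with the escapes that were already in the text.
    return text.encode("latin-1", "backslashreplace").decode("unicode_escape")


# Backslash, 'u', and four hex digits, twice: 12 characters in total.
escaped = "\\u795d\\u798f"
decoded = unescape_unicode(escaped)
print(decoded)

# Mixed input: a real non-ASCII character followed by escaped ones.
mixed = unescape_unicode("caf\u00e9 \\u795d\\u798f")
print(mixed)

# Alternative, assuming the text is the content of a JSON string literal:
# wrap it in double quotes and let the JSON decoder handle the escapes.
from_json = json.loads('"' + escaped + '"')
print(from_json)
```

Note that `unicode_escape` also interprets other sequences such as `\n` and `\t`, so this only makes sense if every backslash in the input is meant as a Python-style escape.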
James