I am using Python (the Tornado framework, specifically) to develop a corporate website. When I output data to HTML, some special characters such as (r) or (tm) appear as percent-escaped Unicode code points, for example:

Google%u2122

How can I decode (or convert) it to the correct text, such as Google™? I tried using the encode() method, but it didn't work.

Thank you very much.

mrblue
  • possible duplicate of [How to unquote a urlencoded unicode string in python?](http://stackoverflow.com/questions/300445/how-to-unquote-a-urlencoded-unicode-string-in-python) – Ikke Feb 29 '12 at 09:53
  • How did you get the value `Google%u2122`? When I try `print u"Google\u2122"` I get `Google™`, but `%u` does not work. – Nilesh Feb 29 '12 at 10:56
  • Hi Ikke, thanks for your link; I found the solution there. I will post details in the answer below. – mrblue Feb 29 '12 at 11:24
  • `%u2122` is a JavaScript `escape()` encoding. It's not legitimately used in any other context (specifically it's *not* the same as standard URL encoding) and if you find one in your database that's a strong data integrity smell. Where is this data coming from? – bobince Mar 01 '12 at 00:20
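
To make the distinction in the last comment concrete, here is a small Python 3 sketch (the variable names are only for illustration). It shows that standard URL encoding percent-encodes the UTF-8 bytes of a character, and that `urllib.parse.unquote` leaves the `%uXXXX` form untouched:

```python
from urllib.parse import quote, unquote

text = "Google\u2122"  # "Google™"

# Standard URL encoding percent-encodes the UTF-8 bytes of the character.
standard = quote(text)
print(standard)  # Google%E2%84%A2

# unquote() reverses the standard form...
print(unquote(standard) == text)  # True

# ...but it does not recognise %uXXXX and passes it through unchanged.
print(unquote("Google%u2122"))  # Google%u2122
```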

1 Answer

Thanks, Ikke, for the link. Since I am working in a web environment (using Tornado, as I mentioned), the primary answer there didn't meet my requirements. This one did instead, https://stackoverflow.com/a/300556/183846:

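# Python 2 only (in Python 3, unquote lives in urllib.parse and str has no decode()).
from urllib import unquote

def unquote_u(source):
    # Decode ordinary %XX escapes; unquote() leaves %uXXXX sequences untouched.
    result = unquote(source)
    if '%u' in result:
        # Rewrite %uXXXX as \uXXXX and let the unicode_escape codec decode it.
        result = result.replace('%u', '\\u').decode('unicode_escape')
    return result

print unquote_u('Tan%u0131m')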
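
For anyone on Python 3, a rough equivalent of the function above might look like the following. This is only a sketch under a few assumptions: it swaps the `unicode_escape` trick for a regular expression, the `_PERCENT_U` name is my own, and it treats each `%uXXXX` as a single code point, so characters outside the Basic Multilingual Plane (which JavaScript `escape()` emits as two `%uXXXX` surrogates) are not recombined.

```python
import re
from urllib.parse import unquote

# Hypothetical helper pattern: matches %u followed by exactly four hex digits.
_PERCENT_U = re.compile(r'%u([0-9a-fA-F]{4})')

def unquote_u(source):
    # Decode ordinary %XX escapes first; unquote() leaves %uXXXX untouched.
    result = unquote(source)
    # Replace each remaining %uXXXX with the character for that code point.
    return _PERCENT_U.sub(lambda m: chr(int(m.group(1), 16)), result)

print(unquote_u('Tan%u0131m'))
print(unquote_u('Google%u2122'))
```

As in the original, the ordinary escapes are decoded before the `%u` ones, so an input such as `%25u2122` would end up decoded twice; if that matters for your data, a single-pass decoder would be safer.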

To apply to Tornado template, I create the f

mrblue