
How do I convert x into y in the following situation?

In [64]: x = '\xe2\x80\x9c random texts.\xe2\x80\x9d '

In [65]: print x
“ random texts.”

In [66]: y = '"random texts."'

In [67]: print y
"random texts."

The question is this: I have a text that contains some UTF-8 encoded characters, and I want to convert this text into ASCII. I will have a table of conversion rules such as

\xe2\x80\x9c : "
\xe2\x80\x9d : "

My first instinct is to use regular expression substitution, but I was wondering if there is a more Pythonic or proper way to achieve this sort of task.
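One possible approach to the table-driven conversion described above, as a minimal sketch rather than a definitive answer: decode the UTF-8 bytes to text first, then apply the rules with `str.translate`. This assumes Python 3 (the session above is Python 2, where the literal needs no `b` prefix), and the names `CONVERSIONS` and `to_ascii` are hypothetical, made up for illustration.

```python
# Hypothetical conversion table: Unicode code point -> ASCII replacement.
# Extend it with whatever rules the real table contains.
CONVERSIONS = {
    0x201C: '"',  # LEFT DOUBLE QUOTATION MARK, UTF-8 bytes \xe2\x80\x9c
    0x201D: '"',  # RIGHT DOUBLE QUOTATION MARK, UTF-8 bytes \xe2\x80\x9d
}


def to_ascii(raw, table=CONVERSIONS):
    """Decode UTF-8 bytes, then replace each mapped character."""
    text = raw.decode("utf-8")
    # str.translate looks up every character's code point in the table;
    # characters that are not in the table are left unchanged.
    return text.translate(table)


x = b'\xe2\x80\x9c random texts.\xe2\x80\x9d '
y = to_ascii(x)
print(y)
```

Note that this only swaps the quote characters. The space after the opening quote and the trailing space in `x` are left as they are, so stripping them would be a separate step. Characters missing from the table also pass through untouched, so the result is only pure ASCII if the table covers every non-ASCII character in the input.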

Alby
  • possible duplicate of [Where is Python's "best ASCII for this Unicode" database?](http://stackoverflow.com/questions/816285/where-is-pythons-best-ascii-for-this-unicode-database) – Zero Piraeus Feb 20 '15 at 05:09
  • @ZeroPiraeus Thanks for the link, but when I tried it, the results were not satisfactory: `print unidecode(u"\xe2\x80\x9c") # prints a` and `print unidecode(u"\xe2\x80\x9d") # prints a` – Alby Feb 20 '15 at 05:18
  • `"\xe2\x80\x9d"` is UTF-8, not Unicode - you'll need to decode it first: `print unidecode("\xe2\x80\x9d".decode("utf8"))` – Zero Piraeus Feb 20 '15 at 05:21
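To illustrate the point made in the last comment, here is a small standard-library sketch, assuming Python 3 (so the byte string gets an explicit `b` prefix). It shows that writing the escapes inside a Unicode literal yields three separate characters, the first of which is `â`, which is probably why `unidecode` printed `a`. Decoding the bytes as UTF-8 yields the single curly quote instead. The variable names are only for illustration.

```python
# The escapes inside a Unicode literal are read as three code points:
# U+00E2 (a with circumflex), U+0080 and U+009C (control characters).
mis_decoded = u"\xe2\x80\x9c"
print(len(mis_decoded))

# The same three values as raw bytes form one UTF-8 encoded character.
raw = b"\xe2\x80\x9c"
decoded = raw.decode("utf8")
print(len(decoded))

# The decoded character is U+201C, LEFT DOUBLE QUOTATION MARK.
print(decoded == u"\u201c")
```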
