How to convert Unicode text to ASCII in Python without using regular expression substitution?

Asked Feb 20 '15 at 04:58

How do I convert x into y in the following situation?

In [64]: x = '\xe2\x80\x9c random texts.\xe2\x80\x9d '

In [65]: print x
“ random texts.”

In [66]: y = '"random texts."'

In [67]: print y
"random texts."

The question is this: I have a text that contains some UTF-8 encoded characters, and I want to convert this text to ASCII. I will have a table of conversion rules such as:

\xe2\x80\x9c : "
\xe2\x80\x9d : "

My first instinct is to use regular expression substitution, but I was wondering if there is a more Pythonic or proper way to achieve this sort of task.
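One regex-free approach to a rule table like the one above is to decode the bytes and pass a mapping to `str.translate`. The following is only a minimal sketch, written for Python 3 (the session in the question is Python 2); the names `RULES` and `to_ascii` are made up for illustration, and the table is keyed by code point (U+201C, U+201D) rather than by the raw UTF-8 bytes.

```python
# Python 3 sketch; in Python 2 the literal below would be a plain str.
raw = b'\xe2\x80\x9c random texts.\xe2\x80\x9d '

# Hypothetical conversion table, keyed by Unicode code point.
RULES = {
    0x201C: u'"',  # LEFT DOUBLE QUOTATION MARK
    0x201D: u'"',  # RIGHT DOUBLE QUOTATION MARK
}

def to_ascii(data, rules=RULES):
    """Decode UTF-8 bytes, apply the table, and check the result is ASCII."""
    text = data.decode('utf-8').translate(rules)
    text.encode('ascii')  # raises UnicodeEncodeError if a rule is missing
    return text

converted = to_ascii(raw)
print(converted)
```

This only replaces the characters listed in the table. The spaces just inside the quotes in `x` are left untouched, so producing exactly `y` would need a separate stripping step.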

asked Feb 20 '15 at 04:58

Alby


possible duplicate of [Where is Python's "best ASCII for this Unicode" database?](http://stackoverflow.com/questions/816285/where-is-pythons-best-ascii-for-this-unicode-database) – Zero Piraeus Feb 20 '15 at 05:09
@ZeroPiraeus Thanks for the link, but when I tried it, the results were not satisfactory: `print unidecode(u"\xe2\x80\x9c") # prints a` and `print unidecode(u"\xe2\x80\x9d") # prints a` – Alby Feb 20 '15 at 05:18
`"\xe2\x80\x9d"` is UTF-8, not Unicode; you'll need to decode it first: `print unidecode("\xe2\x80\x9d".decode("utf8"))` – Zero Piraeus Feb 20 '15 at 05:21
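
The difference described in the comment above can be checked with the standard library alone. This is a small Python 3 sketch (not the original Python 2 session); the variable names are illustrative. It shows that the three bytes decode to a single code point, while the same escapes written in a unicode literal give three separate code points, the first of which is `â`. That is presumably why `unidecode` printed `a`.

```python
import unicodedata

raw = b'\xe2\x80\x9d'          # three UTF-8 bytes
decoded = raw.decode('utf-8')  # one code point, U+201D
misread = u'\xe2\x80\x9d'      # three code points: U+00E2, U+0080, U+009D

print(len(decoded), unicodedata.name(decoded))
print(len(misread), unicodedata.name(misread[0]))
```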
