
I am trying to download a webpage (in Russian) using the mechanize module in Python (my computer is set up for English only). I get the following error:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 50-59

Can somebody tell me how to correct this type of error, or what it means?

UnadulteratedImagination
  • Please show us your code. And the URL, or at least the headers and `` node. And tell us whether you're using Python 2.x or 3.x, because the answer will be different. But the basic idea is that you have to use the right codec to decode the bytes to Unicode, instead of `'ascii'`. Whether that's UTF-8 or some Windows Russian codepage, you should be able to tell from the data. – abarnert Mar 21 '13 at 20:59

1 Answer


Long story short, your string contains characters outside ASCII, so when Python tries to print it (or otherwise encode it with the default 'ascii' codec) it doesn't know what to do: the character codes are out of ASCII's range (0-127).
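
To see what the numbers in the message mean, here is a minimal sketch that triggers the same error on purpose (the sample string is made up for illustration, not taken from the question):

```python
# -*- coding: utf-8 -*-
text = u"Price: 5 рублей"  # made-up sample; the Cyrillic word starts at index 9

try:
    text.encode("ascii")
    err = None
except UnicodeEncodeError as e:
    err = e

# err.start and err.end delimit the run of characters the codec could not
# encode (end is exclusive); the "position N-M" in the message reports this run.
bad_part = text[err.start:err.end]
```

Here `bad_part` is the Russian word. By the same logic, "position 50-59" in your traceback points at ten consecutive non-ASCII characters in your string.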

Here's the ASCII table and what characters it supports: http://www.asciitable.com/

You can convert your characters explicitly; see, for example: Python - Encoding string - Swedish Letters
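
As a rough sketch of explicit conversion, assuming the page bytes turn out to be cp1251 (the sample bytes below are built by hand and stand in for whatever mechanize returns; `decode_page` is a hypothetical helper, not part of mechanize):

```python
# -*- coding: utf-8 -*-

def decode_page(raw_bytes, charset):
    """Decode downloaded bytes to Unicode text using the page's charset."""
    return raw_bytes.decode(charset)

# Stand-in for the bytes of a downloaded page; cp1251 is an assumption here.
raw = u"Новости".encode("cp1251")

page_text = decode_page(raw, "cp1251")   # Unicode text, safe to process
utf8_bytes = page_text.encode("utf-8")   # explicit encoding for output or saving
```

The idea is to decode once, right after the download, and to encode explicitly only when writing the text out, so the implicit 'ascii' codec never gets involved.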


Or you can do the following (this works around a lot of encoding problems):

Edit C:\Python??\Lib\Site.py and replace "del sys.setdefaultencoding" with "pass".

Then put this at the top of your code:

import sys
sys.setdefaultencoding('latin-1')

This is the holy grail for fixing Swedish and other non-ASCII characters. Note that latin-1 does not cover Russian: neither it nor ISO-8859-15 contains Cyrillic letters. For Russian text, use an encoding that does, for example utf-8, cp1251 (Windows Cyrillic) or koi8-r.
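
If you are unsure whether a codec covers your text, you can simply try it. A minimal sketch (the sample word and the candidate list are assumptions chosen for illustration; `codecs_that_fit` is a hypothetical helper):

```python
# -*- coding: utf-8 -*-

def codecs_that_fit(text, candidates):
    """Return the candidate codec names that can encode every character of text."""
    fitting = []
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # this codec lacks at least one of the characters
        fitting.append(name)
    return fitting

sample = u"Привет"  # made-up Russian sample
candidates = ["latin-1", "iso8859_15", "cp1251", "koi8_r", "utf-8"]
usable = codecs_that_fit(sample, candidates)
```

With this sample, the two Western European codecs drop out and the Cyrillic-capable ones remain.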

Torxed