
I've written some code that uses the Biopython Entrez wrapper. It worked fine on my previous Win10 laptop (Python 3.5.1), but I've just ported it to a new Win10 laptop with the same Python version and the same versions of every package installed, and I'm now getting a decode error.

The traceback leads to a function that fetches text: it's attempting to decode the text using cp1252 when it should be using UTF-8. I know that similar questions have been asked, but none have dealt with this problem happening inside a package (Biopython in my case). Copying the UTF-8 encoding file in Python/lib and renaming it to cp1252.py solves the problem, but this is obviously not a long-term solution.

File "C:\Users\arjun\AppData\Local\Programs\Python\Python35-32\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 21715: character maps to <undefined>
  • I have a similar issue. Did you manage to solve it? From the answers I checked, Python apparently picks up the default system encoding. The command chcp 65001 should have set the right encoding, but it didn't work in my case. The approach mentioned here worked for me: http://stackoverflow.com/questions/9233027/unicodedecodeerror-charmap-codec-cant-decode-byte-x-in-position-y-character –  Nov 28 '16 at 14:45

1 Answer


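Use the io module for reading (https://docs.python.org/2/library/io.html#io.open); in Python 3.x the built-in open() is the same function. By default it uses the platform's preferred encoding (locale.getpreferredencoding(False)), which is typically cp1252 on Windows. To avoid depending on that default, specify the encoding yourself, e.g. encoding="utf-8", as explained in the docs.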
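As a rough sketch of what that looks like (not Biopython-specific): the sample string, the temporary file name and the BytesIO object below are all made-up stand-ins for illustration. The last part assumes you can get at the raw bytes or a binary handle for the fetched data, which may not be true of every library.

```python
import io
import locale
import os
import tempfile

# Encoding used by open() when none is given (often cp1252 on Windows).
default_encoding = locale.getpreferredencoding(False)

# Hypothetical sample text: U+0141 encodes to the UTF-8 bytes 0xC5 0x81,
# and byte 0x81 has no mapping in Python's cp1252 codec.
sample = "\u0141ukasz"
raw = sample.encode("utf-8")

# Decoding UTF-8 bytes as cp1252 reproduces the 'charmap' error.
try:
    raw.decode("cp1252")
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

# Reading a file with an explicit encoding instead of the platform default.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with io.open(path, "w", encoding="utf-8") as fh:
        fh.write(sample)
    with io.open(path, "r", encoding="utf-8") as fh:
        text_from_file = fh.read()

# Wrapping a binary stream so it is decoded as UTF-8.
# io.BytesIO stands in for whatever binary handle you actually have.
stream = io.BytesIO(raw)
text_from_stream = io.TextIOWrapper(stream, encoding="utf-8").read()
```

If the decoding happens inside a package and you can't pass an encoding through, this won't help directly; it only shows where the cp1252 default comes from and how to override it for handles you control.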

atjua