I'm using the MySQLdb library for Python to access a database with entries in Portuguese, which contain many accented characters, and I then save them to an Excel file using xlsxwriter. When I close the workbook to save it, I get the following error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xed in position 59: invalid continuation byte
The value it's complaining about is:
u'QNO XX Conjunto YY, No. Casa ZZ, CEP: AAAAAAAA, Bras\xedlia /DF'
Specifically, it should be Brasília instead of Bras\xedlia. How can I get the output encoded in a friendlier way? Do I have to replace \xed and the like individually for each possible accented character?
--EDIT:
I know 0xED is í in latin-1 (iso-8859-1), and given the language (and information from the people in charge of the db) I think that's the right encoding. Knowing that, how do I turn a string like 'Bras\xedlia' into 'Brasília' in general?
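To make the mismatch concrete, here is a minimal sketch of what I think is going on. The byte string `raw` is a made-up stand-in for what the driver might hand back, not actual output from my database, and I'm assuming the column really is latin-1:

```python
# Hypothetical raw bytes, as a latin-1 column might return them.
raw = b'Bras\xedlia /DF'

# Decoding as UTF-8 fails: 0xED opens a multi-byte sequence, but the
# following byte ('l') is not a valid continuation byte.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding as latin-1 maps byte 0xED to code point U+00ED
# (i with acute accent), one byte per character.
text = raw.decode('latin-1')

# The repr shows the escape \xed, but the string holds the real character.
same = (text == u'Bras\u00edlia /DF')
```

If that is right, then `u'Bras\xedlia'` is just how the interpreter displays a Unicode string that already contains í, and no per-accent replacement should be needed.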
--EDIT:
If I try calling str() on that value, I get:
'ascii' codec can't encode character u'\xed' in position 52: ordinal not in range(128)
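As far as I understand, in Python 2 str() on a unicode object implicitly encodes it with the default codec (ascii unless it has been changed), which would explain this second error. A small sketch of that assumption, using an explicit encode() call so it behaves the same way on Python 2 and 3; the variable names are only for illustration:

```python
text = u'Bras\xedlia'

# Encoding to ASCII fails because U+00ED is outside the range 0-127.
# This is my understanding of what str(text) does implicitly in Python 2.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Naming a codec that can represent the character works.
as_utf8 = text.encode('utf-8')      # two bytes for the accented letter
as_latin1 = text.encode('latin-1')  # one byte (0xED) for the accented letter
```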