Printing unicode characters from SQL in Excel with xlwt

Question

I am using Python to extract data from an MSSQL database, using an ODBC connection. I am then trying to put the extracted data into an Excel file, using xlwt.

However this generates the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xd8 in position 20: ordinal not in range(128)

I have run the script to just print the data and established that the offending character in the database is an O with a slash through it. On the python print it shows as "\xd8".

The worksheet encoding for xlwt is set as UTF-8.

Is there any way to have this come straight through into Excel?

Edit

Full error message below:

C:\>python dbtest1.py
Traceback (most recent call last):
  File "dbtest1.py", line 24, in <module>
    ws.write(i,j,item)
  File "build\bdist.win32\egg\xlwt\Worksheet.py", line 1032, in write
  File "build\bdist.win32\egg\xlwt\Row.py", line 240, in write
  File "build\bdist.win32\egg\xlwt\Workbook.py", line 309, in add_str
  File "build\bdist.win32\egg\xlwt\BIFFRecords.py", line 25, in add_str
  File "C:\Python27\lib\encodings\utf_8.py", line 16, in decode 
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd8 in position 20: invalid
continuation byte

What you call the worksheet encoding is the encoding used by xlwt to DEcode incoming str objects. The default is `ascii`. Your error message is inconsistent with "is set as UTF-8". Please edit your question to show the traceback that is associated with the error message. — John Machin, Jul 22 '12 at 22:54
So the error message changed. Perhaps you were mistaken about what encoding you were using initially. In any case: you don't have latin1. Use cp1252 or similar, depending on your locale. Even better: persuade pyODBC to provide all text output as Python `unicode` objects. — John Machin, Jul 24 '12 at 12:23
Yes - the top one was leaving it blank (so default ascii) and the second one was using UTF-8, I must have got a bit muddled somewhere. Thanks for your help - I'll have a look at pyODBC/a different method. — RowerAl, Jul 24 '12 at 12:41

score 4 · Answer 1 · answered Jul 19 '12 at 09:10

4

Setting the workbook encoding to 'latin-1' seems to have achieved the same:

wb = xlwt.Workbook(encoding='latin-1')

(It was set at 'UTF-8' before)

The other answer didn't work in my case as there were other fields that were not strings.

answered Jul 19 '12 at 09:10

RowerAl

51
1
8

score -1 · Answer 2 · answered Jul 18 '12 at 14:29

-1

The SQL extraction seems to be returning strings encoded using ascii. You can convert them to unicode with:

data = unicode(input_string, 'latin-1')

You can then put them into a spreadsheet with xlwt.

answered Jul 18 '12 at 14:29

xorsyst

7,897
5
39
58

Printing unicode characters from SQL in Excel with xlwt

2 Answers2