I have been using Python 2.6. While writing a Python program to process a query result (in CSV format) from SQL Server, I found that it does not seem to support Unicode.

When I ran the program on the CSV file, an error popped up:

    for row in csvReader:
Error: line contains NULL byte

After I saved the CSV file in ANSI/ASCII format with UltraEdit, the program ran fine.

I tried to include the encoding option, but it failed:

csvReader = csv.reader(open(fname, mode='rb', encoding='unicode'), delimiter=',')
TypeError: 'encoding' is an invalid keyword argument for this function

csvReader = csv.reader(open(fname, mode='rb', encoding='utf-8'), delimiter=',')
TypeError: 'encoding' is an invalid keyword argument for this function

I wonder whether Python 3 supports reading Unicode CSV files. It would save me a lot of work.

lamwaiman1988

2 Answers

Python 3 definitely supports Unicode. My guess is that you specified the wrong encoding (or none at all) when you opened the CSV file for reading. See: http://docs.python.org/release/3.1.3/library/functions.html#open

And try something like:

reader = csv.reader(open("foo.csv", encoding="utf-8", newline=""))

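Edit: If you are using Python 2.6, you can try the same approach with codecs.open. Note that the Python 2 csv module does not itself support Unicode input, so this may still fail on rows containing non-ASCII characters: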

import codecs
reader = csv.reader(codecs.open("foo.csv", encoding="utf-8"))

However, if you're getting null bytes, your file may be encoded as UTF-16 (which stores plain ASCII characters with a zero byte alongside each one), so try encoding="utf-16" if the file can't be decoded as UTF-8.
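
If you don't know in advance which encoding the export uses, one option is to look for a byte-order mark at the start of the file and pick the encoding from that. Below is a minimal Python 3 sketch of that idea. The helper names sniff_encoding and read_csv_rows are made up for illustration, and it assumes the file starts with a BOM whenever it is UTF-16; a BOM-less UTF-16 file or a UTF-32 file would need extra handling.

```python
import codecs
import csv


def sniff_encoding(path, default="utf-8"):
    """Guess a text encoding from the file's byte-order mark (hypothetical helper).

    Falls back to `default` when no BOM is present. UTF-32 is not handled.
    """
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" strips the BOM so it does not end up in the first field.
        return "utf-8-sig"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM itself to determine byte order.
        return "utf-16"
    return default


def read_csv_rows(path):
    """Read every row of a CSV file, decoding it with the sniffed encoding."""
    encoding = sniff_encoding(path)
    # newline="" lets the csv module handle line endings itself.
    with open(path, encoding=encoding, newline="") as f:
        return list(csv.reader(f, delimiter=","))
```

Calling read_csv_rows("foo.csv") would then return a list of rows, each a list of str values, whether the file was saved as UTF-8 or as UTF-16 with a BOM.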

David Wolever

A similar question has already been answered: Python CSV error: line contains NULL byte

Also, try opening the file in 'rb' mode instead of 'rU'.

Community
jerrymouse