The file contains one or more NULL bytes, which the CSV reader cannot handle. As a workaround, you could read the file a line at a time and, if a NULL byte is detected, replace it with a space character. The resulting line can then be parsed by a CSV reader by wrapping the string in a file-like object. Note that the default delimiter is a comma (,), so it does not need to be specified. By adding enumerate(), you can also display which lines in your file contain the NULL bytes.
As you are using a DictReader(), an extra step is needed to first extract the header from your file using a normal csv.reader(). This row can then be used to manually specify the fieldnames parameter to your DictReader.

    import csv
    import StringIO

    with open('data.csv', 'rb') as f_input:
        # Use a normal CSV reader to get the header line
        header = next(csv.reader(f_input))

        for line_number, raw_line in enumerate(f_input, start=1):
            if '\x00' in raw_line:
                print "Line {} - NULL found".format(line_number)
                raw_line = raw_line.replace('\x00', ' ')

            row = next(csv.DictReader(StringIO.StringIO(raw_line), fieldnames=header))
            print row

Lastly, when using a csv.reader() in Python 2, you should open the file in binary mode, e.g. rb. In Python 3, open it in text mode with newline='' instead.
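
The code above is written for Python 2. For Python 3, the same idea might look something like the sketch below. It is only an illustration under a few assumptions: the function name read_rows_replacing_nulls and the in-memory sample data are made up for the example, the file is opened in text mode, and no quoted field spans more than one line (line-by-line parsing would split such a field). Line numbers here start at 2 so that they match the position in the file, given that the header occupies line 1.

```python
import csv
import io


def read_rows_replacing_nulls(f_input):
    """Yield (line_number, had_null, row) for each data line of a text-mode CSV file.

    Hypothetical helper for illustration; NULL bytes are replaced with spaces.
    """
    # Use a normal CSV reader to get the header line
    header = next(csv.reader(f_input))

    # start=2 because the header was line 1 of the file
    for line_number, raw_line in enumerate(f_input, start=2):
        had_null = '\x00' in raw_line
        if had_null:
            raw_line = raw_line.replace('\x00', ' ')

        # Wrap the cleaned line in a file-like object so DictReader can parse it
        reader = csv.DictReader(io.StringIO(raw_line), fieldnames=header)
        row = next(reader, None)
        if row is None:
            # Blank line: DictReader produces nothing for it, so skip it
            continue
        yield line_number, had_null, row


# Made-up sample standing in for data.csv; the second data line has a NULL byte
sample = io.StringIO("name,value\nalice,1\nbo\x00b,2\n")
results = list(read_rows_replacing_nulls(sample))

for line_number, had_null, row in results:
    if had_null:
        print("Line {} - NULL found".format(line_number))
    print(row)
```

With a real file, you would pass in the object returned by open('data.csv', newline='') rather than the io.StringIO sample.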