
I am trying to write Python 2/3-compatible code that writes strings to a CSV file object. This code:

line_as_list = [line.encode() for line in line_as_list]
writer_file =  io.BytesIO()
writer = csv.writer(writer_file, dialect=dialect, delimiter=self.delimiter)
for line in line_as_list:
    assert isinstance(line,bytes)
    writer.writerow(line)

Gives this error on Python3:

>           writer.writerow(line)
E           TypeError: a bytes-like object is required, not 'str'

But the assert passes, so the type is fine. Why does csv raise an error?

Can't I use BytesIO alone for both Python 2 and 3? What is the problem here?

goelakash
  • @tdelaney What I meant was that I am not sure whether StringIO and BytesIO will give the same representation for the source text (probably in `utf-8`). That's why I am trying to use the same output object type. – goelakash Jun 22 '16 at 17:10

1 Answer


In Python3 csv.writer expects a file-like object opened in text mode. In Python2, csv.writer expects a file-like object opened in binary mode.

Therefore, in Python3, use io.StringIO, while in Python2 use io.BytesIO:

import io
import csv
import sys
PY3 = sys.version_info[0] == 3

line_as_list = [u'foo', u'bar']
encoding = 'utf-8'

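if PY3:
    # Python 3: csv.writer needs a text-mode file object
    writer_file = io.StringIO()
else:
    # Python 2: csv.writer needs a binary-mode file object and byte strings
    writer_file = io.BytesIO()
    line_as_list = [line.encode(encoding) for line in line_as_list]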

writer = csv.writer(writer_file, dialect='excel', delimiter=',')
writer.writerow(line_as_list)
content = writer_file.getvalue()

if PY3:
    content = content.encode(encoding)

print(type(content))
print(repr(content))

In Python3 the code above prints

<class 'bytes'>
b'foo,bar\r\n'

In Python2 the code above prints

<type 'str'>
'foo,bar\r\n'
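
On the question of whether both paths give the same representation: a minimal sketch of the same approach wrapped in a function, which returns encoded bytes on either Python version. The helper name `rows_to_csv_bytes` and the sample rows are made up for illustration and are not part of the original code.

```python
import csv
import io
import sys

PY3 = sys.version_info[0] == 3


def rows_to_csv_bytes(rows, encoding="utf-8"):
    """Hypothetical helper: serialize rows of unicode strings to CSV bytes."""
    if PY3:
        # Text buffer; newline="" leaves the csv module's "\r\n" untouched.
        buffer = io.StringIO(newline="")
    else:
        # Python 2: binary buffer, and every cell must be a byte string.
        buffer = io.BytesIO()
        rows = [[cell.encode(encoding) for cell in row] for row in rows]

    writer = csv.writer(buffer, dialect="excel", delimiter=",")
    writer.writerows(rows)
    content = buffer.getvalue()

    if PY3:
        # Encode once at the end so both branches return bytes.
        content = content.encode(encoding)
    return content


data = rows_to_csv_bytes([[u"foo", u"bar"], [u"caf\xe9", u"baz"]])
print(repr(data))
```

Because the text is encoded with the same `encoding` in both branches, the resulting bytes should match across versions, provided every cell is a unicode string on input.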
unutbu
  • That's a good workaround, but any idea why the error asks for 'bytes', when str *is* a byte format? – goelakash Jun 24 '16 at 13:23
  • I believe that error is coming from the `BytesIO` object -- it is complaining that it was passed a `str` when it expected `bytes`. In Python3 a `str` is not a "byte format". A unicode `str` is a sequence of code points. – unutbu Jun 24 '16 at 14:42
  • But I passed a str.encode() object, effectively a bytes object. Then where is the problem? This error says that `str` was passed, when it wasn't (just talking about Python 3). – goelakash Jun 24 '16 at 16:09
  • I'm not able to reproduce the error you posted so this is just a guess. What is `self.delimiter`? Could it have been a `str`? – unutbu Jun 24 '16 at 17:47
  • Yeah, that may be it, though after encoding the delimiter it says that 'the delimiter must be string, not bytes'. – goelakash Jun 24 '16 at 22:51
  • In Python3 the `csv` module wants unicode `str`s. In Python2 the `csv` module wants byte `str`s. – unutbu Jun 25 '16 at 00:08
  • Yeah, the `delimiter` *has* to be `unicode` in Python3, and so it doesn't work with `bytes` at all. – goelakash Jun 28 '16 at 08:38
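
To illustrate the points made in the comments, here is a small Python 3-only sketch. It assumes that the `csv` writer always passes `str` to the file object's `write()`, which is why a `BytesIO` target fails, and that a `bytes` delimiter is rejected. The helper name `try_write` is invented for this example.

```python
import csv
import io


def try_write(buffer, row):
    """Write one row with csv.writer and report the outcome as a string."""
    writer = csv.writer(buffer)
    try:
        writer.writerow(row)
    except TypeError as exc:
        return "TypeError: {}".format(exc)
    return repr(buffer.getvalue())


# Text buffer: csv.writer hands str to write(), which StringIO accepts.
text_result = try_write(io.StringIO(), ["foo", "bar"])

# Binary buffer: the same str reaches BytesIO.write(), which rejects it.
bytes_result = try_write(io.BytesIO(), ["foo", "bar"])

# A bytes delimiter is refused before any row is written.
try:
    csv.writer(io.StringIO(), delimiter=b",")
    delimiter_result = "no error"
except TypeError as exc:
    delimiter_result = "TypeError: {}".format(exc)

print(text_result)
print(bytes_result)
print(delimiter_result)
```

Note that in the question's loop each `line` is a single `bytes` object, so `writerow(line)` treats it as a row of integers, formats them as text, and then hits the same `BytesIO.write()` failure.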