
EDIT: I put it in the title, but just realized I didn't mention it in the body. This seems to be specific to Windows.

I'm having a hard time writing output using the csv Python module in a script that works with both Python 2.7 and 3.3.

My first try, which works as expected in Python 2.7:

import csv

with open('test.csv', 'wb') as csv_file:
    writer = csv.DictWriter(csv_file, ['header1', 'header2'])
    writer.writeheader()
    for item in items:
        writer.writerow(item)

However, when that same thing is run in Python 3.3 you wind up with:

TypeError: 'str' does not support the buffer interface

So I change 'wb' to 'wt' and it runs, but now I have an extra blank row every other line in the file.

To fix that, I change:

with open('test.csv', 'wt') as csv_file:

to:

with open('test.csv', 'wt', newline='') as csv_file:

But now, it breaks Python 2.7:

TypeError: 'newline' is an invalid keyword argument for this function

I know I could just do something like:

try:
    with open('test.csv', 'wt', newline='') as csv_file:
        writer = csv.DictWriter(csv_file, ['header1', 'header2'])
        writer.writeheader()
        for item in items:
            writer.writerow(item)
except TypeError:
    with open('test.csv', 'wb') as csv_file:
        writer = csv.DictWriter(csv_file, ['header1', 'header2'])
        writer.writeheader()
        for item in items:
            writer.writerow(item)

However, that has some seriously bad duplication.

Does anyone have a cleaner way of doing this?

EDIT: The test data is simple and has no newlines or anything:

items = [{'header1': 'value', 'header2': 'value2'},
         {'header1': 'blah1', 'header2': 'blah2'}]
Tamerz
  • Can't you just use `'w'` instead of `'wb'` or `'wt'`? – nathancahill Apr 24 '15 at 07:29
  • Are the strings in your `items` list `unicode` strings when you're running the script in Python 2? Are the values always ASCII, or could they include extra characters that need to be encoded? Even if you're able to run the same code under both versions of Python, you might not get the same results! – Blckknght Apr 24 '15 at 08:21
  • @Blckknght - I added the test data to the bottom of the question. It is just ASCII text. – Tamerz Apr 24 '15 at 08:32

2 Answers


I've tried a few ways. As far as I can see, simply using 'w' together with the `lineterminator='\n'` argument could be a solution:

with open('test.csv', 'w') as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=['header1', 'header2'], lineterminator='\n')
    # write something
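
A minimal sketch of why `lineterminator='\n'` helps, assuming the blank rows come from Windows text mode expanding each `'\n'` into `'\r\n'` on write. The `Collector` class and `csv_text` helper are hypothetical names used only for illustration, and the `.replace()` calls merely simulate that expansion so the snippet behaves the same on any platform:

```python
import csv


class Collector(object):
    """Hypothetical file-like object that records what csv.writer emits."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)


def csv_text(**writer_kwargs):
    """Return the exact characters csv.writer produces for one row."""
    sink = Collector()
    csv.writer(sink, **writer_kwargs).writerow(['value', 'value2'])
    return ''.join(sink.chunks)


# The default 'excel' dialect ends every row with '\r\n'.
default_output = csv_text()
fixed_output = csv_text(lineterminator='\n')

# Simulate Windows text mode, which turns each '\n' into '\r\n' on write.
on_disk_default = default_output.replace('\n', '\r\n')  # ends in '\r\r\n'
on_disk_fixed = fixed_output.replace('\n', '\r\n')      # ends in '\r\n'
```

With the default terminator the simulated file ends each row with `'\r\r\n'`, which many viewers display as an extra blank row. With `lineterminator='\n'` each row ends with a single `'\r\n'`.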
ljk321
  • If I do that, I still get blank lines every other row. Did you try this in Windows or on something else? – Tamerz Apr 24 '15 at 08:00
  • @Tamerz You get extra new lines because you've got extra new lines in your data... `.strip()` could be what you need. – gboffi Apr 24 '15 at 08:02
  • @Tamerz I tried with some fake data and it turned out good. So I'm thinking there is something wrong with your data, too. – ljk321 Apr 24 '15 at 08:07
  • `w` is the same as `wt` – cdarke Apr 24 '15 at 08:12
  • @skyline75489 - Added the test data to original question. You can see it is nothing but a couple strings in a dict. – Tamerz Apr 24 '15 at 08:14
  • @Tamerz Under what circumstance will this blank lines thing happen? Python 2 or 3? – ljk321 Apr 24 '15 at 08:32
  • I've tried on Mac OS X and Windows 8, with both python2 and python3. All turned out to be good. – ljk321 Apr 24 '15 at 08:35
  • @skyline75489 - The blank lines happen when using Python 2.7. There are a ton of StackOverflow questions about that in particular, such as [this one](http://stackoverflow.com/questions/8746908/why-does-csv-file-contain-a-blank-line-in-between-each-data-line-when-outputting) but not many about how to get around it in both 2.x and 3.x like my question is asking. – Tamerz Apr 24 '15 at 08:42
  • @Tamerz I opened the csv file with Notepad++ and finally saw the blank lines you were talking about. Using `lineterminator` seems to fix this, as in my new edit. – ljk321 Apr 24 '15 at 08:48
  • @skyline75489 - That did the trick for sure. I completely missed that keyword argument even existing. Thank you. – Tamerz Apr 24 '15 at 08:58
  • @Tamerz, See [this answer](http://stackoverflow.com/a/3348664/235698) for why you get duplicate lines. – Mark Tolonen Apr 24 '15 at 09:28

Here's a simpler generic way:

import csv
import sys

# Index 0 instead of .major: version_info is not a named tuple on 2.6
if sys.version_info[0] == 2:
    access = 'wb'
    kwargs = {}
else:
    access = 'wt'
    kwargs = {'newline':''}

with open('test.csv', access, **kwargs) as csv_file:
    writer = csv.DictWriter(csv_file, ['header1', 'header2'])
    writer.writeheader()
    for item in items:
        writer.writerow(item)

The principle here is not to fight the differences between Python 2 and 3 but to use conditional code. You can only go so far in writing code without this kind of test; sooner or later you will have to check the Python version.
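
The same version check can also be wrapped in a function so that no module-level `access` and `kwargs` variables are needed. This is only a sketch: `open_csv_for_writing` and `write_items` are hypothetical names, not part of the `csv` module, and the temporary directory is used just to keep the example self-contained:

```python
import csv
import os
import sys
import tempfile


def open_csv_for_writing(path):
    """Hypothetical helper: open `path` the way the csv module expects."""
    if sys.version_info[0] == 2:
        # Python 2: csv wants a binary file.
        return open(path, 'wb')
    # Python 3: csv wants a text file with newline translation disabled.
    return open(path, 'w', newline='')


def write_items(path, fieldnames, items):
    """Write a header row followed by one row per dict in `items`."""
    with open_csv_for_writing(path) as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames)
        writer.writeheader()
        writer.writerows(items)


items = [{'header1': 'value', 'header2': 'value2'},
         {'header1': 'blah1', 'header2': 'blah2'}]

path = os.path.join(tempfile.mkdtemp(), 'test.csv')
write_items(path, ['header1', 'header2'], items)

# Read the raw bytes back to inspect the line endings actually written.
with open(path, 'rb') as check_file:
    raw = check_file.read()
```

Reading the file back in binary mode shows the line endings directly, so a stray `\r\r\n` would be visible in `raw`.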

cdarke
  • I thought maybe getting `**kwargs` involved may be a good solution. It still is not pretty but is significantly better than all the duplication I had. This will absolutely work in my scenario. Thank you. – Tamerz Apr 24 '15 at 08:38
  • I went with the answer that @skyline75489 gave but I still like this for future use. There are times I've needed to do exactly this but didn't know the best way. – Tamerz Apr 24 '15 at 08:59
  • @Tamerz: Check out [my answer](http://stackoverflow.com/a/41913382/355230) to a similar question. It works in both versions of Python and handles opening files for both reading and writing (plus, like `open()`, it defaults to read mode if one wasn't explicitly specified). It also doesn't require the use of global variables. – martineau Jan 28 '17 at 19:13