On my Windows box, I used to write CSV files in Python 2 like this:

import csv
f = open("out.csv","wb")
cr = csv.writer(f,delimiter=';')
cr.writerow(["a","b","c"])
f.close()

Python 3 no longer lets you write text to a file opened in binary mode, so that piece of code does not work anymore. This does:

import csv
f = open("out.csv","w",newline='')
cr = csv.writer(f,delimiter=';')
cr.writerow(["a","b","c"])
f.close()

The problem is that Python 2 does not know the newline parameter.

Of course, omitting newline produces a CSV file with an extra \r at the end of every line, which is not acceptable.

I'm currently migrating progressively from Python 2 to Python 3.5 while keeping the code backwards compatible. There are a lot of those statements in all my modules.

My solution was to wrap the code in a custom module that returns the file handle and the writer object. The Python version check is done inside that module, so any module using it works under either Python version without too much hacking.
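
For illustration, such a wrapper might look roughly like the sketch below. The name open_csv_writer and its signature are hypothetical (they are not taken from the actual module), and the caller remains responsible for closing the file.

```python
import csv
import sys


def open_csv_writer(path, **writer_kwargs):
    """Return (file_handle, csv_writer) for `path` (hypothetical helper)."""
    if sys.version_info[0] == 2:
        # Python 2: the csv module expects a file opened in binary mode.
        f = open(path, "wb")
    else:
        # Python 3: text mode, with newline translation disabled.
        f = open(path, "w", newline="")
    return f, csv.writer(f, **writer_kwargs)


f, cr = open_csv_writer("out.csv", delimiter=";")
try:
    cr.writerow(["a", "b", "c"])
finally:
    f.close()
```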

Is there a better way?

Jean-François Fabre
    I can't help but wonder what @Raymond Hettinger — Python core developer and creator of the csv module — would suggest... – martineau Oct 06 '16 at 12:46

2 Answers

On Windows, I found a way that works in both Python 2 and 3: change the csv lineterminator option. It defaults to "\r\n", which yields one \r too many when the file is opened in text mode on Windows.

import csv

with open("out.csv","w") as f:
    cr = csv.writer(f,delimiter=";",lineterminator="\n")
    cr.writerow(["a","b","c"])
    cr.writerow(["d","e","f"])
    cr.writerow(["a","b","c"])
    cr.writerow(["d","e","f"])

Whatever the Python version, this creates a CSV file without the infamous "blank lines".

The only drawback is that on Linux this method produces \r-free files, which may not be standard (although the files still open properly in Excel, with no blank lines and with the rows intact :))

The problem persists on 3.6.2 (I just checked, as I should have done some time ago).
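
The extra \r can be reproduced on any platform with a small Python 3 sketch. It assumes that opening the file with newline="\r\n" is a fair stand-in for Windows text mode, since both translate every "\n" written into "\r\n". The helper name written_bytes is made up for this example.

```python
import csv
import os
import tempfile


def written_bytes(newline, **writer_kwargs):
    """Write one row through a text-mode file and return the raw bytes on disk."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        # newline="\r\n" emulates Windows text mode (assumption of this sketch);
        # newline="" turns newline translation off entirely.
        with open(path, "w", newline=newline) as f:
            csv.writer(f, delimiter=";", **writer_kwargs).writerow(["a", "b", "c"])
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# Default lineterminator ("\r\n") plus translation: one \r too many.
default_terminator = written_bytes("\r\n")
# lineterminator="\n" plus translation: a single \r\n.
lf_terminator = written_bytes("\r\n", lineterminator="\n")
# lineterminator="\n" without translation: \r-free output.
untranslated = written_bytes("", lineterminator="\n")

print(repr(default_terminator))
print(repr(lf_terminator))
print(repr(untranslated))
```

Inspecting the raw bytes this way avoids relying on an editor that may silently convert line endings.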

An alternative is to build the open() arguments as a dictionary:

write_args = {"mode":"wb"} if bytes is str else {"mode":"w","newline":""}

(Comparing bytes to str is one of the many ways to tell Python 2 from Python 3: in Python 3 the two types are different, which is closely related to the problem at hand, by the way.)

Now we can pass those arguments with argument unpacking:

with open("out.csv",**write_args) as f:
    cr = csv.writer(f,delimiter=";")
    cr.writerow(["a","b","c"])
Jean-François Fabre
  • Interesting, because the lines produced are still terminated by `"\r\n"` in the output file (on Windows in both Python 2 & 3). – martineau Oct 06 '16 at 12:38
  • of course, since when you write `\n` in text mode on windows it writes `\r\n`. BUT default line terminator is `\r\n` for csv (which is probably an issue) so writing this on windows does `\r\r\n` (file does not check if there's already a `\r`!). Yes you did well in pinging Raymond Hettinger because I'm sure the lineterminator thing would need an update in csv python 3 module. My vision: changing it to `\n` _only for windows platform_ would fix everything. – Jean-François Fabre Oct 06 '16 at 12:47
  • I just tried it (on Windows) in both Python 2 & 3, with and without a `lineterminator="\n"`, and there was never a `\r\r\n` produced — so I'm not sure I understand the issue to which you refer. – martineau Oct 06 '16 at 12:53
  • @martineau: what did you witness? without lineterminator, in both python 2 and 3 it should produce corrupt files (with 1 blank line after every data line). – Jean-François Fabre Oct 06 '16 at 12:57
  • In all cases each line was terminated with `\r\n`. I also changed the first `writerow` to `cr.writerow(["a","b\nx","c"])` and it changed the embedded newline too (and put quotes around the string: i.e. `a;"b\r\nx";c` was written to the file). – martineau Oct 06 '16 at 13:03
  • Just tried it and got the double `\r\r` chars. Maybe they have fixed it in python 3.5 or 3.6 or later 2.7.x versions. I only tested with 3.4 and 2.7.8. Which versions are you using? – Jean-François Fabre Oct 06 '16 at 13:09
  • I'm using versions 2.7.12 and 3.5.2. – martineau Oct 06 '16 at 13:14
  • No, this is not "fixed". I just tried in 2.7.13 and 3.6.2 on Windows, and both of them produce the `\r\r\n` line endings unless `lineterminator='\n'` is specified. Besides empirical observations, I don't think this is the kind of thing that the Python devs would have changed in Python 2.7 anyway, because it would alter functionality (breaking existing code that depends on this behavior) without fixing any security problems. – John Y Nov 17 '17 at 19:46
  • @JohnY you're right, the behaviour is the same on 3.6.2. Editing. So martineau must have made a bad test. – Jean-François Fabre Nov 17 '17 at 20:18
  • Jean-François Fabre: There was no "bad test". I just tried it again on Windows, only this time with Python 2.7.14 and 3.6.3, and in both cases the lines in `out.csv` all end with `'\r\n'`. – martineau Nov 20 '17 at 03:58
  • @martineau another user commented about that too, so there's at least 2 of us. Did you downvote my answer for that particular reason? that wouldn't be very nice. I suggest we talk about that first, e.g. in a chatroom. – Jean-François Fabre Nov 20 '17 at 10:29
  • Jean-François Fabre: Yes, I down-voted your answer because on Windows it still didn't seem to work, but after further investigation I now see that my text/hex editor was interfering by doing automatic conversions from `'\n'` to `'\r\n'`. Sorry, however I can't undo it unless you make some change to your answer, so I suggest you do something trivial to it to lift the restriction. – martineau Nov 20 '17 at 17:03
  • @martineau edited to remove the struck-through text. You can undo now. – Jean-François Fabre Nov 20 '17 at 17:06

For both reading and writing CSV files, I've found no better way either. However, I would encapsulate the logic in a separate function as shown below. The advantage is that it all lives in one place instead of being duplicated wherever it's needed.

import csv
import sys

def open_csv(filename, mode='r'):
    """Open a csv file in proper mode depending on Python version."""
    return (open(filename, mode=mode+'b') if sys.version_info[0] == 2 else
            open(filename, mode=mode, newline=''))

with open_csv('out.csv', 'w') as f:
    writer = csv.writer(f, delimiter=';')
    writer.writerow([1, 2, 3])
    writer.writerow(['a', 'b', 'c'])

The open_csv() utility could be simplified slightly by detecting the Python version with the technique shown in @Jean-François Fabre's Dec 8, 2020 update to his answer:

def open_csv(filename, mode='r'):
    """Open a csv file in proper mode depending on Python version."""
    return (open(filename, mode=mode+'b') if bytes is str else
            open(filename, mode=mode, newline=''))
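
Since the same helper serves for reading, a round trip is a quick sanity check. The sketch below repeats open_csv so that it runs on its own. The temporary file name and the sample rows are arbitrary, and one field contains an embedded newline to show that it survives.

```python
import csv
import os
import shutil
import tempfile


def open_csv(filename, mode='r'):
    """Open a csv file in proper mode depending on Python version."""
    if bytes is str:  # Python 2
        return open(filename, mode=mode + 'b')
    return open(filename, mode=mode, newline='')


tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'roundtrip.csv')
rows = [['1', '2', '3'], ['a', 'b\nx', 'c']]

try:
    # Write the rows, then read them back through the same helper.
    with open_csv(path, 'w') as f:
        csv.writer(f, delimiter=';').writerows(rows)

    with open_csv(path, 'r') as f:
        read_back = list(csv.reader(f, delimiter=';'))

    # Look at the raw bytes to see the actual line endings on disk.
    with open(path, 'rb') as f:
        raw = f.read()
finally:
    shutil.rmtree(tmp_dir)

print(read_back == rows)
print(repr(raw))
```

Each row ends with a single \r\n (the csv default lineterminator) regardless of platform, while the newline inside the quoted field is written unchanged.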
martineau