
Let's say I have a string like this:

ab, "cd
ef", gh, "ijk, lm" 

and another like this:

a,b,c
d,e,f

I want to parse both with Python's csv module. How can I do it? The second string should be read as two rows, but the first should be read as a single row, because its newline sits inside a quoted field.

I assumed the data had to be passed to csv.reader(), so my first thought was to split on commas with .split(','). That breaks on the second string, because it ignores the newline between rows. I also considered .splitlines(), but that breaks the first string, which has a newline inside a quoted field.

I've been trying to solve this for a whole day and I'm out of ideas. Any help?

MrSolid51

1 Answer


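The issue is that the first example has a space after each comma, so its actual delimiter is ', ' rather than ','. With the default settings, a quote that follows that space is not at the start of the field, so it is not treated as opening a quoted field.

Luckily, you are not the first to hit this. Pass skipinitialspace=True to csv.reader() to solve it.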
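To make the effect of the space concrete, here is a minimal sketch that parses the first sample twice, once with the defaults and once with skipinitialspace=True. It uses io.StringIO as a stand-in for the file so it can run on its own; the variable names are only illustrative.

```python
import csv
import io

# The question's first sample; io.StringIO stands in for file1.csv here.
text = 'ab, "cd\nef", gh, "ijk, lm"\n'

# Default settings: the quote comes after a space, so it is not at the start
# of the field. It is kept as ordinary data and the newline ends the record.
default_rows = list(csv.reader(io.StringIO(text)))

# skipinitialspace=True drops the space after each delimiter, so the quote
# opens a quoted field and the embedded newline and comma are preserved.
fixed_rows = list(csv.reader(io.StringIO(text), skipinitialspace=True))

print(default_rows)  # two broken rows with stray quotes and spaces
print(fixed_rows)    # one row of four fields
```

With the defaults the sample comes back as two rows with the quote characters left in the data, which is the symptom described in the question.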

Given:

$ cat file1.csv
ab, "cd
ef", gh, "ijk, lm"

And:

$ cat file2.csv
a,b,c
d,e,f

You can do:

import csv

with open('file1.csv', 'r', newline='') as f:
    # skipinitialspace=True ignores the space that follows each comma
    for row in csv.reader(f, quotechar='"', skipinitialspace=True):
        print(f"len: {len(row)}, row: {row}")

Prints:

len: 4, row: ['ab', 'cd\nef', 'gh', 'ijk, lm']

The same settings work for the second example, which uses a plain , delimiter with no space after it:

with open('file2.csv', 'r', newline='') as f:
    for row in csv.reader(f, quotechar='"', skipinitialspace=True):
        print(f"len: {len(row)}, row: {row}")

Prints:

len: 3, row: ['a', 'b', 'c']
len: 3, row: ['d', 'e', 'f']
dawg
  • Thank you, this helped. However, what if the input is a raw string instead of a csv file? – MrSolid51 Sep 22 '20 at 21:20
  • You can use [io.StringIO](https://docs.python.org/3/library/io.html#io.StringIO) (Python 3) or [StringIO](https://docs.python.org/2/library/stringio.html#module-StringIO) (Python 2) to let the csv module treat a string as a file. – dawg Sep 22 '20 at 23:55
  • [Here](https://stackoverflow.com/a/9157370/298607) is Tim Pietzcker's excellent example of using these libraries with the csv module. – dawg Sep 23 '20 at 01:47
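For the raw-string case raised in the comments, a minimal Python 3 sketch might look like the following. The helper name parse_csv_string is hypothetical (it is not part of the csv module), and the two inputs are the samples from the question.

```python
import csv
import io

def parse_csv_string(text):
    """Parse CSV text held in a str (hypothetical helper, Python 3 only)."""
    # newline='' leaves line endings untouched so the csv module handles them.
    buffer = io.StringIO(text, newline='')
    return list(csv.reader(buffer, skipinitialspace=True))

# The two samples from the question, as plain strings.
first = parse_csv_string('ab, "cd\nef", gh, "ijk, lm"\n')
second = parse_csv_string('a,b,c\nd,e,f\n')

print(first)   # one row; the quoted newline and comma stay inside their fields
print(second)  # two rows of three fields each
```

Wrapping the string in io.StringIO gives csv.reader the line-by-line iterable it expects, so no manual .split(',') or .splitlines() is needed.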