I have an input csv file with a variable number of columns that I'm trying to read into a list. My test script parses the file, but the resulting lists contain extra elements between the csv columns. I want a list that contains only the csv fields, and I'm getting empty strings as well. Which options to the csv reader am I missing?

Example output:

$ python cond.py
opening conditions file  conditions.lst
parser  0  input line:
"string1:", "string1b,string1c,"
output list elements:
['string1:']
['', '']
['']
['string1b,string1c,']
[]

parser  1  input line:
"stringa:", "stringb,stringc,"
output list elements:
['stringa:']
['', '']
['']
['stringb,stringc,']
[]

parser  2  input line:
"string3:", "string3next=abc", "string3b","string3c:", "string3d"
output list elements:
['string3:']
['', '']
['']
['string3next=abc']
['', '']
['']
['string3b']
['', '']
['string3c:']
['', '']
['']
['string3d']
[]

Input file:

$ cat conditions.lst
"string1:", "string1b,string1c,"
"stringa:", "stringb,stringc,"
"string3:", "string3next=abc", "string3b","string3c:", "string3d"

Python cond.py file:

$ cat cond.py

from __future__ import print_function
#from csv import reader

import re
import sys
import csv

# variables

conditionsFile = "conditions.lst"
parserConditions = []
numOfParsers = 0


print("opening conditions file ", conditionsFile)
with open(conditionsFile, "r") as cf:
  for line in cf:
    print("parser ", numOfParsers, " input line:")
    print(line.strip())

    r = csv.reader(line, quotechar='"', delimiter=',', quoting=csv.QUOTE_ALL, skipinitialspace=True)
    print("output list elements:")
    for cline in  r:
      print(cline)

    numOfParsers = numOfParsers + 1
    print("")

  print("total number of parsers: ", numOfParsers)

Update: With help from @Jean-FrançoisFabre I haven't found the root cause, but I have a workaround: I collect the csv elements into a list, then remove the blank elements.

conditions = []
for cline in r:
  conditions.extend(cline)

# filter(None, ...) drops empty strings; list() keeps this working on Python 3 too
conditions = list(filter(None, conditions))
print(conditions)
  • because you've written your csv with "w" mode not "wb" – Jean-François Fabre Dec 05 '18 at 20:23
  • Possible duplicate of [CSV file written with Python has blank lines between each row](https://stackoverflow.com/questions/3348460/csv-file-written-with-python-has-blank-lines-between-each-row) – Jean-François Fabre Dec 05 '18 at 20:24
  • @Jean-FrançoisFabre thanks - that's a valuable hint. The csv is generated by another Linux process, is there a way I should be reading it in my python script? – Allen Pomeroy Dec 05 '18 at 20:28
  • well, you could just ignore the blank rows. – Jean-François Fabre Dec 05 '18 at 20:28
  • @Jean-FrançoisFabre .. immediately I can do that, just add a check for elements that are blank and skip them, but since the input file only has \n separating the lines, I'm still confused at the csv reader behavior (did an od -c on the input file) – Allen Pomeroy Dec 05 '18 at 20:36

1 Answer

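csv.reader expects a file-like object (or any other iterable that yields one line at a time), not a single string. When you pass it a string, it iterates over the characters of that string and treats each character as a separate line, which is where the empty and fragmented rows come from. Pass the open file to the reader instead of looping over the lines yourself. You just need: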

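from __future__ import print_function
import csv

# Hand the open file itself to csv.reader instead of each line.
# 'rb' is the Python 2 mode; on Python 3 use open('conditions.lst', newline='').
with open('conditions.lst','rb') as cf:
    r = csv.reader(cf,skipinitialspace=True)
    for line in r:
        print(line)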

Output:

['string1:', 'string1b,string1c,']
['stringa:', 'stringb,stringc,']
['string3:', 'string3next=abc', 'string3b', 'string3c:', 'string3d']
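
To see the difference in isolation, here is a minimal sketch that assumes Python 3. It feeds the same line to csv.reader twice, once as a bare string and once wrapped in a list, and then parses the whole sample. The helper name read_rows is made up for illustration, and io.StringIO only stands in for the real file so the snippet runs without conditions.lst on disk.

```python
import csv
import io

line = '"string1:", "string1b,string1c,"\n'

# A str is an iterable of characters, so csv.reader treats every
# character as its own "line" and returns several fragmentary rows.
char_rows = list(csv.reader(line, skipinitialspace=True))

# Wrapping the line in a list (or passing the file object) gives the
# reader one complete line at a time, so it returns a single row.
line_rows = list(csv.reader([line], skipinitialspace=True))

print(len(char_rows), "rows when passing the string")
print(len(line_rows), "row when passing a list of lines:", line_rows[0])


def read_rows(lines):
    """Hypothetical helper: parse all rows from a file object or any iterable of lines."""
    return list(csv.reader(lines, skipinitialspace=True))


# io.StringIO stands in for: open("conditions.lst", newline="")
sample = io.StringIO(
    '"string1:", "string1b,string1c,"\n'
    '"stringa:", "stringb,stringc,"\n'
    '"string3:", "string3next=abc", "string3b","string3c:", "string3d"\n'
)
all_rows = read_rows(sample)
for row in all_rows:
    print(row)
```

With a real file on Python 3, the equivalent call would be read_rows(open("conditions.lst", newline="")) inside a with block; the csv documentation recommends newline="" so the reader handles line endings itself.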
Mark Tolonen