
I want to parse a CSV file that has 4 items per line, separated by commas. For example:

1, "2,3", 4, 5

How can I split it into:

[1,"2,3",4,5]

I tried to use csv.reader, but the result is still wrong. Can anyone help? Thanks!

Ashwini Chaudhary
user2171526
  • How did you try to use `csv.reader`, and how was the result wrong? – Ry- Apr 20 '13 at 18:25
  • See http://stackoverflow.com/questions/2785755/how-to-split-but-ignore-separators-in-quoted-strings-in-python?rq=1 – jarmod Apr 20 '13 at 18:33

2 Answers

2

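csv.reader does not do type conversion, but passing skipinitialspace=True makes it split the line correctly. Without it, the space after each comma keeps the reader from recognizing the opening quote, so the quoted field is split at its inner comma: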

In [1]: import csv

In [2]: data = ['1, "2,3", 4, 5']

In [3]: next(csv.reader(data, skipinitialspace=True))
Out[3]: ['1', '2,3', '4', '5']
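
To make the effect of `skipinitialspace` concrete, here is a minimal sketch that parses the same line twice, once with the default dialect and once with the flag set. The `io.StringIO` object is only a stand-in for an open file (an assumption for the sake of a self-contained example); with a real file you would pass the file object opened with `newline=""`.

```python
import csv
import io

# Stand-in for an open CSV file containing the line from the question.
sample = io.StringIO('1, "2,3", 4, 5\n')

# Default dialect: the space before the quote makes the quote a literal
# character, so the field is split at the comma inside it.
default_row = next(csv.reader(sample))

# Rewind and parse again, this time skipping whitespace after each delimiter.
sample.seek(0)
fixed_row = next(csv.reader(sample, skipinitialspace=True))

print(default_row)  # ['1', ' "2', '3"', ' 4', ' 5']
print(fixed_row)    # ['1', '2,3', '4', '5']
```

Every field is still a string in both cases; only the splitting changes.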
root
  • If the OP does want the "natural" type conversion, either `list(ast.literal_eval(line))` or `json.loads('[{}]'.format(line))` should work. – DSM Apr 20 '13 at 18:52
  • @DSM - You are right. At the moment I answered more on a hunch that OP forgot the quotes :) – root Apr 20 '13 at 18:55
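
The two conversions suggested in the comment above can be sketched as follows. The third variant, which keeps `csv.reader` and converts digit-only fields afterwards, is not from the thread; the helper name `convert` is hypothetical and only there for illustration.

```python
import ast
import csv
import json

line = '1, "2,3", 4, 5'

# Option 1: evaluate the line as a Python tuple literal.
via_ast = list(ast.literal_eval(line))

# Option 2: wrap the line in brackets so it becomes a JSON array.
via_json = json.loads('[{}]'.format(line))

# Option 3 (assumption, not from the thread): split with csv.reader,
# then turn digit-only fields into ints.
def convert(field):
    return int(field) if field.isdigit() else field

via_csv = [convert(f) for f in next(csv.reader([line], skipinitialspace=True))]

print(via_ast)   # [1, '2,3', 4, 5]
print(via_json)  # [1, '2,3', 4, 5]
print(via_csv)   # [1, '2,3', 4, 5]
```

The first two options only work when each line happens to be valid Python or JSON syntax, and `list(ast.literal_eval(line))` needs at least two fields so that the line evaluates to a tuple. The third cannot tell a quoted number such as "23" from an unquoted one, because `csv.reader` drops the quotes before the conversion runs.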
0
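"""
[xxx.csv]
1, "2,3", 4, 5
"""

with open("xxx.csv") as f:
    line = f.readline()  # line = '1, "2,3", 4, 5\n'

in_quotes = False  # True while between an opening and a closing "
token = ""         # characters collected for the current field
answer = []
for ch in line:
    if ch == '"':
        if in_quotes:
            # closing quote: the quoted field is complete
            answer.append(token)
            token = ""
        in_quotes = not in_quotes
    elif in_quotes:
        # inside quotes every character, including commas, is kept
        token += ch
    elif ch.isdigit():
        token += ch
    elif token:
        # a separator ends an unquoted number
        answer.append(int(token))
        token = ""
if token and not in_quotes:
    # the last number when the line has no trailing newline
    answer.append(int(token))

print(answer)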

This is a simple example. It is written only for input shaped like your example (1, "2,3", 4, 5), so other input may not be handled correctly. I hope it helps.

Park