
I need to split a multi-line string into a list of lists, one sublist per line. The input is as follows:

data = '''00402,
"0042 01,5",5
0042 02,3
"0042 02,5",1
"0042 05,5",4
"0042 05,5X05,5",7'''

The expected output is as follows:

['00402'],['0042 01,5', '5'],['0042 02', '3'],['0042 02,5', '1'],['0042 05,5', '4'],['0042 05,5X05,5', '7']

Here is what I have tried so far:

import re

# Drop empty lines, then split each remaining line on ; , * or tab
temp_lines = filter(lambda x: x != '', data.split('\n'))
lines = []
for line in temp_lines:
    lines.append(re.split(r';|,|\*|\t', line.replace("\r", "")))

print(lines)

This does not produce the required output. How can I get the expected result?

Sнаđошƒаӽ
Jothimani

1 Answer


The csv module can help you here:

>>> import csv
>>> data = '''00402,
... "0042 01,5",5
... 0042 02,3
... "0042 02,5",1
... "0042 05,5",4
... "0042 05,5X05,5",7'''
>>> result = list(csv.reader(data.splitlines()))
>>> result
[['00402', ''], ['0042 01,5', '5'], ['0042 02', '3'], ['0042 02,5', '1'], ['0042 05,5', '4'], ['0042 05,5X05,5', '7']]

The only problem is the empty string in the first sublist. The first line of data ends with a trailing comma, unlike the other lines, so csv.reader sees an empty second field there. If empty fields are bothering you, filter them out:

>>> [[x for x in sub if x] for sub in result]
[['00402'], ['0042 01,5', '5'], ['0042 02', '3'], ['0042 02,5', '1'], ['0042 05,5', '4'], ['0042 05,5X05,5', '7']]
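
For use outside the interactive prompt, the two steps could be wrapped in a small helper along these lines. This is only a sketch: the function name `split_quoted_lines` is made up for illustration, and skipping blank lines is an extra assumption that is not needed for the data in the question.

```python
import csv


def split_quoted_lines(text):
    """Split each line of text on commas, keeping quoted fields intact.

    Hypothetical helper: empty fields are dropped, as in the list
    comprehension above, and blank lines are skipped (an assumption).
    """
    # csv.reader accepts any iterable of strings, one string per row,
    # and treats commas inside double quotes as part of the field.
    rows = csv.reader(text.splitlines())
    # csv.reader yields an empty list for a blank line; "if row" skips it.
    return [[field for field in row if field] for row in rows if row]


if __name__ == "__main__":
    data = '''00402,
"0042 01,5",5
0042 02,3'''
    print(split_quoted_lines(data))
```

Note that this drops every empty field, not only a trailing one, so it is not suitable if an empty value in the middle of a line carries meaning.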
timgeb