1

I am new to Python and have a string-splitting problem I need help with.

input = "\"filename.txt\", 1234,8973,\"Some Description \""

The input contains both strings and numbers, and there may be cases where leading and trailing spaces exist.

The expected output is:

['filename.txt', '1234', '8973', 'Some Description']

Can split do the job, or do I need regular expressions?

Dennis

7 Answers

8

Use the csv module to handle input like that; it handles quoting, can be taught about leading whitespace, and trailing whitespace can be removed afterwards:

import csv

reader = csv.reader(inputstring.splitlines(), skipinitialspace=True)
row = next(reader)  # get just the first row
res = [c.strip() for c in row]

Demo:

>>> import csv
>>> inputstring = '"filename.txt", 1234,8973,"Some Description "'
>>> reader = csv.reader(inputstring.splitlines(), skipinitialspace=True)
>>> row = next(reader)
>>> [c.strip() for c in row]
['filename.txt', '1234', '8973', 'Some Description']

This has the added advantage that you can have commas in the values, provided they are quoted:

>>> with_commas = '"Hello, world!", "One for the money, two for the show"'
>>> reader = csv.reader(with_commas.splitlines(), skipinitialspace=True)
>>> [c.strip() for c in next(reader)]
['Hello, world!', 'One for the money, two for the show']

The csv.reader() object takes an iterable as the first argument; I used the str.splitlines() method to turn a (potentially multiline) string into a list; you could also just use [inputstring] if your input string is always just one line.
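A minimal sketch of the single-line variant mentioned above, wrapped in a helper. The function name `parse_line` is hypothetical and not part of the csv module; it assumes the input is exactly one line.

```python
import csv

def parse_line(line):
    """Parse one comma-separated line, honoring double-quoted fields."""
    # csv.reader accepts any iterable of strings; a one-element list is enough
    reader = csv.reader([line], skipinitialspace=True)
    row = next(reader)
    # skipinitialspace only drops whitespace right after a delimiter,
    # so strip each field to remove whatever whitespace remains
    return [field.strip() for field in row]

print(parse_line('"filename.txt", 1234,8973,"Some Description "'))
```

The fields all come back as strings; converting the numeric ones with int() would be a separate step.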

Martijn Pieters
2
>>> [s.strip(' "') for s in input.split(',')]
['filename.txt', '1234', '8973', 'Some Description']

If it's guaranteed that no commas appear within your quoted parts, this will do.
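To make that condition concrete, here is a small sketch showing what happens when a quoted part does contain a comma. The helper name `split_simple` and the second test string are made up for illustration.

```python
def split_simple(line):
    # Split on every comma, then trim spaces and double quotes from both ends
    return [s.strip(' "') for s in line.split(',')]

# Works when no comma appears inside the quotes
print(split_simple('"filename.txt", 1234,8973,"Some Description "'))

# A comma inside the quotes is treated as a separator too,
# so the quoted value is cut into two items
print(split_simple('"Hello, world!", 42'))
```

The second call returns three items instead of two, which is the case where a quote-aware parser is needed.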

deceze
1

You can use Python's map and lambda for this:

In [9]: input = "\"filename.txt\", 1234,8973,\"Some Description \""
In [11]: input = map(lambda x: x.strip(), input.split(','))
In [14]: input = map(lambda x: x.strip('"'), input)
In [16]: input = list(map(lambda x: x.strip(), input))  # list() is needed on Python 3, where map is lazy
In [17]: input
Out[17]: ['filename.txt', '1234', '8973', 'Some Description']
Anish Shah
0

You could do something like this:

li=input.split(",")

This gives you:

li=['"filename.txt"', ' 1234', '8973', '"Some Description "']

Then use lstrip() and rstrip() as needed (Python has no ltrim or rtrim); see "How to trim whitespace (including tabs)?" for more details.
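A possible way to finish the cleanup after the split, as a sketch only. The variable names `line` and `cleaned` are illustrative, and it assumes no commas occur inside the quoted parts.

```python
line = '"filename.txt", 1234,8973,"Some Description "'

cleaned = []
for part in line.split(","):
    part = part.lstrip().rstrip()  # trim both sides; same as part.strip()
    part = part.strip('"')         # remove the surrounding double quotes
    part = part.strip()            # trim whitespace that was inside the quotes
    cleaned.append(part)

print(cleaned)
```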

therealprashant
0

Use the re module.

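Try the following code (list() is needed on Python 3, where filter returns an iterator):

import re
list(filter(lambda x: x != '', re.split(",| |\"", input)))

The output will be as follows. Note that because the pattern also splits on spaces, 'Some Description' ends up as two items:

['filename.txt', '1234', '8973', 'Some', 'Description']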
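If the description has to stay in one piece, one alternative is to split only on commas and strip the quotes afterwards. This is a sketch of a different pattern, not the code from this answer, and it still assumes no commas inside the quoted parts.

```python
import re

line = '"filename.txt", 1234,8973,"Some Description "'

# Split on commas only, swallowing any whitespace around each comma
fields = re.split(r'\s*,\s*', line.strip())

# Remove the surrounding quotes, then any whitespace left inside them
fields = [f.strip('"').strip() for f in fields]

print(fields)
```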
abcdef
0
list(map(str.strip, input.replace('"','').split(',')))
aadarshsg
0

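You can simply use the eval() function, but only on trusted input, because eval() executes arbitrary code. Note that the numbers come back as ints and the trailing space is kept: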

>>> input = "\"filename.txt\", 1234,8973,\"Some Description \""
>>> list(eval(input))
['filename.txt', 1234, 8973, 'Some Description ']
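A safer variant of the same idea is ast.literal_eval() from the standard library, which only accepts Python literals and does not run code. The sketch below also converts every value back to a stripped string to match the expected output; it assumes the line happens to be valid Python literal syntax, and anything else (an unquoted word, for example) raises an error.

```python
import ast

line = '"filename.txt", 1234,8973,"Some Description "'

# The comma-separated literals parse as a tuple of str and int values
values = ast.literal_eval(line)

# Convert everything back to text and trim the stray whitespace
result = [str(v).strip() for v in values]

print(result)
```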