You can use shlex.split, which is handy for parsing quoted strings:
>>> import shlex
>>> text = 'This is "a simple" test'
>>> shlex.split(text, posix=False)
['This', 'is', '"a simple"', 'test']
Non-POSIX mode keeps the inner quotes in the split result. posix is set to True by default, which removes them:
>>> shlex.split(text)
['This', 'is', 'a simple', 'test']
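To make the two modes easier to compare, here is a minimal sketch that wraps shlex.split in a small helper. The name split_quoted and its keep_quotes flag are hypothetical, not part of shlex, and returning None for an unterminated quote is just one possible choice (shlex itself raises ValueError in that case).

```python
import shlex

def split_quoted(text, keep_quotes=False):
    """Split on whitespace, treating quoted runs as single tokens.

    Hypothetical helper for illustration; not part of shlex.
    """
    try:
        # posix=False leaves the quote characters inside the tokens.
        return shlex.split(text, posix=not keep_quotes)
    except ValueError:
        # shlex raises ValueError when a quote is never closed.
        return None

print(split_quoted('This is "a simple" test'))
# ['This', 'is', 'a simple', 'test']
print(split_quoted('This is "a simple" test', keep_quotes=True))
# ['This', 'is', '"a simple"', 'test']
print(split_quoted('This is "a broken test'))
# None
```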
If you have multiple lines of this type of text, or you're reading from a stream, you can split them efficiently (excluding the quotes from the output) using csv.reader:
import csv
import io

s = io.StringIO(text)  # in-memory text stream
f = csv.reader(s, delimiter=' ', quotechar='"')
print(list(f))
# [['This', 'is', 'a simple', 'test']]
On Python 2, pass text.decode('utf8') to io.StringIO instead, since it only accepts unicode. On Python 3 no decoding is needed, as all strings are already unicode.
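As a rough illustration of the multi-line case, the sketch below feeds two lines through the same csv.reader setup (Python 3). The sample input in lines is made up for the example; each input line comes back as its own list of tokens.

```python
import csv
import io

# Made-up sample input: two lines, each with one quoted phrase.
lines = 'This is "a simple" test\nAnother "quoted phrase" here\n'

# csv.reader consumes the stream line by line and splits on single spaces.
reader = csv.reader(io.StringIO(lines), delimiter=' ', quotechar='"')
rows = list(reader)

print(rows)
# [['This', 'is', 'a simple', 'test'], ['Another', 'quoted phrase', 'here']]
```

The same reader works on an open file object in place of io.StringIO, so large inputs do not have to be loaded into memory at once.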