-1

I want to split a string into a list, where a file name containing spaces is treated as a single item. Example:

s = 'cmd -a -b -c "file with spaces.mp4" -e -f'.split()
print(s)

output:

['cmd', '-a', '-b', '-c', '"file', 'with', 'spaces.mp4"', '-e', '-f']

desired output:

['cmd', '-a', '-b', '-c', '"file with spaces.mp4"', '-e', '-f']

I tried using some for loops, but it gets nasty. Is there a decent way to do this, using regex or anything else, that doesn't look ugly?

Mahmoud Elshahat

4 Answers

4

Actually, in this case I wouldn't use regex. This is what shlex.split() is for:

import shlex

s = shlex.split('cmd -a -b -c "file with spaces.mp4" -e -f')
print(s)

Prints:

['cmd', '-a', '-b', '-c', 'file with spaces.mp4', '-e', '-f']
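
As a small sketch of why shlex fits here: in its default mode it also handles other shell-style quoting, such as single quotes and backslash-escaped spaces. The command strings below are made up for illustration and are not from the question.

```python
import shlex

# Hypothetical command lines, each quoting the file name a different way.
double_quoted = 'cmd -c "file with spaces.mp4" -e'
single_quoted = "cmd -c 'file with spaces.mp4' -e"
escaped = r'cmd -c file\ with\ spaces.mp4 -e'

# shlex.split() in its default (POSIX) mode removes the quoting
# and keeps the file name as one item in every case.
results = [shlex.split(line) for line in (double_quoted, single_quoted, escaped)]

for tokens in results:
    print(tokens)
```

All three inputs should split into the same four items, with the file name kept whole.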
Andrej Kesely
4

Try shlex

import shlex

data = 'cmd -a -b -c "file with spaces.mp4" -e -f'

new = shlex.split(data)

print(new)

yields,

['cmd', '-a', '-b', '-c', 'file with spaces.mp4', '-e', '-f']
merit_2
3

This can be accomplished with the standard-library shlex module:

import shlex
s = shlex.split('cmd -a -b -c "file with spaces.mp4" -e -f', posix=False)
print(s)

Passing posix=False to split preserves the quotation marks around the multi-word file name, which matches your desired output. If you don't want to keep the quotes, remove the posix argument.
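
To make the difference concrete, here is a minimal side-by-side sketch of the two modes on the string from the question. The variable names are only illustrative.

```python
import shlex

command = 'cmd -a -b -c "file with spaces.mp4" -e -f'

# Default mode (posix=True): the surrounding quotes are removed.
posix_tokens = shlex.split(command)

# Non-POSIX mode: the quotes stay part of the token.
non_posix_tokens = shlex.split(command, posix=False)

print(posix_tokens)
print(non_posix_tokens)

# If the quotes were kept, they can still be stripped later per token.
unquoted = [token.strip('"') for token in non_posix_tokens]
print(unquoted == posix_tokens)
```

Keeping the quotes can be handy if the list is later joined back into a command string; dropping them is usually what you want when passing the list to something like subprocess.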

Michael Jarrett
0

Use a regular expression that matches either:

  • a double quote, followed by any non-quote characters, up to the next double quote ("[^"]*"), or
  • a run of non-space characters (\S+):
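import re

text = 'cmd -a -b -c "file with spaces.mp4" -e -f'  # avoid shadowing the built-in input()
output = re.findall(r'"[^"]*"|\S+', text)  # raw string so \S is not treated as a string escape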
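
A short sketch of how this pattern behaves, wrapped in a helper for convenience. The name split_keep_quotes and the second test string are hypothetical, added only to illustrate one limitation: the quoted alternative only applies when a token starts with a double quote.

```python
import re
import shlex

# Quoted run first, then any run of non-space characters.
TOKEN_PATTERN = re.compile(r'"[^"]*"|\S+')

def split_keep_quotes(text):
    """Split on whitespace, keeping double-quoted runs (quotes included) whole."""
    return TOKEN_PATTERN.findall(text)

tokens = split_keep_quotes('cmd -a -b -c "file with spaces.mp4" -e -f')
print(tokens)

# Limitation: when the quote starts in the middle of a token,
# \S+ matches first and the quoted part is split on its spaces.
embedded = 'cmd --name="file with spaces.mp4"'
regex_embedded = split_keep_quotes(embedded)
shlex_embedded = shlex.split(embedded)
print(regex_embedded)
print(shlex_embedded)
```

For input shaped like the question's, the regex is enough; for arbitrary shell-like strings, shlex is likely the safer choice.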
CertainPerformance
  • 1
    This is actually a good answer; I don't know why someone downvoted it. It has an advantage over the other answers in that I still get the file name with spaces including the quotation marks: `['cmd', '-a', '-b', '-c', '"file with spaces.mp4"', '-e', '-f']`. Thank you. – Mahmoud Elshahat Dec 30 '19 at 00:50