I've used this answer to get really close to what I need.

In my case I want to split on spaces, but not when part of the string is within quotes.

This is my code:

import re

data = '"abc dfg" ab da'
PATTERN = re.compile(r'''((?:[^ "']|"[^"]*"|'[^']*')+)''')
wordList = PATTERN.split(data)[1::2]

This gives wordList:

['"abc dfg"', 'ab', 'da']

How can I change the expression so that the strings are returned without the surrounding quotes?

Like this:

['abc dfg', 'ab', 'da']
Tryggast

2 Answers

5

You don't have to complicate your regex; simply iterate over the list and remove the " characters from each item. You can do that in many ways, for example using strip('"').

By the way, there is a much better solution:

>>> import shlex
>>> shlex.split('"abc dfg" ab da')
['abc dfg', 'ab', 'da']
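
For comparison, here is a minimal sketch that runs both approaches on the same input. The sample string with a single-quoted part and the names raw and stripped are made up for illustration; the pattern is the one from the question. Note that strip only touches the ends of each item, while shlex.split also removes quotes that sit in the middle of a token.

```python
import re
import shlex

# Hypothetical input: the question's string plus a single-quoted part.
data = '''"abc dfg" ab 'x y' da'''

# Pattern copied from the question.
PATTERN = re.compile(r'''((?:[^ "']|"[^"]*"|'[^']*')+)''')
raw = PATTERN.split(data)[1::2]

# Approach 1: strip both kinds of quote characters from the ends of each item.
stripped = [word.strip('"\'') for word in raw]

# Approach 2: let shlex do the splitting and the unquoting in one step.
shlexed = shlex.split(data)

print(raw)       # quotes still present
print(stripped)  # quotes removed from the ends
print(shlexed)   # same result for this input

# Difference: a quoted part embedded inside a larger token.
embedded = 'pre"fix ed" x'
print([w.strip('"\'') for w in PATTERN.split(embedded)[1::2]])  # inner quotes stay
print(shlex.split(embedded))                                    # inner quotes removed
```

If the input can contain apostrophes or unbalanced quotes, shlex.split raises ValueError, so the regex plus strip route may be the more forgiving one in that case.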
Maroun
1
>>> wordList = ['"abc dfg"', 'ab', 'da']
>>> wordList = [word.strip('"') for word in wordList]
>>> wordList
['abc dfg', 'ab', 'da']
Kevin
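
If the goal really is to do this inside the regex, as the question asks, one possible sketch is to capture the contents of each alternative in its own group and use findall instead of split. The names TOKEN and split_unquoted are hypothetical, and this is not the pattern from the question: it assumes quoted parts always stand alone as whole tokens, so something like a"b c"d would come back as three items rather than one.

```python
import re

# One alternative per token kind; each group captures the text WITHOUT quotes.
# Hypothetical pattern, assuming quoted parts are always whole tokens.
TOKEN = re.compile(r'''"([^"]*)"|'([^']*)'|([^ "']+)''')

def split_unquoted(text):
    """Split on spaces, keeping quoted parts together and dropping their quotes."""
    # findall returns one (double, single, bare) tuple per match;
    # exactly one slot can be non-empty, so pick whichever matched.
    return [dq or sq or bare for dq, sq, bare in TOKEN.findall(text)]

print(split_unquoted('"abc dfg" ab da'))
```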