3

I would like to split a text using a space as the delimiter, unless the space is found between quotation marks.

For example:

string = "my name is 'solid snake'"
output = ["my","name","is","'solid snake'"]
jonrsharpe
raiden RTF
  • Have a look at https://stackoverflow.com/questions/366202/regex-for-splitting-a-string-using-space-when-not-surrounded-by-single-or-double – Chris Lear Apr 20 '21 at 11:39
  • @ChrisLear that's tagged as Java; I don't think it will work with Python's flavour of regex. – Umar.H Apr 20 '21 at 11:40
  • No need to bother with regex: `import shlex; shlex.split(text, posix=False)`. See [the answer](https://stackoverflow.com/a/45031497/3832970). – Wiktor Stribiżew Apr 20 '21 at 12:05
  • @WiktorStribiżew, even though I had already posted an answer, I concur with that link. I had no idea about that module! It works wonders. – JvdV Apr 20 '21 at 12:06
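
The `shlex` suggestion from the comments can be tried as follows. This is only a minimal sketch: the helper name `split_keep_quotes` is made up for illustration, and it assumes the quotes in the input are balanced (an unclosed opening quote makes `shlex.split` raise `ValueError`).

```python
import shlex


def split_keep_quotes(text):
    """Split on whitespace, keeping quoted sections (and their quotes) together.

    Hypothetical helper: it only wraps shlex.split in non-POSIX mode,
    which leaves the quote characters in the resulting tokens.
    """
    return shlex.split(text, posix=False)


string = "my name is 'solid snake'"
print(split_keep_quotes(string))
# ['my', 'name', 'is', "'solid snake'"]

# With the default posix=True the quotes are stripped from the tokens instead.
print(shlex.split(string))
# ['my', 'name', 'is', 'solid snake']
```

With `posix=False` the output matches the list asked for in the question; the default mode is the one to use if the surrounding quotes should be dropped.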

2 Answers

2

You can loop through the string character by character and keep track of whether a quote is currently open:

string = "my name is 'solid snake'"
open_quote = None  # the quote character that is currently open, if any
out = []
toadd = ''
for char in string:
    if char in ("'", '"'):  # is the character a quote
        if open_quote is None:
            open_quote = char  # no quote is open, so open one
        elif char == open_quote:
            open_quote = None  # only the matching quote closes it
        toadd += char
    elif char == ' ' and open_quote is None:
        # a space outside quotes ends the current token
        if toadd:  # skip empty tokens caused by repeated spaces
            out.append(toadd)
        toadd = ''
    else:
        toadd += char  # any other character, or a space inside quotes
if toadd:
    out.append(toadd)  # add the last token once the string is exhausted
print(out)
Sid
1

A brute-force way would be to temporarily replace the spaces inside the quotes with a placeholder:

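string = "my name is 'solid snake'"
output = ["my", "name", "is", "'solid snake'"]  # the desired result

# placeholder for protected spaces; it must not occur in the input text
ui = "__unique__"

# split on single quotes: the odd-indexed segments are the quoted parts
string2 = string.split("'")
print(string2)

# hide the spaces inside the quoted segments
for i, segment in enumerate(string2):
    if i % 2 == 1:
        string2[i] = segment.replace(" ", ui)

print(string2)

string3 = "'".join(string2)

print(string3)

# a plain split on spaces now leaves the quoted segments intact
string4 = string3.split(" ")

print(string4)

# put the hidden spaces back
for i, segment in enumerate(string4):
    if ui in segment:
        string4[i] = segment.replace(ui, " ")

print()
print(string4)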
tfv
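
Coming back to the regex idea raised in the comments: a rough Python sketch could look like the one below. The pattern and the function name `split_outside_quotes` are assumptions made for illustration and are not taken from the linked Java question. It also assumes that quoted sections start at a token boundary and are properly closed; a quote in the middle of a word is treated as an ordinary character.

```python
import re

# Match, in order of preference:
#   a single-quoted section, a double-quoted section,
#   or any run of non-whitespace characters.
TOKEN_RE = re.compile(r"""'[^']*'|"[^"]*"|\S+""")


def split_outside_quotes(text):
    """Return the whitespace-separated tokens, keeping quoted sections whole."""
    return TOKEN_RE.findall(text)


string = "my name is 'solid snake'"
print(split_outside_quotes(string))
# ['my', 'name', 'is', "'solid snake'"]
```

Using `findall` to collect the tokens, rather than trying to describe the delimiter for `re.split`, keeps the pattern short because no lookaround is needed.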