s = 'hello "ok and @com" name'

s.split()

Is there a way to split this into a list on whitespace, without splitting on whitespace inside quotes, while still allowing special characters inside the quotes?

["hello", '"ok and @com"', "name"]

I want the output to look like this, and special characters inside the quotes should always be allowed.

Can someone help me with this?

(I've looked at other related posts, but the solutions there didn't allow special characters when I tested them.)

Jack Murrow
  • Does this answer your question? [Split a string by spaces -- preserving quoted substrings -- in Python](https://stackoverflow.com/questions/79968/split-a-string-by-spaces-preserving-quoted-substrings-in-python) – Abass Sesay Apr 07 '20 at 00:46
  • I just fixed my code. I thought this part was the problem, but it turned out there was another problem elsewhere in my code. – Jack Murrow Apr 07 '20 at 01:11

2 Answers


You can do it with re.split(). Regex pattern from: https://stackoverflow.com/a/11620387/42346

import re

re.split(r'\s+(?=[^"]*(?:"[^"]*"[^"]*)*$)', s)

Returns:

['hello', '"ok and @com"', 'name']

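Explanation of regex (the lookahead succeeds only when an even number of quotes follows, i.e. when the whitespace is outside any quoted section):

\s+             # match one or more whitespace characters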
(?=             # start lookahead
   [^"]*        # match any number of non-quote characters
   (?:          # start non-capturing group, repeated zero or more times
      "[^"]*"   # one quoted portion of text
      [^"]*     # any number of non-quote characters
   )*           # end non-capturing group
   $            # match end of the string
)               # end lookahead
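
To see how the lookahead behaves with more than one quoted section, here is a minimal sketch that wraps the same pattern in a helper. The names `PATTERN` and `split_outside_quotes` are made up for this illustration and are not part of the original answer. It assumes the quotes in the input are balanced; with an unbalanced quote the "even number of quotes" test flips, so the split points may be surprising.

```python
import re

# Same pattern as above: split on whitespace only when an even number of
# double quotes follows, i.e. when the whitespace is outside a quoted section.
PATTERN = re.compile(r'\s+(?=[^"]*(?:"[^"]*"[^"]*)*$)')

def split_outside_quotes(text):
    """Split text on whitespace that is not inside double quotes (hypothetical helper)."""
    return PATTERN.split(text)

print(split_outside_quotes('hello "ok and @com" name'))
# ['hello', '"ok and @com"', 'name']

print(split_outside_quotes('a "b c" d "e  f!" g'))
# ['a', '"b c"', 'd', '"e  f!"', 'g']
```

Note that, like any `re.split()` call, this can yield an empty first or last element if the string starts or ends with whitespace, so stripping the input first may be worthwhile.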
mechanical_meat

One option is to use a regular expression to capture the quoted strings, remove them, and then split the remaining text on whitespace. Note that this does not preserve the original order: the quoted strings come first in the resulting list.

import re

items = []
s = 'hello "ok and @com" name'
patt = re.compile(r'(".*?")')

# collect every quoted string (re.search would only find the first one)
items.extend(patt.findall(s))

# split on whitespace after removing quoted strings
items.extend(patt.sub('', s).split())

>>> items
['"ok and @com"', 'hello', 'name']
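
If the order does matter, one possible variation (a sketch, not the approach in this answer) is to match tokens in a single pass with `re.findall()`, trying a quoted string first and falling back to a run of non-whitespace characters. The names `TOKEN` and `tokenize` are hypothetical. It assumes each quoted section starts at the beginning of a token; a quote in the middle of a word, as in `foo"bar baz"`, would not be treated as opening a quoted section.

```python
import re

# Try a double-quoted section first; otherwise take a run of non-whitespace.
TOKEN = re.compile(r'"[^"]*"|\S+')

def tokenize(text):
    """Return tokens in their original order, keeping quoted sections intact (hypothetical helper)."""
    return TOKEN.findall(text)

print(tokenize('hello "ok and @com" name'))
# ['hello', '"ok and @com"', 'name']
```

Because the tokens are matched left to right, the result keeps the order of the input string, and leading or trailing whitespace does not produce empty elements.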
Eric Truett