
Hello, I have read up on regex but I cannot understand how to use it.

I would like to split a string into a list, splitting on spaces except inside text enclosed by # # or double quotes " ":

values = '2 #room 2.# 5 1 -1 -1'

or values = '2 "room 2." 5 1 -1 -1'

Just using split() results in:

['2', '#room', '2.#', '5', '1', '-1', '-1']

I would like the output to contain the room name without the # characters and without being split at the space:

['2', 'room 2.', '5', '1', '-1', '-1']
Jay
  • There is an answer: https://stackoverflow.com/questions/1059559/split-strings-with-multiple-delimiters –  Aug 16 '17 at 02:04
  • @Masiama: I don't think that question addresses the issue here. There's no mention of quotes in the first few answers anyway. – Blckknght Aug 16 '17 at 02:14
  • Replace `#` and all non-`"` qualifiers with `"`, and use the [CSV parser](https://docs.python.org/2/library/csv.html), see [this demo](https://ideone.com/aC7BYu). If your qualifier chars are consistent across all the input, this should work. – Wiktor Stribiżew Aug 16 '17 at 08:41
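
The CSV suggestion in the last comment can be sketched roughly as follows. This is a minimal illustration, not the code from the linked demo; the function name `split_with_csv` and its `extra_qualifiers` parameter are made up for this example. It assumes the qualifier characters never appear inside a field.

```python
import csv

def split_with_csv(values, extra_qualifiers="#"):
    # Hypothetical helper: normalize every alternative qualifier to a double quote.
    for ch in extra_qualifiers:
        values = values.replace(ch, '"')
    # Parse the single line as space-delimited CSV with " as the quote character.
    return next(csv.reader([values], delimiter=" ", quotechar='"'))

print(split_with_csv('2 #room 2.# 5 1 -1 -1'))
print(split_with_csv('2 "room 2." 5 1 -1 -1'))
```

Note that with a space as the delimiter, two consecutive spaces in the input produce an empty field, which plain split() would not do.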

2 Answers


You can do something like this: replace # with " and then use shlex.split().

import shlex
values = '2 #room 2.# 5 1 -1 -1'
print(shlex.split(values.replace('#','"')))

Output

['2', 'room 2.', '5', '1', '-1', '-1']

Based on an excellent observation by Casimir et Hippolyte (see the comments), suppose the value is instead:

 values = '2 #"room 2."# 5 1 -1 -1'

In that case, simplify the string first: replace both #" and "# with just ", then split as before.

import shlex
values = '2 #"room 2."# 5 1 -1 -1'

val=values.replace('#"','"')
print(shlex.split(val.replace('"#','"')))

Output

['2', 'room 2.', '5', '1', '-1', '-1']
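
The two cases above could be folded into one helper. This is only a sketch: the name `split_quoted_shlex` is made up for this example, and it assumes # is only ever used as a delimiter, never as data.

```python
import shlex

def split_quoted_shlex(values):
    # Hypothetical helper: collapse the combined markers first, then any bare '#',
    # so that every delimiter ends up as a single double quote.
    for marker in ('#"', '"#', '#'):
        values = values.replace(marker, '"')
    # shlex.split() keeps double-quoted text together and drops the quotes.
    return shlex.split(values)

print(split_quoted_shlex('2 #room 2.# 5 1 -1 -1'))
print(split_quoted_shlex('2 "room 2." 5 1 -1 -1'))
print(split_quoted_shlex('2 #"room 2."# 5 1 -1 -1'))
```

Keep in mind that shlex follows shell-like rules, so single quotes and backslashes are also special: an unmatched apostrophe outside double quotes makes shlex.split() raise ValueError.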
Hariom Singh

Instead of describing the delimiter for re.split, use re.findall and describe the items:

import re
re.findall(r'(?<=")[^"]*(?=")|(?<=#)[^#]*(?=#)|[^\s"#]+', values)
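
One caveat with the lookaround pattern: when a line contains more than one quoted item, the text between a closing quote and the next opening quote also sits between two quotes, so it can be returned as an item with its surrounding spaces. A possible variant, sketched below, consumes the delimiters and uses capture groups instead. This is an alternative to the pattern above, not the same technique; the names `TOKEN` and `split_quoted_regex` are made up for this example.

```python
import re

# Each alternative describes one kind of item; the delimiters are consumed
# by the match but left outside the capture group.
TOKEN = re.compile(r'"([^"]*)"|#([^#]*)#|([^\s"#]+)')

def split_quoted_regex(values):
    # Exactly one group takes part in each match, so lastindex points at it.
    return [m.group(m.lastindex) for m in TOKEN.finditer(values)]

print(split_quoted_regex('2 #room 2.# 5 1 -1 -1'))
print(split_quoted_regex('1 "a b" 2 "c d" 3'))
```

Unbalanced delimiters are not reported as errors here; a stray " or # is simply skipped.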
Casimir et Hippolyte