I am trying to find a way to detect "," and "or" in a string, even if they are repeated. So even for a string such as "one , , or or, two", re.split() should return "one" and "two".
So far this is what I have (using Python 3.10):
import re
pattern = re.compile(r"(?:\s*,\s*or\s*|\s*,\s*|\s+or\s+)+", flags=re.I)
string = "one,two or three , four or five or , or six , oR , seven, ,,or, ,, eight or qwertyor orqwerty,"
result = re.split(pattern, string)
print(result)
which returns:
['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'qwertyor orqwerty', '']
My issue so far is that if I have consecutive "or" separators, my pattern only recognizes every other "or". For example:
string = "one or or two"
>>> ['one', 'or two']
string = "one or or or two"
>>> ['one', 'or', 'two']
Notice that in the first example the second element contains "or", and in the second example "or" is an element by itself.
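As far as I can tell, the cause seems to be that the `\s+or\s+` branch consumes the space after each "or", so the next "or" no longer has the leading whitespace that the branch requires. A small check with re.finditer (same pattern as above, nothing new assumed) shows what actually gets matched:

```python
import re

# Same pattern as in the question.
pattern = re.compile(r"(?:\s*,\s*or\s*|\s*,\s*|\s+or\s+)+", flags=re.I)

# Collect the span and text of every separator match.
matches = [(m.span(), m.group()) for m in pattern.finditer("one or or two")]
print(matches)
# [((3, 7), ' or ')]
```

Only the first " or " (including its trailing space) is matched, and the second "or" is left behind as part of the next element.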
Is there a way to get around this? Also, if there is a better way of separating these strings, I would greatly appreciate hearing about it.
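One direction that might work, as a minimal sketch rather than a definitive answer: use the word boundary `\b` around "or" instead of requiring whitespace on both sides, so that no "or" depends on a space that a previous match already consumed. The name `split_items` is just a hypothetical helper for illustration.

```python
import re

# Optional leading whitespace, then one or more separators ("," or the
# whole word "or"), each followed by optional whitespace.
# \b keeps "or" inside words such as "qwertyor" from being treated as a separator.
SEPARATOR = re.compile(r"\s*(?:(?:,|\bor\b)\s*)+", flags=re.I)

def split_items(text):
    """Split text on runs of commas and standalone 'or' (hypothetical helper)."""
    return SEPARATOR.split(text)

print(split_items("one or or two"))        # ['one', 'two']
print(split_items("one or or or two"))     # ['one', 'two']
print(split_items("one , , or or, two"))   # ['one', 'two']
```

This is not strictly equivalent to the original pattern: for example, the original splits "a,orange" at ",or", while this sketch only splits at the comma. A trailing separator still produces an empty string at the end of the list, as before.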