
I'm trying to split a string like `a && b || c && d || e` on both `&&` and `||` using `re.split`. I know you can use multiple delimiters with `re.split("a | b")`; however, I don't know how to achieve this with `re.split("&& | ||")`. I attempted to escape the pipes with `re.split("&& | \\|\\|")`, but this doesn't work.

How do I properly escape this?

idjaw
LTheanine
  • Be warned that you may be mistaken in your regex comprehension. `re.split("a | b")` will split on `a(space)` and `(space)b`. Your own attempt also includes these spaces (which happen to be in your input, and so they will be discarded). – Jongware Apr 09 '16 at 12:40
  • What do you want as output? – Padraic Cunningham Apr 09 '16 at 12:42

4 Answers


You need to escape the `|` because it is the alternation operator in a regular expression:

>>> import re
>>> s = "a && b || c && d || e"
>>> re.split(r"&&|\|\|", s)
['a ', ' b ', ' c ', ' d ', ' e']

And, to also handle the spaces around the delimiters:

>>> re.split(r"\s(?:&&|\|\|)\s", s)
['a', 'b', 'c', 'd', 'e']

where `\s` matches a single whitespace character and `(?:...)` is a non-capturing group.
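To see what the non-capturing group changes, here is a minimal sketch (Python 3 assumed; the variable names are illustrative) that compares it with a capturing group. With capturing parentheses, `re.split` also returns the text matched by the group.

```python
import re

s = "a && b || c && d || e"

# Non-capturing group: the delimiters are dropped from the result.
without_delims = re.split(r"\s(?:&&|\|\|)\s", s)

# Capturing group: re.split also returns the text matched by the group.
with_delims = re.split(r"\s(&&|\|\|)\s", s)

print(without_delims)  # ['a', 'b', 'c', 'd', 'e']
print(with_delims)     # ['a', '&&', 'b', '||', 'c', '&&', 'd', '||', 'e']
```

The capturing form can be useful if the operators themselves are needed later, for example to evaluate the expression.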

alecxe
  • Is it important to mark the parenthesized expression as non-capturing? – Jongware Apr 09 '16 at 12:42
  • @RadLexus yeah, otherwise the delimiters would also be present in the resulting list. Thanks. – alecxe Apr 09 '16 at 12:43
  • Just ran over to my desktop to check :) Nice caveat! Just a quick follow-up question: on my Python 2.7 it also works without the `r` prefix: `re.split(" *&& *| *\|\| *", s)` *and* also with doubled backslashes! Basically, OP's original attempt works! (With the associated space problem I mentioned in a comment.) Any reason for that? – Jongware Apr 09 '16 at 12:57
  • @RadLexus that's the "magic" of the raw string: http://stackoverflow.com/questions/12871066/what-exactly-is-a-raw-string-regex-and-how-can-you-use-it. – alecxe Apr 09 '16 at 13:05
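As a side note on the comments above, the following sketch (Python 3 assumed) checks two things: that doubled backslashes in a normal string literal produce the same pattern as a raw string, and what the pattern from the question actually returns. The single-backslash `\|` form without the `r` prefix also happens to work because Python leaves unrecognized escape sequences in the string unchanged, although recent Python 3 versions emit a warning for it.

```python
import re

s = "a && b || c && d || e"

# Doubled backslashes in a normal string and single ones in a raw string
# produce the same characters: backslash, pipe, backslash, pipe.
assert "\\|\\|" == r"\|\|"

# The pattern from the question splits on "&& " and on " ||",
# so the leftover spaces end up unevenly distributed.
question_result = re.split("&& | \\|\\|", s)
print(question_result)  # ['a ', 'b', ' c ', 'd', ' e']
```

So the original attempt does split the string; the spaces inside the pattern are what make the result look wrong.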

str.translate might do the job if you want to split into individual elements (the two-argument form used here is Python 2 only):

s = "a && b || c && d || e"

print(s.translate(None, "&|").split())

Which would give you:

['a', 'b', 'c', 'd', 'e']

Or replace the double || with && and then split:

s = "a && b || c && d || e"

print(s.replace(" || ", " && ").split(" && "))

Or, if you want to keep the spacing, just use s.replace("||", "&&").split("&&"). Whatever output you want, you can use some variation of the above or combine it with str.strip.
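On Python 3, `str.translate` takes only a translation table, so the deletion has to be expressed through `str.maketrans`. A minimal sketch of the same idea, assuming Python 3 (the names `delete_ops` and `parts` are illustrative):

```python
s = "a && b || c && d || e"

# The third argument to str.maketrans lists the characters to delete.
delete_ops = str.maketrans("", "", "&|")

# Remove every & and |, then split on the remaining whitespace.
parts = s.translate(delete_ops).split()
print(parts)  # ['a', 'b', 'c', 'd', 'e']
```

Note that this removes every `&` and `|` character, so it only suits input where the operands themselves contain neither.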

Padraic Cunningham

Does this regex meet your needs?

import re
# Raw string so that \| reaches the regex engine as an escaped pipe
r = re.compile(r"(?:&&|\|\|)")
r.split("a && b || c && d || e")

Result:

['a ', ' b ', ' c ', ' d ', ' e']
Samuel LEMAITRE

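Try this:

import re
data = "a && b || c && d || e"
# Raw string so that \| reaches the regex engine as an escaped pipe
spl = re.split(r"(?:\|\||&&)", data)
print(spl)

Or else use re.findall with a negated character class:

import re
data = "a && b || c && d || e"
# + matches each run of characters that are neither & nor |
data2 = re.findall(r"[^&|]+", data)
print(data2)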
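The findall approach matches the operands instead of the delimiters. A minimal sketch of a variation that also drops the surrounding whitespace, assuming Python 3 and assuming the operands contain no whitespace (the input string here is a hypothetical example with longer operand names):

```python
import re

# Hypothetical input with multi-character operands.
expr = "foo && bar || baz"

# Match each run of characters that are not &, | or whitespace.
operands = re.findall(r"[^&|\s]+", expr)
print(operands)  # ['foo', 'bar', 'baz']
```

If operands may contain spaces, splitting on the delimiters is the safer choice.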
mkHun