Consider the following:

a;b{c;d;}e{f;}

How can I split this into three groups like so:

Group 1: a;
Group 2: b{c;d;}
Group 3: e{f;}

I'm learning regular expressions and was wondering whether I could include if-then-else logic in the expression.

What I have now:

(.*?{.*?})

This creates two groups like so:

Group 1: a;b{c;d;}
Group 2: e{f;}

This is not what I want, because a; and b{c;d;} are merged into one group.

Pseudo if-then-else:

  1. Select all characters until either a semi-colon or open curly bracket.
  2. If semi-colon then stop and complete group.
  3. Else if open curly bracket then continue selecting all characters until closing curly bracket.

Thanks.

user3417614
  • The simplest expression you can use is [`re.findall(r"[^;{]+;|[^}]+\}", text)`](http://ideone.com/7tfoSC) – Jerry Mar 24 '15 at 09:25

2 Answers

Use re.findall:

>>> import re
>>> re.findall(r'[^;{]+;?(?:{[^}]*})?', 'a;b{c;d;}e{f;}')
['a;', 'b{c;d;}', 'e{f;}']

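Or use an explicit alternation. This one is more appropriate, because it keeps the semicolon case and the curly-brace case as separate branches: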

>>> re.findall(r'[^;{]+;|[^;{]+(?:{[^}]*})?', 'a;b{c;d;}e{f;}')
['a;', 'b{c;d;}', 'e{f;}']

  • [^;{]+; a negated character class that matches one or more characters other than ; or {, followed by a semicolon.

  • | OR

  • [^;{]+ one or more characters other than ; or {, followed by

  • (?:{[^}]*})? an optional curly-brace block.
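
The alternation above plays the role of the if-then-else from the question: the first branch is tried first, and the second is used only when the first fails. A minimal sketch of the same pattern written in verbose mode, so each branch can carry a comment. The names STATEMENT and split_statements are hypothetical, chosen for illustration only, and nested braces are assumed not to occur, since [^}]* stops at the first closing brace.

```python
import re

# Same alternation as above, spelled out with re.VERBOSE so that
# whitespace is ignored and each branch can be commented.
STATEMENT = re.compile(
    r"""
      [^;{]+ ;                    # "if": text ending in a semicolon
    |
      [^;{]+ (?: \{ [^}]* \} )?   # "else": text plus an optional brace block
    """,
    re.VERBOSE,
)


def split_statements(text):
    """Return the semicolon-terminated and brace-terminated chunks of text."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return STATEMENT.findall(text)


print(split_statements("a;b{c;d;}e{f;}"))  # ['a;', 'b{c;d;}', 'e{f;}']
```

Only non-capturing groups are used, which is why findall returns the full matched text and not group contents.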

Avinash Raj
(?<=;)(?![^{]*})|(?<=}(?!$))

You can use this to split with the third-party regex module, because re did not support splitting on zero-width matches before Python 3.7.

import regex  # third-party module: pip install regex
x = "a;b{c;d;}e{f;}"
print(regex.split(r"(?<=;)(?![^{]*})|(?<=}(?!$))", x, flags=regex.VERSION1))

Output: ['a;', 'b{c;d;}', 'e{f;}']

See demo.

https://regex101.com/r/tJ2mW5/8#python
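
On Python 3.7 and later, the standard re module can also split on empty matches, so the third-party package may not be needed. A minimal sketch under that assumption: the names BOUNDARY and split_top_level are hypothetical, and the lookahead is moved outside the lookbehind as a conservative variation on the expression above, not the original pattern.

```python
import re

# Zero-width split points:
#   - right after a ';' that is not inside a {...} block, or
#   - right after a '}' that is not at the end of the string.
BOUNDARY = re.compile(r"(?<=;)(?![^{]*\})|(?<=\})(?!$)")


def split_top_level(text):
    """Split text at top-level ';' and '}' boundaries, keeping the delimiters."""
    # Splitting on empty matches requires Python 3.7 or later.
    return BOUNDARY.split(text)


print(split_top_level("a;b{c;d;}e{f;}"))  # ['a;', 'b{c;d;}', 'e{f;}']
```

As with the original expression, this assumes braces are not nested.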

vks