1

If I have a string:

s = aaa{bbb}ccc{ddd{eee}fff}ggg

is it possible to find all matches based on outer curly braces?

m = re.findall(r'\{.+?\}', s, re.DOTALL)

returns

['{bbb}', '{ddd{eee}']

but I need:

 ['{bbb}', '{ddd{eee}fff}']

Is it possible with python regex?

John Saunders
  • 160,644
  • 26
  • 247
  • 397
user2624744
  • 693
  • 8
  • 23
  • 2
    In the general case [you don't](http://stackoverflow.com/questions/1099178/matching-nested-structures-with-regular-expressions-in-python) – fredtantini Jan 14 '15 at 16:15
  • Python doesn't support recursion in regex (neither does it support balancing groups). – Lucas Trzesniewski Jan 14 '15 at 16:15
  • Unlike forum sites, we don't use "Thanks", or "Any help appreciated", or signatures on [so]. See "[Should 'Hi', 'thanks,' taglines, and salutations be removed from posts?](http://meta.stackexchange.com/questions/2950/should-hi-thanks-taglines-and-salutations-be-removed-from-posts). – John Saunders Jan 14 '15 at 16:16

3 Answers3

4

If you want it to work in any depth, but don't necessarily need to use regex, you can implement a simple stack based automaton:

s = "aaa{bbb}ccc{ddd{eee}fff}ggg"

def getStuffInBraces(text):
    stuff=""
    count=0
    for char in text:
        if char=="{":
            count += 1
        if count > 0:
            stuff += char
        if char=="}":
            count -= 1
        if count == 0 and stuff != "":
            yield stuff
            stuff=""

getStuffInBraces is an iterator, so if you want a list of results, you can use print(list(getStuffInBraces(s))).

L3viathan
  • 26,748
  • 2
  • 58
  • 81
3
{(?:[^{}]*{[^{]*})*[^{}]*}

Try this.See demo.

https://regex101.com/r/fA6wE2/28

P.S It will only work the {} is not more than 1 level deep.

vks
  • 67,027
  • 10
  • 91
  • 124
1

You could use this regex also.

\{(?:{[^{}]*}|[^{}])*}

DEMO

>>> s = 'aaa{bbb}ccc{ddd{eee}fff}ggg'
>>> re.findall(r'\{(?:{[^{}]*}|[^{}])*}', s)
['{bbb}', '{ddd{eee}fff}']

Use recursive regex for 1 level deep.

\{(?:(?R)|[^{}])*}

Code:

>>> import regex
>>> regex.findall(r'\{(?:(?R)|[^{}])*}', s)
['{bbb}', '{ddd{eee}fff}']

But this would be supported by the external regex module.

Avinash Raj
  • 172,303
  • 28
  • 230
  • 274