5

I have a string that contains several brackets. Let's say

s="(a(vdwvndw){}]"

I want to extract all the brackets as a separate string.

I tried this:

>>> import re
>>> brackets=re.search(r"[(){}[]]+",s)
>>> brackets.group()

But it only gives me the last two brackets:

'}]'

Why is that? Shouldn't it fetch one or more of any of the brackets in the character set?

user
  • see `re.findall` http://stackoverflow.com/questions/7724993/python-using-regex-to-find-multiple-matches-and-print-them-out – C8H10N4O2 Jun 10 '15 at 20:02
  • https://regex101.com/ is a great tool to build and test regular expressions. – asimoneau Jun 10 '15 at 20:04
  • Note that `re.search` only produces the first match. – TigerhawkT3 Jun 10 '15 at 20:35
  • Oh. When I do findall, it gives me a list of matches which I can then join. So search and match only give the first match, is that right? They won't go further and check for other matches? And why is that? I used "+" precisely so that it would check for one or more. – user Jun 10 '15 at 21:05
  • `+` means 1 or more matching characters in a row. If there are non-matching characters between groups of matching characters, `re.search` only finds the first group, while `re.match` only finds the first group and then only if it's at the beginning of the string. – TigerhawkT3 Jun 10 '15 at 22:08
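
To make the distinction in the comments above concrete, here is a minimal sketch. It assumes the corrected pattern `[(){}[\]]+` (with the inner `]` escaped); the variable names and the second test string `"a(b)"` are only illustrative.

```python
import re

s = "(a(vdwvndw){}]"
pattern = r"[(){}[\]]+"  # escaped "]" so the class holds all six brackets

# re.match only tries at position 0, so it finds the leading "(".
first_at_start = re.match(pattern, s).group()

# re.search scans forward but still stops at the first run it finds.
first_anywhere = re.search(pattern, s).group()

# re.findall returns every non-overlapping run of bracket characters.
all_runs = re.findall(pattern, s)

# When the string does not start with a bracket, match fails but search succeeds.
no_match_at_start = re.match(pattern, "a(b)")
found_later = re.search(pattern, "a(b)").group()

print(first_at_start, first_anywhere, all_runs, no_match_at_start, found_later)
```

The `+` only extends a single match across adjacent bracket characters; it does not make `re.search` continue past the first run.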

4 Answers

7

You have to escape the first closing square bracket.

r'[(){}[\]]+'

To combine all of them into a string, you can search for anything that doesn't match and remove it.

brackets = re.sub( r'[^(){}[\]]', '', s)
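
As a rough illustration of why the unescaped pattern returned '}]', the sketch below compares it with the escaped version on the string from the question. The variable names are only for illustration.

```python
import re

s = "(a(vdwvndw){}]"

# Unescaped: the class ends at the first "]", so this reads as
# "one of ( ) { } [" followed by one or more literal "]".
unescaped = re.search(r"[(){}[]]+", s)

# Escaped: all six bracket characters belong to the class.
escaped = re.search(r"[(){}[\]]+", s)

# Removing everything that is not a bracket keeps them all, in order.
only_brackets = re.sub(r"[^(){}[\]]", "", s)

print(unescaped.group(), escaped.group(), only_brackets)
```

The only place in `s` where one of `(){}[` is directly followed by `]` is the final `}]`, which is why that was the match.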
TigerhawkT3
4

Use the following (the closing square bracket must be escaped inside a character class):

brackets=re.search(r"[(){}[\]]+",s)
                           ↑
karthik manchala
  • Hi, thanks for your answer. But I don't understand why I need to escape the last square bracket. Shouldn't the character class search for any of those characters, with + making it one or more? – user Jun 10 '15 at 21:01
  • @user You are right, but how would the regex engine know which square bracket is the closing one, the inner one or the outer one? That's why you need to escape the inner one. Hope you got my point. – karthik manchala Jun 10 '15 at 21:04
  • Yup, got that. But why does search fetch only one match? I am using "+", so shouldn't it fetch one or more? – user Jun 10 '15 at 21:10
  • @user `+` means one or more of the specified pattern, so the regex `[(){}[\]]+` would match `{` or `{]` or `({]{)}` etc., but only the first occurrence of the match is returned. To get all the matches, you have to use `re.findall`. – karthik manchala Jun 10 '15 at 21:20
2

The regular expression "[(){}[]]+" (or rather its corrected forms "[](){}[]+" or "[(){}[\]]+", as others have suggested) matches a single run of consecutive bracket characters. What you need to do is find all of these runs and join them.

One solution is this:

brackets = ''.join(re.findall(r"[](){}[]+",s))

Note also that I rearranged the order of characters in the class: an unescaped ] must come first in a class so that it is not interpreted as the end of the class definition.
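
A small sketch, assuming Python's `re` module, showing that the "`]` first" spelling and the escaped spelling behave the same on the string from the question. The names `leading` and `escaped` are only illustrative.

```python
import re

s = "(a(vdwvndw){}]"

leading = r"[](){}[]+"   # "]" first in the class, no backslash needed
escaped = r"[(){}[\]]+"  # "]" escaped, so its position does not matter

# Each element is one run of adjacent bracket characters.
runs = re.findall(leading, s)
same_runs = re.findall(escaped, s)

# Joining the runs gives all brackets as a single string.
brackets = "".join(runs)

print(runs, same_runs, brackets)
```

Which spelling to prefer is mostly a matter of readability; the escaped form is arguably harder to misread.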

Michał Trybus
1

You could also do this without a regex:

s = "(a(vdwvndw){}]"
keep = {"(", ")", "[", "]", "{", "}"}
print("".join([ch for ch in s if ch in keep]))
# prints: ((){}]
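
If this is needed in more than one place, the same idea could be wrapped in a small helper. This is only a sketch; `extract_brackets` and `BRACKETS` are hypothetical names, not part of any library.

```python
BRACKETS = frozenset("(){}[]")  # the six characters to keep


def extract_brackets(text):
    """Return the bracket characters of text, in their original order."""
    return "".join(ch for ch in text if ch in BRACKETS)


print(extract_brackets("(a(vdwvndw){}]"))
```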
Padraic Cunningham