5

Apologies for the ambiguous title, but I don't know how to word my problem in such a way that makes sense in a single sentence.

So I have some simple regex code to extract code between brackets.

^.*\((.*)\).*

This successfully works in Python with the following code.

import re

m = re.search(r"^.*\((.*)\).*", input)
if m:
    print(m.group(1))

My problem occurs when a closing bracket ) may be inside the outermost brackets. For example, my current code when given

nsfnje (19(33)22) sfssf

as an input would return

19(33

but I would like it to return

19(33)22

I'm not sure how to fix this, so any help would be appreciated!

NPE
Geesh_SO

2 Answers

9
>>> import re
>>> input = "nsfnje (19(33)22) sfssf"
>>> re.search(r"\((.*)\)", input).group(1)
'19(33)22'

Note that this searches for outermost parentheses, even if they are unbalanced (e.g. "(1(2)))))"). It is not possible to search for balanced parentheses using a single standard regular expression. For more information, see this answer.
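If balanced matching is actually needed, one common alternative to a regex is to scan the string and count nesting depth. The following is only a minimal sketch of that idea; the function name `outermost_group` is made up for illustration and is not part of the `re` module.

```python
def outermost_group(text):
    """Return the contents of the first balanced (...) group, or None."""
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i + 1  # contents begin just after the opening parenthesis
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                return text[start:i]  # matching close of the outermost group
    return None  # no balanced group found


print(outermost_group("nsfnje (19(33)22) sfssf"))  # 19(33)22
print(outermost_group("(1(2)))))"))                # 1(2)
```

Unlike the greedy regex, this stops at the parenthesis that really closes the first group, so trailing unbalanced `)` characters are ignored.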

NPE
  • [This question](http://stackoverflow.com/questions/5454322/python-how-to-match-nested-parentheses-with-regex) has more details on why regexes don't work for general nesting and some alternative approaches if you do need it. – Danica Apr 07 '13 at 16:38
  • If you want to do the same with curly braces simply replace `\(` with `{` and `\)` with `}`: `re.search("{(.*)}", text, re.S).group(1)`. Also, to make `.` match line breaks, `re.S` or `re.DOTALL` is required: `re.search("\((.*)\)", text, re.DOTALL).group(1)`. – Wiktor Stribiżew Feb 23 '22 at 21:05
0

Your code does not give 19(33, it gives 33)22.

The problem is that the ^.* at the start of your regex matches all the way up to the last ( in the string, whereas you actually want to match from the first ( in the string.
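A quick way to check this is to run the original pattern and look at where the captured group starts. This is just a small demonstration; the variable name `text` is my own choice.

```python
import re

text = "nsfnje (19(33)22) sfssf"
m = re.search(r"^.*\((.*)\).*", text)

print(m.group(1))  # 33)22

# The greedy ^.* consumed everything up to the LAST "(",
# so the group starts right after that one, not after the first.
print(m.start(1) == text.rindex("(") + 1)  # True
```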

If you just want what is within the outermost brackets, then remove the .* at the start of your regex, and you may as well remove the ending .* also as it similarly serves no purpose.

r"\((.*)\)"

If you want the match of the whole line/string as well as what is within the brackets, then make the first * match lazily by adding a ? after it:

r"^.*?\((.*)\).*"

or better, use

r"^[^(]*\((.*)\).*"
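As a rough comparison, here are the three patterns run against the example input from the question. The names `text`, `patterns` and `results` are just for this sketch.

```python
import re

text = "nsfnje (19(33)22) sfssf"
patterns = [
    r"\((.*)\)",          # no leading/trailing .*
    r"^.*?\((.*)\).*",    # lazy prefix
    r"^[^(]*\((.*)\).*",  # prefix that cannot contain "("
]

results = [re.search(p, text) for p in patterns]
for p, m in zip(patterns, results):
    # group(1) is the bracket contents, group(0) is the whole match
    print(p, "->", m.group(1), "| whole match:", m.group(0))
```

All three capture 19(33)22 in group 1. The difference is in group(0): the first pattern matches only (19(33)22), while the two anchored patterns match the whole string. The [^(]* version is arguably preferable because the prefix can never run past the first opening bracket, so there is nothing for it to backtrack over.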
MikeM