I want to match the last group that is enclosed in [], where the group may itself contain one or more [] pairs in a nested structure.
I managed, although not elegantly, to get the nested [] matching working using Python's regex module. This solution works for some cases (such as s1) but not for s2 or s3, where there are multiple such groups: it only matches the first one.
Any suggestions? Is there a better regular expression, or are regular expressions not the way to go here? Thanks a lot!
In [116]:
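import regex  # third-party module (not the stdlib re); supports recursive patterns

s1 = 'AAA [BBB [CCC]]'
s2 = 'AAA [DDD] [EEE]'
s3 = 'AAA [BBB [CCC]] [EEE]'
for s in [s1, s2, s3]:
    # (?&rec) recurses into the named group, so nested [] pairs are matched
    result = regex.search(r'(?<rec>\[(?:[^\[\]]++|(?&rec))*\])', s, flags=regex.VERBOSE)
    print(result.captures('rec'))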
['[CCC]', '[BBB [CCC]]'] # I know it is not perfect, but I can take the last one in the list
['[DDD]'] # This is the first one; I want the last one, which is [EEE]
['[CCC]', '[BBB [CCC]]'] # Same problem as above
Edit:
Thanks a lot for the help; if I had 15 rep I would upvote you all. Sorry for not including the intended results, which should be:
'AAA [BBB [CCC]]' -> '[BBB [CCC]]'
'AAA [DDD] [EEE]' -> '[EEE]'
'AAA [BBB [CCC]] [EEE]' -> '[EEE]'
'000 [[aaa] xxx [yyy [zzz ]]' -> '[[aaa] xxx [yyy [zzz ]]'
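
For reference, the intended results can be sketched without a regular expression by tracking bracket depth and keeping the most recent group that closes at depth zero. This is only a minimal sketch, not a definitive solution: the function name last_bracket_group is hypothetical, and it assumes the brackets are balanced. The fourth example as written has four [ but only three ], so its outer group never closes and the sketch returns None for it.

```python
def last_bracket_group(s):
    """Return the last top-level [...] group in s, or None if there is none.

    Assumption: brackets are balanced. A top-level group that never
    closes is ignored rather than reported.
    """
    depth = 0      # current nesting depth
    start = None   # index of the '[' that opened the current top-level group
    last = None    # most recently completed top-level group
    for i, ch in enumerate(s):
        if ch == '[':
            if depth == 0:
                start = i
            depth += 1
        elif ch == ']' and depth > 0:
            depth -= 1
            if depth == 0:
                last = s[start:i + 1]
    return last


for s in ['AAA [BBB [CCC]]', 'AAA [DDD] [EEE]', 'AAA [BBB [CCC]] [EEE]']:
    print(repr(s), '->', repr(last_bracket_group(s)))
# 'AAA [BBB [CCC]]' -> '[BBB [CCC]]'
# 'AAA [DDD] [EEE]' -> '[EEE]'
# 'AAA [BBB [CCC]] [EEE]' -> '[EEE]'
```

A single left-to-right pass keeps this linear in the length of the string, and nested groups are skipped automatically because only a return to depth zero records a result.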