Possible Duplicate:
Matching Nested Structures With Regular Expressions in Python
I can't wrap my head around this problem. I have a string like the following one:
Lorem ipsum dolor sit amet [@a xxx yyy [@b xxx yyy [@c xxx yyy]]] lorem ipsum sit amet
My task is to extract the commands (they always start with [@ and end with ]) and their subcommands. A result like
[
[@a xxx yyy [@b xxx yyy [@c xxx yyy]]], # the outermost
[@b xxx yyy [@c xxx yyy]], # the middle one
[@c xxx yyy] # the innermost
]
would be highly appreciated. The problem is that these kinds of commands can occur in very long text messages, so a "performant" solution would be nice.
I was toying around with some regex patterns, most of the time something like
(\[@.*?\]\s) # for the outer one
but I have had no luck matching the middle and inner ones. To make it more complicated, the number of nested commands is variable. Might some special regex be the solution? I have read about lookaheads and lookbehinds, but I have no idea how to use them in this case.
Thanks a bunch!
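To make the limitation concrete, here is a small sketch of what the pattern above does with Python's re module. The variable names and the second test string are made up for illustration. The lazy .*? stops at the first ] that is followed by whitespace, so it only finds the outer command when no inner command is followed by a space.

```python
import re

# The pattern from the question: lazy match up to the first "]" followed by whitespace.
PATTERN = r"(\[@.*?\]\s)"

sample = ("Lorem ipsum dolor sit amet "
          "[@a xxx yyy [@b xxx yyy [@c xxx yyy]]] lorem ipsum sit amet")

# Only the outer command is found (with the trailing space the pattern consumes).
outer_only = re.findall(PATTERN, sample)
print(outer_only)   # ['[@a xxx yyy [@b xxx yyy [@c xxx yyy]]] ']

# Hypothetical input where an inner command is followed by a space:
# the match is cut short at the inner "] ".
tricky = "[@a x [@b y] z] tail"
cut_short = re.findall(PATTERN, tricky)
print(cut_short)    # ['[@a x [@b y] ']
```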
UPDATE
@Cyborgx37 pointed me to another post that uses the pyparsing package. It would be nice to have a solution without an external package or library, but pyparsing definitely solves the problem!
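For comparison, a library-free approach could be sketched as a single pass over the text with a stack of open command positions. This is only a minimal sketch, not the pyparsing solution: the function name extract_commands is made up, and it assumes that every ] in the text closes the most recently opened [@ command (plain brackets inside commands are not handled).

```python
def extract_commands(text):
    """Return all [@...] commands in text, outermost first, nested ones included.

    Assumption: every "]" closes the most recently opened "[@" command.
    """
    open_starts = []   # start indices of commands that are still open
    spans = []         # (start, end) index pairs of completed commands
    for i, ch in enumerate(text):
        if ch == "[" and text.startswith("[@", i):
            open_starts.append(i)
        elif ch == "]" and open_starts:
            spans.append((open_starts.pop(), i + 1))
    spans.sort()       # order by start position, so outer commands come first
    return [text[start:end] for start, end in spans]


sample = ("Lorem ipsum dolor sit amet "
          "[@a xxx yyy [@b xxx yyy [@c xxx yyy]]] lorem ipsum sit amet")
for command in extract_commands(sample):
    print(command)
# [@a xxx yyy [@b xxx yyy [@c xxx yyy]]]
# [@b xxx yyy [@c xxx yyy]]
# [@c xxx yyy]
```

The scan itself visits each character once, which should be acceptable for long messages. Unclosed commands are simply dropped.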