I have a text file. I need to find each part of the file that starts with some arbitrary pattern and capture everything between the pattern and its closing paren. The pattern may appear multiple times in the file. "start (" will always appear right before the pattern. Example:
start
(
pattern
(
stuff,
stuff,
randomThing
(
random stuff
)
)
)
start
(
notThePattern
(
otherStuff,
otherStuff
)
)
start
(
pattern
(
moreStuff,
moreStuff
)
)
I would want to get [start(pattern(stuff,stuff,randomThing(random stuff))), start(pattern(moreStuff,moreStuff))].
The way I've done it is with this code:
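import re

def myFunct(pattern, text):
    allElements = []
    # Raw string so \s and \( reach the regex engine unchanged
    regex = re.compile(r"start\s*\(\s*" + pattern)
    match = regex.search(text)
    while match is not None:
        index = match.start()
        # getElementEndIndex gives the element up to its closing paren
        element = getElementEndIndex(text[index:])
        allElements.append(element)
        # Continue searching after the element just captured
        text = text[index + len(element):]
        match = regex.search(text)
    return allElements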
getElementEndIndex just uses a stack to find the closing paren and its index.
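For reference, a minimal sketch of what such a helper might look like. The body is an assumption, not the real helper: it is assumed to take the text beginning at "start" and return everything up to and including the paren that closes the first opening paren, which matches how the loop above uses len(element).

```python
def getElementEndIndex(text):
    """Return text up to and including the paren that closes the first '('.

    Assumed behavior: the real helper is not shown, so this is only a sketch.
    """
    stack = []  # indices of currently open parens
    for i, ch in enumerate(text):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at index %d" % i)
            stack.pop()
            if not stack:
                # The outermost paren has just been closed
                return text[:i + 1]
    raise ValueError("no balanced closing paren found")
```

Since only the nesting depth matters here, a plain integer counter would work just as well as a stack.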
Is this the only way to do this? Can it be solved with just a regex? If not, is there a better way of running the regex that I do have?
The pattern can appear multiple times within a "start" section. A "start" section cannot be nested inside another "start" section, though.
start
(
pattern
()
blah
()
pattern
()
)
is possible, but
start
(
pattern
()
start
()
)
is NOT