I'm having trouble understanding regex behaviour when using a lookahead. I have a string that contains two overlapping patterns, each starting with M and ending with p. My expected output would be MGMTPRLGLESLLEp and MTPRLGLESLLEp. My Python code below returns two empty strings whose start positions coincide with the starts of the expected matches. Removing the lookahead (?=) yields only ONE match, the longer one. Is there a way to modify my regex to avoid the empty strings, so that I get both results with a single pattern?
import re

string = 'GYMGMTPRLGLESLLEpApMIRVA'
# The capture group sits inside a zero-width lookahead.
pattern = re.compile(r'(?=M(.*?)p)')
sequences = pattern.finditer(string)
for results in sequences:
    print(results.group())  # prints an empty string for each match
    print(results.start())
    print(results.end())
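
For reference, here is a minimal sketch of one possible way to get both overlapping sequences, assuming the goal is to capture the whole M...p span. A lookahead is zero-width, so group() on such a match is always empty; the text has to be read from a capture group instead. The sketch moves the capture group so it wraps the full pattern inside the lookahead and reads group(1). The helper name overlapping_matches is hypothetical and only used for illustration.

```python
import re

string = 'GYMGMTPRLGLESLLEpApMIRVA'

# The lookahead itself consumes nothing, so the scan can restart one
# character later and find overlapping candidates. The capture group
# now wraps the full M...p span, including both delimiters.
pattern = re.compile(r'(?=(M.*?p))')


def overlapping_matches(pattern, text):
    """Return (sequence, start, end) for every captured group 1.

    Hypothetical helper, not part of the original post.
    """
    return [(m.group(1), m.start(1), m.end(1)) for m in pattern.finditer(text)]


matches = overlapping_matches(pattern, string)
for seq, start, end in matches:
    print(seq, start, end)
```

Note that start(1) and end(1) refer to the captured span, whereas start() and end() of the match itself are equal because the overall match is empty.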