I want to parse the string text (defined below) in such a way that each bracketed number is attached to both the substring before it and the substring after it. As far as I understand, a regex normally consumes the string as it matches, which means that by default matches cannot overlap, right? How do I have to adapt pattern_3 to get the desired output?
import re

text = 'a(1)a(2)a(1)a'

# Attempt 1: capture each 'a' together with the bracket that follows it
pattern = r'(a(?:\((\d+)\))?)'
re.findall(pattern, text)
# output: [('a(1)', '1'), ('a(2)', '2'), ('a(1)', '1'), ('a', '')]

# Attempt 2: also allow a leading bracket (never matched, because the previous match already consumed it)
pattern_2 = r'((?:\((\d+)\))?a(?:\((\d+)\))?)'
re.findall(pattern_2, text)
# output: [('a(1)', '', '1'), ('a(2)', '', '2'), ('a(1)', '', '1'), ('a', '', '')]

# Attempt 3: move the trailing bracket into a lookahead so it is not consumed
pattern_3 = r'((?:\((\d+)\))?a(?=(?:\((\d+)\)))?)'
re.findall(pattern_3, text)
# output: [('a', '', '1'), ('(1)a', '1', '2'), ('(2)a', '2', '1'), ('(1)a', '1', '')]

# desired output:
# [('a(1)', '', '1'), ('(1)a(2)', '1', '2'), ('(2)a(1)', '2', '1'), ('(1)a', '1', '')]
Update: I am looking for a solution that uses only the re module.
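One direction that seems to stay within re, offered as a sketch rather than a definitive answer: since a lookahead captures without consuming, the whole "(n)a(m)" group can be captured inside a lookahead, and only the leading bracket plus the a is then actually consumed. The name pattern_4 is hypothetical and not one of the attempts above; the sketch also assumes the text always has the shape shown in the example.

```python
import re

text = 'a(1)a(2)a(1)a'

# pattern_4 is a hypothetical name introduced for this sketch.
# Part 1 (lookahead): captures the optional leading bracket, the 'a', and the
#   optional trailing bracket into groups 1-3 without consuming anything.
# Part 2 (consuming): advances past the optional leading bracket and the 'a'
#   only, so the trailing bracket is still available to start the next match.
pattern_4 = r'(?=((?:\((\d+)\))?a(?:\((\d+)\))?))(?:\(\d+\))?a'

result = re.findall(pattern_4, text)
print(result)
# [('a(1)', '', '1'), ('(1)a(2)', '1', '2'), ('(2)a(1)', '2', '1'), ('(1)a', '1', '')]
```

The consuming part is what keeps the matches from starting at every 'a': if the entire pattern were wrapped in the lookahead alone, a match such as ('a(2)', '', '2') would also be reported at the position of the second a.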