So I have the following text:
a
111
b
222
c
333
d
and I want to capture everything between these alphabetical delimiters. So I tried:
import re
test_str = r"""a
111
b
222
c
333
d
"""
res = re.findall(r"[a-z]\n([\d\D]+?)\n[a-z]", test_str)
Note that [\d\D] matches any character, including newlines, because in real examples the content in between may be complicated and span many lines. Anyway, my expected output is
['111', '222', '333']
but instead, the actual result is
['111', '333']
My guess is that once the first occurrence a\n111\nb is matched, it is consumed and no longer takes part in subsequent matching, so b cannot serve as the opening delimiter for 222, and 222 is skipped.
Is there any simple way to capture contents between such consecutive delimiters? Thanks in advance.
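One possible approach, offered as a sketch rather than a definitive answer: re.findall returns non-overlapping matches, so a closing delimiter that has been consumed cannot start the next match. Putting the closing delimiter inside a lookahead (?=...) checks for it without consuming it. The function name between_delimiters is hypothetical, and the pattern assumes each delimiter is a single lowercase letter on a line of its own.

```python
import re

# Assumption: a delimiter is a single lowercase letter occupying a whole line.
# (?=[a-z]$) is a lookahead: it requires the closing delimiter to follow but
# does not consume it, so the same line can open the next match.
PATTERN = re.compile(r"^[a-z]\n(.+?)\n(?=[a-z]$)", re.MULTILINE | re.DOTALL)


def between_delimiters(text):
    """Return the blocks of text found between consecutive delimiter lines."""
    return PATTERN.findall(text)


test_str = """a
111
b
222
c
333
d
"""

res = between_delimiters(test_str)
print(res)  # ['111', '222', '333']
```

re.DOTALL lets . match newlines, so multi-line content is captured as one block, and re.MULTILINE makes ^ and $ match at line boundaries so that a letter inside a longer content line is not mistaken for a delimiter.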