Why does this regex only return the last match?

Question

I don't understand why this regex only returns the last match:

import re
text = """
Chicken chicken chicken chicken chicken chicken.

#=================
# @title   Chicken
# @author  Me
#=================
Chicken chicken chicken.
"""

rx = r"#=+\n(?:#\s*@(\w+)\s+(.*)\n)+#=+"
for match in re.finditer( rx, text ):
    print(match.groups())

# Output:
# ('author', 'Me')

I would expect this regex to return [ ('title', 'Chicken'), ('author', 'Me') ], but it seems to only return the last match. This does not change if I set the flag re.M (multiline), and the flag re.DOTALL is not what I intend here.

For clarity, you can visualise the regex here; it seems to be what I intended, namely:

From the first comment line #===...
Find and capture the next lines with the format # @(word) (anything)
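
A smaller case may make the behaviour easier to see. This is only a sketch on a made-up string (the names `last`, `whole` and `items` are illustrative, not from the code above): when a capturing group sits inside a repeated part of the pattern, Python's `re` keeps only the final iteration of that group, so one common workaround is to capture the whole repetition and split it in a second step.

```python
import re

# A capturing group inside a repetition is overwritten on every iteration,
# so only the last iteration is visible afterwards.
last = re.match(r"(?:(\w+),)+", "a,b,c,")
print(last.groups())  # ('c',)

# Workaround sketch: capture the entire repeated run, then extract the
# individual items from it with a second pattern.
whole = re.match(r"((?:\w+,)+)", "a,b,c,")
items = re.findall(r"(\w+),", whole.group(1))
print(items)  # ['a', 'b', 'c']
```

The same thing happens in the pattern above: `(\w+)` and `(.*)` are inside `(?: ... )+`, so each comment line overwrites the previous capture.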

@DietrichEpp But the non-capturing group (`(?: ...)`) is not overlapping here, is it? — Jonathan H, Sep 17 '17 at 21:36
check https://stackoverflow.com/questions/5060659/python-regexes-how-to-access-multiple-matches-of-a-group — Ben, Sep 17 '17 at 21:38
The grouping doesn't matter--what matters is the whole regex. Because the regex starts with `#=+\n` and ends with `#=+`, it can only match once between the `#==` and `#==` lines. — Dietrich Epp, Sep 17 '17 at 21:42
@DietrichEpp Well, I understand the problem, but does that mean I cannot do this with regexes? The suggestion of @coldspeed does not achieve the same thing; is there a way to enforce the constraint that matching lines should be just below the line `#===`? — Jonathan H, Sep 17 '17 at 21:54
Right, apparently this is a [known issue](https://bugs.python.org/issue7132) for which a fix has been deemed unnecessary. — Jonathan H, Sep 17 '17 at 21:59
I managed to do it with two regexes instead. The first one `rx1 = r"#=+(.*?)#=+"` used with the flag `re.DOTALL` extracts the blocks of comments fenced by `#==` lines. The second one `rx2 = r"(?:#\s*@(\w+)\s+(.+))"` matches individual lines within each block. — Jonathan H, Sep 17 '17 at 22:15
Actually, I think the comment by @n611x007 in [this post](https://stackoverflow.com/q/4963691/472610) is a "better" duplicate. — Jonathan H, Sep 17 '17 at 22:18
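
For reference, the two-regex approach described in the comments could look roughly like the sketch below. `rx1` and `rx2` are the patterns quoted in that comment; the variable `results` and the loop structure are assumptions made for illustration.

```python
import re

text = """
Chicken chicken chicken chicken chicken chicken.

#=================
# @title   Chicken
# @author  Me
#=================
Chicken chicken chicken.
"""

# Step 1: extract each comment block fenced by "#===" lines.
# re.DOTALL lets "." cross newlines so the block body can span lines.
rx1 = re.compile(r"#=+(.*?)#=+", re.DOTALL)

# Step 2: match individual "# @key value" lines inside a block.
# No DOTALL here, so "(.+)" stops at the end of the line.
rx2 = re.compile(r"(?:#\s*@(\w+)\s+(.+))")

results = []
for block in rx1.finditer(text):
    # findall returns one (key, value) tuple per matching line
    results.append(rx2.findall(block.group(1)))

print(results)  # [[('title', 'Chicken'), ('author', 'Me')]]
```

Note that `\s+` in `rx2` can also match a newline, so a tag line with no value would pull in the start of the following line; tightening it to `[ \t]+` may be worth considering if that case can occur.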
