Capturing repeated groups into different groups

Question

I want to capture each repetition of a repeated group in Python as a separate element:

import re

match = re.match(r'!((?:abc|123)+)!', '!abc123abc!').groups()
print(match)
print(len(match))

This gives back a tuple with a single element:

('abc123abc',)
1

How can I get the following output?

('abc', '123', 'abc',)
3

Following this helpful article on capturing repeated groups, I now understand my earlier problem: I was repeating a capturing group instead of capturing a repeated group. But I still don't understand how, or whether, it is possible to capture each repetition as a separate group for easier post-processing.

Please note that I cannot drop the prefix and suffix, because they also contain multiple capturing groups. My actual use case differs slightly from this MWE, but the example should make the problem clear enough.

Try `>>> re.findall(r'abc|123', x)` `['abc', '123', 'abc']`. — han solo, Mar 20 '19 at 14:51
This would work, but in the case I want to use it, I cannot do without the pre-/suffix, which also contains several capturing groups. — Felix, Mar 20 '19 at 14:56
`re.findall` returns a `list`. You cannot use `groups()` on `re.findall`. Are you talking about `re.search` ? Note, `re.match` will match from `start` — han solo, Mar 20 '19 at 14:58
Use the PyPI regex module and grab `.captures(1)` with `r'!(abc|123)+!'` — Wiktor Stribiżew, Mar 20 '19 at 15:00
@WiktorStribiżew This gives back `['abc123abc']`, not `['abc', '123', 'abc']`. — Felix, Mar 20 '19 at 15:02
@hansolo I fixed my earlier response, because I noticed too late my erroneous usage of `groups()`. — Felix, Mar 20 '19 at 15:03
No way, `regex.match(r'!(abc|123)+!', '!abc123abc!').captures(1)` [yields expected output](https://rextester.com/RFXH47168), `['abc', '123', 'abc']` — Wiktor Stribiżew, Mar 20 '19 at 15:04
Sorry but I skipped your change on the regex. Thank you it works! — Felix, Mar 20 '19 at 15:07

score 0 · Answer 1 · answered Mar 20 '19 at 15:13

This follows the correct answer from @WiktorStribiżew's comments, while staying closer to the code in my original question:

import regex as re
match = re.match(r'!(abc|123)+!', '!abc123abc!').captures(1)
print(match)
print(len(match))

This correctly outputs:

['abc', '123', 'abc']
3

This works because the regex module (not to be confused with Python's built-in re module) keeps every match of a repeated group when you call the captures() method. Instead of overwriting the group's value on each repetition (the behaviour outlined in the article linked in my original question), it appends each match to a list for that group. This is described in the section Notes on named capture groups in the regex package's official documentation.
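For comparison, here is a minimal sketch of the overwriting behaviour using only the standard library, followed by a possible two-step workaround for cases where installing regex is not an option. The `(!)` prefix and suffix groups are hypothetical stand-ins for the real capturing groups mentioned in the question, not the actual pattern:

```python
import re

text = '!abc123abc!'

# Standard library: a repeated capturing group keeps only its final iteration
# (here the trailing 'abc'); the earlier 'abc' and '123' are overwritten.
last_only = re.match(r'!(abc|123)+!', text).group(1)
print(last_only)

# Possible workaround: capture the whole repeated run as one group,
# then split that run in a second pass.
# The (!) groups are assumed placeholders for the real prefix/suffix groups.
m = re.match(r'(!)((?:abc|123)+)(!)', text)
prefix, run, suffix = m.groups()
parts = re.findall(r'abc|123', run)
print(parts)
print(len(parts))
```

The second pass reuses the same alternation as the repeated group, so the pieces it returns line up with what captures(1) yields in the regex version above.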
