Why are the outputs of re.findall and re.finditer different when using parentheses?

Question

In Python 3.6, why does re.findall return different items than re.finditer in the following example?

import re

text = "He was carefully disguised but captured quickly by 10 caps."

print(re.findall(r"ca(p)", text))

for i in re.finditer(r"ca(p)", text):
    print(i)

findall returns ['p', 'p'], while finditer returns two matches of "cap". It happens only when I use parentheses!

You mean the parentheses at `(p)`? You do know that these are capturing groups? Read the docs on both methods to figure out how they deal with capturing groups. — Sebastian Simon, Jun 15 '18 at 07:51

score -1 · Answer 1 · answered Jun 15 '18 at 07:55

finditer returns an iterator of match objects, each of which contains all the captured groups. When you print a match object, its representation shows group 0, which is the whole matched string.

If you want the string matched by a capturing group, use the group() method, which takes the group number as its argument.

for i in re.finditer(r"ca(p)", text):
    print(i.group(1))
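
To make the difference between the whole match and a captured group concrete, here is a minimal sketch using the text from the question. The variable names `matches`, `whole`, and `inner` are only illustrative.

```python
import re

text = "He was carefully disguised but captured quickly by 10 caps."

matches = list(re.finditer(r"ca(p)", text))

for m in matches:
    # group(0) is the whole match; group(1) is the first capturing group.
    print(m.group(0), m.group(1), m.span())

whole = [m.group(0) for m in matches]  # whole matched strings
inner = [m.group(1) for m in matches]  # only what the parentheses captured
```

Here `whole` holds the two "cap" strings, while `inner` holds the two "p" strings, which is the same list that findall gives for this pattern.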

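On the other hand, when the pattern contains a capturing group, re.findall() returns a list of the strings matched by that group instead of the whole matches (with several groups, it returns a list of tuples; with no groups, it returns the whole matches). For this single-group pattern it is roughly equivalent to the following finditer() code: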

[i for m in re.finditer(r"ca(p)", text) for i in m.groups()]
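
As a quick check of how the result of findall depends on the groups in the pattern, the sketch below runs four variants against the same text. The patterns other than `ca(p)` are added here for comparison and are not from the question.

```python
import re

text = "He was carefully disguised but captured quickly by 10 caps."

no_group = re.findall(r"cap", text)           # no groups: whole matches
one_group = re.findall(r"ca(p)", text)        # one group: that group's text
non_capturing = re.findall(r"ca(?:p)", text)  # (?:...) does not capture: whole matches
two_groups = re.findall(r"(c)a(p)", text)     # several groups: list of tuples

print(no_group)       # ['cap', 'cap']
print(one_group)      # ['p', 'p']
print(non_capturing)  # ['cap', 'cap']
print(two_groups)     # [('c', 'p'), ('c', 'p')]

# For the single-group pattern, findall matches group(1) of each finditer match.
via_finditer = [m.group(1) for m in re.finditer(r"ca(p)", text)]
print(via_finditer == one_group)  # True
```

If you need parentheses only for grouping and still want findall to return the whole match, the non-capturing form `(?:...)` is the usual choice.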

Mazdak


Please explain the reason for your downvote! – Mazdak Jun 15 '18 at 08:51
Thank you. It seems that I need to go through group, groups, and `findall()` functionality and role of parentheses in order to understand it. BTW, I didn't give the down vote. – Mohammad Jun 17 '18 at 05:11
