0

Is it possible to construct a regex that matches a pattern multiple times?

For example, searching for `ff` in `fff` should give two overlapping matches, starting at positions 0 and 1 respectively.

Frej Connolly

2 Answers

2

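Yes, it is possible. You can use a positive lookahead for this: `f(?=f)` matches an `f` only when another `f` follows, and because the lookahead consumes no characters, the next match can start at that following `f`.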

>>> import re
>>> [m.start() for m in re.finditer(r'f(?=f)', 'fff')]
[0, 1]
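
A possible variation, sketched here as an assumption rather than something stated in the answer: if the matched text is needed as well as the position, the whole pattern can be placed in a capturing group inside the lookahead. The helper name `overlapping_matches` is hypothetical, and the sketch assumes `pattern` is a valid regex fragment.

```python
import re

def overlapping_matches(pattern, text):
    """Return (start, matched_text) pairs, including overlapping matches."""
    # The lookahead itself matches the empty string, so no characters are
    # consumed; the capturing group inside it records what would have matched.
    lookahead = re.compile('(?=(' + pattern + '))')
    return [(m.start(), m.group(1)) for m in lookahead.finditer(text)]

print(overlapping_matches(r'ff', 'fff'))  # [(0, 'ff'), (1, 'ff')]
```

Here `m.group(1)` is the text matched by the wrapped pattern; any groups inside `pattern` are shifted by one.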
Ashwini Chaudhary
  • The function `finditer` only returns the `non-overlapping` instances. "Return an iterator yielding MatchObject instances over all non-overlapping matches for the RE pattern in string" see http://docs.python.org/2/library/re.html –  Feb 09 '14 at 18:07
  • @Desolator Have you read the docs of [`re.findall`](http://docs.python.org/2/library/re.html#re.findall) that you used in your answer? – Ashwini Chaudhary Feb 09 '14 at 18:10
  • The drawback is `m.groups()` is no longer very useful – nodakai Feb 09 '14 at 18:12
  • This works great on small instances, but it is too slow when the pattern is more than 70,000 characters long and the text is large (6-7 MB). – Frej Connolly Feb 09 '14 at 18:18
0

Yes. Use findall(string[, pos[, endpos]])

Similar to the findall() function, using the compiled pattern, but also accepts optional pos and endpos parameters that limit the search region like for match().

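That is, each search begins at `m.start() + 1`, where `m` is the previous match. Note that `findall()` returns strings rather than match objects, so `m.start()` has to come from the compiled pattern's `search(string[, pos[, endpos]])` method, which accepts the same `pos` parameter.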
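
The idea of restarting one character after the previous match start could be sketched as follows. This is a minimal illustration, not code from the answer: the function name `overlapping_starts` is hypothetical, and it uses the compiled pattern's `search()` with `pos`, since that returns a match object with a `start()` method.

```python
import re

def overlapping_starts(pattern, text):
    """Return the start offsets of all matches, allowing overlaps."""
    compiled = re.compile(pattern)
    starts = []
    pos = 0
    # Stop once pos runs past the end of the text; this also prevents an
    # endless loop for patterns that can match the empty string.
    while pos <= len(text):
        m = compiled.search(text, pos)
        if m is None:
            break
        starts.append(m.start())
        # Resume one character after the start of the previous match,
        # not after its end, so overlapping matches are found too.
        pos = m.start() + 1
    return starts

print(overlapping_starts(r'ff', 'fff'))  # [0, 1]
```

Unlike the lookahead approach, each `m` here is an ordinary match of the full pattern, so `m.group()` and `m.groups()` behave as usual.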