I have a string, s="abaaaababbb".

I am using the findall method and want to find all occurrences of (ab)+. The code I am using is:

import re
s = "abaaaababbb"
x = re.findall("[ab]+",s)
print(x)

Output: ['abaaaababbb']

Instead, I want output like: ['ab', 'abab']

How do I write the correct regular expression for this?

Hamed Ghasempour

1 Answer

The regex you mentioned in your question ((ab)+) is almost correct.

You just need to make the capturing group a non-capturing one:

(?:ab)+

This is because findall will return all the groups (as opposed to all the matches) if you have any capturing groups in the regex.
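
To make the difference concrete, here is a minimal sketch using the string from the question. The variable names are only illustrative, and the `finditer` line is an optional alternative for cases where you want to keep a capturing group but still get the whole matches:

```python
import re

s = "abaaaababbb"

# With a capturing group, findall returns what the group captured
# (the last repetition of the group), not the whole match.
capturing = re.findall(r"(ab)+", s)
print(capturing)        # ['ab', 'ab']

# With a non-capturing group, findall returns each whole match.
non_capturing = re.findall(r"(?:ab)+", s)
print(non_capturing)    # ['ab', 'abab']

# Alternative: keep the capturing group and ask each match object
# for group(0), which is always the whole match.
via_finditer = [m.group(0) for m in re.finditer(r"(ab)+", s)]
print(via_finditer)     # ['ab', 'abab']
```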

Sweeper
  • I don't understand what a capturing group is. Also, what if, in place of 'ab', I want to match '()'? –  Jul 03 '19 at 07:14
  • @SomShekharMukherjee you’d still need a non-capturing group. You’d also need to escape the parentheses, so `(?:\(\))+`. You should really learn more about regex first. – Sweeper Jul 03 '19 at 07:16
  • @SomShekharMukherjee See [the regex reference](https://stackoverflow.com/q/22937618/5133585) – Sweeper Jul 03 '19 at 07:17
  • Thanks a lot, I found what I was looking for (y) –  Jul 03 '19 at 07:26
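
For the '()' case raised in the comments, a short sketch of the escaped pattern in use. The sample string `text` is made up for illustration, and the `re.escape` variant is just one optional way to avoid escaping the parentheses by hand:

```python
import re

text = "()()x()"  # hypothetical sample input

# Literal parentheses must be escaped; the group stays non-capturing
# so that findall returns whole matches.
paren_runs = re.findall(r"(?:\(\))+", text)
print(paren_runs)   # ['()()', '()']

# Same pattern, built with re.escape instead of manual backslashes.
pattern = "(?:" + re.escape("()") + ")+"
escaped_runs = re.findall(pattern, text)
print(escaped_runs) # ['()()', '()']
```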