3

Given a string such as aaabbb, is there a way to write a regex that finds the number of substrings of the form ab, aabb, aaabbb?

I have been doing it by constructing the regex [a]{m}[b]{m} and then iterating over a range of values for m, but I would like to know whether it can be done in a single shot.
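For reference, the loop described above could be sketched roughly as follows. This is only one reading of it: the function name count_by_iteration and the upper bound len(text) // 2 are assumptions, not part of the original attempt.

```python
import re

def count_by_iteration(text):
    """Count substrings a^m b^m by trying one regex per value of m."""
    total = 0
    # A balanced substring of length 2*m cannot be longer than the text.
    for m in range(1, len(text) // 2 + 1):
        pattern = "[a]{%d}[b]{%d}" % (m, m)
        # For a fixed m the matches cannot overlap, so findall sees them all.
        total += len(re.findall(pattern, text))
    return total

print(count_by_iteration("aaabbb"))  # 3
```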

Chaitanya Sama

2 Answers

1

As mentioned in a comment, ^(?:a(?=a*(\1?+b)))+\1$ matches such a balanced construct using widespread regex functionality.

Demo

Full explanation here.
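The pattern relies on possessive quantifiers and on a backreference placed inside its own group, which not every engine accepts. Where that is a problem, the whole-string check can be approximated with a plainer pattern and a length comparison. The sketch below is a swapped-in technique rather than the regex above; it assumes Python's re module, and is_balanced is a hypothetical helper name.

```python
import re

def is_balanced(text):
    """Return True if text is exactly m a's followed by m b's, with m >= 1."""
    # Capture the run of a's and the run of b's separately.
    match = re.fullmatch(r"(a+)(b+)", text)
    # The string is balanced only when both runs have the same length.
    return match is not None and len(match.group(1)) == len(match.group(2))

print(is_balanced("aaabbb"))  # True
print(is_balanced("aaabb"))   # False
```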

That being said, if you want to list all overlapping substrings that match a balanced construct, you could use (?=((?:a(?=a*(\2?+b)))+\2)):

(?=                         # A lookahead does not consume what it matches, so matching restarts at the next position even after a match has been found.
  (                         # Capturing group 1 lets you retrieve the matched substring.
    (?:a(?=a*(\2?+b)))+\2   # The outer capturing group is numbered 1, so the inner group becomes group 2 and the backreferences are rewritten as \2.
  )
)

Demo
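In an engine without possessive quantifiers or nested backreferences, the same list can be built with a simpler pattern plus a little arithmetic. The sketch below is a swapped-in technique, not the regex above: it assumes Python's re module, and balanced_substrings is a hypothetical helper name. It finds each run of a's followed directly by a run of b's, then takes the substrings that straddle the boundary between the two runs.

```python
import re

def balanced_substrings(text):
    """Return every substring of the form a^m b^m (m >= 1) found in text."""
    found = []
    # Each match is a maximal run of a's immediately followed by a run of b's.
    for match in re.finditer(r"(a+)(b+)", text):
        boundary = match.start(2)  # index where the b's begin
        limit = min(len(match.group(1)), len(match.group(2)))
        # Every balanced substring takes m a's and m b's around the boundary.
        for m in range(1, limit + 1):
            found.append(text[boundary - m:boundary + m])
    return found

print(balanced_substrings("aaabbb"))  # ['ab', 'aabb', 'aaabbb']
```

The number of substrings asked for in the question is then len(balanced_substrings(text)).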

PJProudhon
0

This is not an exact solution, just a hint to help you. Feel free to build on this code if it is useful.

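import re

s = 'aaabbb'

def _all_sub(_s, _ss):
    # _ss holds the substring lengths still to try; stop when none are left.
    if not _ss:
        return 0
    # Slice a window of length _ss[0] at every offset and search for it in s.
    # Windows near the end of the string come out shorter than _ss[0].
    for i in range(len(_s)):
        pattern = _s[i:i + _ss[0]]
        print(re.search(pattern, s))
    # Recurse with the remaining lengths.
    return _all_sub(_s, _ss[1:])

# Prints one match object per window, then the final return value 0.
print(_all_sub(s, list(range(len(s)))))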

Sample output:

....
    <_sre.SRE_Match object; span=(3, 6), match='bbb'>
    <_sre.SRE_Match object; span=(3, 5), match='bb'>
    <_sre.SRE_Match object; span=(3, 4), match='b'>
    <_sre.SRE_Match object; span=(0, 4), match='aaab'>
    <_sre.SRE_Match object; span=(1, 5), match='aabb'>
    <_sre.SRE_Match object; span=(2, 6), match='abbb'>
    <_sre.SRE_Match object; span=(3, 6), match='bbb'>
    <_sre.SRE_Match object; span=(3, 5), match='bb'>
    <_sre.SRE_Match object; span=(3, 4), match='b'>
    <_sre.SRE_Match object; span=(0, 5), match='aaabb'>
    <_sre.SRE_Match object; span=(1, 6), match='aabbb'>
    <_sre.SRE_Match object; span=(2, 6), match='abbb'>
    <_sre.SRE_Match object; span=(3, 6), match='bbb'>
    <_sre.SRE_Match object; span=(3, 5), match='bb'>
    <_sre.SRE_Match object; span=(3, 4), match='b'>
....
Aaditya Ura