
I need a construction that boils down to this: "at least 1 occurrence of (substring) a, followed by the SAME number of occurrences of (substring) b". So "ab", "aaabbb" and "aaaaabbbbb" are accepted, "aab" or "aaabbbb" are not.

I found that if the number of occurrences were fixed (say 5) I could use re.compile("a{5}b{5}"), but I don't know the number of occurrences in advance. I just need them to be equal. I tried re.compile("a{x}b{x}"), but that was wishful thinking I guess.

1 Answer


The built-in re module does not support recursion or balanced groups, and before Python 3.11 it did not support possessive quantifiers or atomic groups either. These are the features that could help you build this pattern as a single regex.

Thus, the easiest solution is to install the PyPI regex library with pip install regex and then use the solution from "How can we match a^n b^n?".

Otherwise, throw in some Python code and a simple (a+)(b+) regex:

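import re

texts = ["ab", "aaabbb", "aaaaabbbbb", "aab", "aaabbbb"]
for text in texts:
    # fullmatch returns None when the string is not a's followed by b's
    match = re.fullmatch(r'(a+)(b+)', text)
    # Guard against None before comparing the two captured run lengths
    if match and len(match.group(1)) == len(match.group(2)):
        print(text, '- MATCH')
    else:
        print(text, '- NO MATCH')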

This prints:

ab - MATCH
aaabbb - MATCH
aaaaabbbbb - MATCH
aab - NO MATCH
aaabbbb - NO MATCH

NOTE:

  • re.fullmatch(r'(a+)(b+)', text) matches only if the entire string consists of one or more a characters followed by one or more b characters; otherwise it returns None
  • len(match.group(1)) == len(match.group(2)) compares the length of the run of a characters with the length of the run of b characters, so only strings where the two counts are equal pass.
Wiktor Stribiżew
  • thank you for this answer Wiktor. Does this also work when a and b are strings of different lengths? So when e.g. a corresponds to 'xyy' and b corresponds to 'kkkmmmmnn'. Does the len(match.group) part count the occurrences of the string or does it return the cumulative length? – Dave Mollet Jun 04 '21 at 06:46
  • @DaveMollet Yes, you just need to group the char sequences, i.e. `a = 'xyy'`, `b = 'kkkmmmmnn'` and then `re.fullmatch(fr'((?:{a})+)((?:{b})+)', text)`. – Wiktor Stribiżew Jun 04 '21 at 09:28
  • Hi Wiktor, I will do that. Thanks for the help! – Dave Mollet Jul 01 '21 at 21:47
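
On the question of substrings with different lengths: len(match.group(n)) returns the number of characters captured, not the number of repetitions, so comparing the two lengths directly only works when a and b are equally long. A minimal sketch of one way to handle this, using a hypothetical helper named equal_repeats (not part of re), divides each captured length by the length of its substring. It assumes a and b are non-empty and cannot be confused with each other (for example, b does not start with a repetition of a), since the greedy match would otherwise split the text in the wrong place.

```python
import re

def equal_repeats(text, a, b):
    """Return True if text is n copies of a followed by n copies of b (n >= 1).

    equal_repeats is a hypothetical helper name used for illustration.
    Assumes a and b are non-empty and do not overlap ambiguously.
    """
    # re.escape keeps any regex metacharacters inside a or b literal
    pattern = rf'((?:{re.escape(a)})+)((?:{re.escape(b)})+)'
    match = re.fullmatch(pattern, text)
    if match is None:
        return False
    # len(group) is a character count, so divide by the substring length
    # to get the number of repetitions on each side
    return len(match.group(1)) // len(a) == len(match.group(2)) // len(b)

a, b = 'xyy', 'kkkmmmmnn'
for text in [a + b, a * 2 + b * 2, a * 2 + b, 'ab']:
    print(text, '- MATCH' if equal_repeats(text, a, b) else '- NO MATCH')
```

With single-character a and b the helper reduces to the length comparison used in the answer.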