I wrote a Python regex for the following condition: a pattern that identifies the language over the alphabet {a, b} of all strings in which each 'b' is preceded by at least one 'a'.

import re

s = '''
a
aaaa
ab
aba
abaabaaaab
b
abb
bba
'''

regex = re.finditer(r"^([aA]+[bB]?)+", s, re.M)
for i in regex:
    print(i.group())

I'm getting 'ab' in the output from 'abb' on the 7th line of the multi-line string, which should not happen; I don't want it in the output. What change to the regular expression fixes this?

1 Answer

Add $ to the end of your regex:

^([aA]+[bB]?)+$

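While ^ marks the start of a line, $ marks its end (both apply per line here because of re.M). This forces a match over the entire line, not just part of it. Without $, the pattern is free to match only the prefix 'ab' of 'abb' and stop there, which is where the unwanted output comes from.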
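For reference, here is a minimal sketch of the question's script with the anchored pattern, so the effect is easy to check. The names `pattern`, `matches`, and `per_line` are only illustrative, and the `re.fullmatch` variant is an alternative I'm assuming would suit you if you'd rather test each line separately:

```python
import re

s = '''
a
aaaa
ab
aba
abaabaaaab
b
abb
bba
'''

# With re.M, ^ and $ anchor at every line boundary, so each match
# has to cover a whole line rather than just a prefix of it.
pattern = re.compile(r"^([aA]+[bB]?)+$", re.M)
matches = [m.group() for m in pattern.finditer(s)]
print(matches)

# Alternative (an assumption, not part of the original script):
# split into lines and require the pattern to match each line in full.
per_line = [line for line in s.splitlines()
            if re.fullmatch(r"([aA]+[bB]?)+", line)]
print(per_line)
```

Both lists should contain only a, aaaa, ab, aba and abaabaaaab; 'b', 'abb' and 'bba' are rejected.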

Luka Mesaric