How to express a regular expression over the alphabet {a, b, c} that doesn't contain the contiguous substring baa?
Shangchih Huang
1 Answer
If your regex flavor supports negative lookaheads, then it's relatively simple. For example, in PHP it looks like this:
^(?:(?!baa)[abc])*$
Demo here.
Explanation:
- `^...$` makes sure we match the entire line
- `[abc]` is a character class that defines the alphabet
- `(?!baa)` is the negative lookahead. At every position it checks whether the text that follows is `baa`. If it is, then it's not a match
- finally, we group these two with a non-capturing group, `(?:...)`, and repeat them as many times as fits into the line: `(?:...)*`
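To sanity-check the pattern outside PHP, here is a minimal sketch in Python, whose `re` module accepts the same lookahead syntax. The helper names `avoids_baa` and `check_all` are made up for this illustration, and the brute-force comparison against a plain substring test only covers short strings, so treat it as a spot check rather than a proof.

```python
import re
from itertools import product

# Same pattern as above: before consuming each character, refuse to go on
# if the text starting at the current position is "baa".
NO_BAA = re.compile(r"^(?:(?!baa)[abc])*$")


def avoids_baa(s: str) -> bool:
    """Return True if s uses only a, b, c and has no substring 'baa'."""
    # fullmatch also rejects a trailing newline, which `$` alone tolerates
    # in Python.
    return NO_BAA.fullmatch(s) is not None


def check_all(max_len: int = 6) -> bool:
    """Compare the regex with a plain substring test on every short string."""
    for n in range(max_len + 1):
        for chars in product("abc", repeat=n):
            s = "".join(chars)
            if avoids_baa(s) != ("baa" not in s):
                return False
    return True


if __name__ == "__main__":
    for sample in ["", "abcab", "baa", "cbaac", "abd"]:
        print(repr(sample), avoids_baa(sample))
    print("agrees with substring test:", check_all())
```

The empty string is accepted because the group may repeat zero times; if that is not wanted, `*` could be changed to `+`.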
Update
Updated the demo and the regex according to ClasG's comment. Indeed, to make sure it fails for a simple `baa`, the lookahead must come first, then the character class.
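The difference between the two orderings can be seen with a small Python sketch (the variable and function names are only for illustration). With the lookahead placed after the character class, the check is applied from position 1 onwards and never at position 0, so a `baa` at the very start of the line slips through.

```python
import re

# Lookahead before the character class: every position is checked.
lookahead_first = re.compile(r"^(?:(?!baa)[abc])*$")
# Lookahead after the character class: position 0 is never checked.
lookahead_last = re.compile(r"^(?:[abc](?!baa))*$")


def compare(s: str) -> tuple[bool, bool]:
    """Return (lookahead-first matches, lookahead-last matches) for s."""
    return (
        lookahead_first.match(s) is not None,
        lookahead_last.match(s) is not None,
    )


if __name__ == "__main__":
    for sample in ["baa", "baac", "abaa", "abc"]:
        print(sample, compare(sample))
```

Both versions reject `abaa`, where `baa` starts at position 1, but only the lookahead-first version rejects `baa` and `baac`.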
-
[`^(?:[abc](?!baa))*$`](https://regex101.com/r/2jIewo/1/) matches `baa`. – Wiktor Stribiżew Nov 17 '17 at 09:35
-
Swap the tests - `^(?:(?!baa)[abc])*$` and it'll work. I.e. look-ahead prior to character class. – SamWhan Nov 17 '17 at 09:38
-
@WiktorStribiżew The question isn't a dup of the Q you say (though it lacks a lot). To use the answer, it may be helpful though. – SamWhan Nov 17 '17 at 09:40
-
@ClasG thanks, updated – Tamas Rev Nov 17 '17 at 09:52