Plug and Play
^(?!01$)(?!0011$)(?!000111$)(?!00001111$)(?=[01]{1,8}$)0*1*$
You cannot perfectly translate this to a regular expression, but you can get close, by ensuring that the input does not have equal number of 0
and 1
. This matches up to 8 digits.
How it works
^
first you start from the beginning of a line
(?!01$)
ensure that the characters are not 01
(?!0011$)
ensure that the characters are not 0011
- the same for
000111
and 00001111
- then ensure that there are from
1
to 8
zeroes and ones (this is needed, to ensure that the input is not made of more digits like 000000111111
, because their symmetry is not verified)
- then match these zeroes and ones till the end of the line
- for longer inputs you need to add more text, for up to 10 digits it is this:
^(?!01$)(?!0011$)(?!000111$)(?!00001111$)(?!0000011111$)(?=[01]{1,10}$)0*1*$
(you jump by 2 by adding one more symmetry validation)
- it is not possible by other means with regular expressions alone, see the explanation.
Explanation
The A
and B
are easy, as you saw 0+
and 1+
. The concatenations in S
after the first also are easy: 00+
, 0
, 11+
, 1
, that all mixed into one lead to (0+|1+)
. The problem is with the first concatenation 0S1
.
So the problem can be shorten to S = 0S1
. This grammar is recursive. But neither left linear
nor right linear
. To recognize an input for this grammar you will need to "remember" how many 0
you found, to be able to match the same amount of 1
, but the finite-state machines that are created from the regular grammars (often and from regular expressions) do not have a computation history. They are only states and transitions, and the machinery "jumps" from one state to the other and does not remember the "path" traveled over the transitions.
For this reason you need more powerful machinery (like the push-down automaton) that can be constructed from a context-free grammar (as yours).