What happened
To understand why the lookbehind behaves in a seemingly incoherent way, remember that the regex engine goes from left to right and returns the first match it finds.
Let's look at the steps it takes to match (?<=ab|a)\w+ on abc:
- the engine starts at a. There isn't anything before it, so the lookbehind fails
- the transmission kicks in, and the engine now considers a match starting from b
- the lookbehind tries the first item of the alternation (ab), which fails
- ... but the second item (a) matches
- \w+ matches the rest of the string
The overall match is therefore bc, and the regex engine hasn't broken any of its rules in the process.
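The walkthrough can be checked with a few lines of C#. This is only a minimal sketch: the class name LookbehindDemo is an illustrative choice, and the pattern and input are the ones discussed above.

```csharp
using System;
using System.Text.RegularExpressions;

class LookbehindDemo
{
    static void Main()
    {
        // At index 0 nothing precedes "a", so the lookbehind fails there.
        // At index 1 the "ab" branch fails but the "a" branch succeeds,
        // and \w+ then consumes the rest of the input.
        Match m = Regex.Match("abc", @"(?<=ab|a)\w+");

        Console.WriteLine($"{m.Index}: {m.Value}"); // prints "1: bc"
    }
}
```

The match starts at index 1 because that is the leftmost position where any branch of the lookbehind succeeds; the engine never gets a chance to prefer the longer ab branch at index 2.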
How to fix it
If C# supported the \K escape sequence, you could just use the greediness of ? to do the work for you:
string1(?:optionalstring2)?\K\w+
However, this (sadly) isn't the case. It therefore seems that you are stuck with using a capturing group:
string1(?:optionalstring2)?(\w+)
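As a rough sketch of the capturing-group approach in C#, the example below uses a and b as stand-ins for string1 and optionalstring2. Those stand-ins and the class name CapturingGroupDemo are assumptions made here to mirror the abc example, not part of the original pattern.

```csharp
using System;
using System.Text.RegularExpressions;

class CapturingGroupDemo
{
    static void Main()
    {
        // "a" plays the role of string1, "b" the role of optionalstring2.
        // The greedy ? consumes "b" when it is present, so the capturing
        // group only receives what follows the prefix.
        Match m = Regex.Match("abc", @"a(?:b)?(\w+)");

        if (m.Success)
        {
            // Read group 1 rather than m.Value, which also contains the prefix.
            Console.WriteLine(m.Groups[1].Value); // prints "c"
        }
    }
}
```

Note that the optional part can still be given back through backtracking: on the input ab, \w+ needs at least one character, so the group captures b.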