Yesterday I asked a question latest Perl won't match certain regexes more than 32768 characters long to explain why
>./perl -e 'print (("a".("f"x32768)."a") =~ /a(?:[^a]|bb)*a/)'
does not work as expected (does not match) and the answer was, because of a backtracking depth limitation/bug of Perl regexes.
Why does the above match involve backtracking? The first a
is matched, then, all the f
s are matched by [^a]*
, no need for the alternative bb
, and then the last a
is not [^a]
, neither bb
, then the engine stops matching ([^a]|bb)*
and then matches the last a
and done. No backtracking needed.
The bug, occurs, when there is too much backtracking. OK, so the whole thing should work like this. The regex engine starts to match, proceeds to the end without backtracking, finishes with a match, no backtracking needed, the bug is not involved. Why doesn't it work that way?