Some of regular expression have exponential time of execution due to bad syntax and non-obvious details. Is there any common way to analyze and learn if some regular expression have linear or exponential execution time?
-
apparently [Regex is evil](https://stackoverflow.com/questions/4664850/find-all-occurrences-of-a-substring-in-python#comment83882449_4664850) – uhoh Feb 09 '18 at 06:45
-
1@uhoh, yes - now I know it for sure – Feb 09 '18 at 08:01
-
Thanks for the warning, I'm going to avoid it for now. ;-) – uhoh Feb 09 '18 at 08:02
1 Answers
I tend to just use perl
and switch on use re 'debug';
before doing a regex operation.
This prints the steps the regex is going through to process, and quickly gives an idea of efficiency.
There's no hard and fast rules - the big warning sign I look for is whether this regex will need to backtrack. See: Catastrophic Backtracking
This can happen more easily when you're using lookahead/lookbehind (but doesn't have to).
In the grand scheme of things though - it pays to remember that whilst regex is a programming language, it's starting point is as a power search-and-replace. And thus implementing complicated logic in it, means you're creating code that's hard to maintain and debug - and so you shouldn't.
One of the useful tricks in perl - it can run in much the same way as sed
/grep
/awk
using command line.
So you can enable regex debugging, and then do 'sed style':
perl -pe 's/search/replace' somefile
But then you can add 'debug' regex:
perl -Mre=debug -pe 's/search/replace/' somefile
Which will debug it whilst you're going.

- 52,974
- 7
- 60
- 101
-
2Catastrophic backtracking is the most serious offender - beware of nested quantifiers. – Lucas Trzesniewski Jun 08 '16 at 09:29
-
1