Does anybody know if Python (any version) used NFAs (Non-Deterministic Finite Automata) to evaluate regular expressions or does it use some other mechanism? Please provide links/reference if available.
Asked
Active
Viewed 2,622 times
9
-
1Since most RE engines nowadays allow for non-regular languages to be matched I doubt any modern RE engine actually still uses NFAs or DFAs anymore. – Joey Nov 17 '09 at 13:35
-
1Well, since an RE engine can identify a subset of RE's that are regular, and those are in common use, it makes sense to optimize for those scenarios. So it's entirely possible that they _sometimes_ use NFA's or DFA's. – MSalters Nov 17 '09 at 13:54
2 Answers
8
This should take less than a ms on a DFA:
$ time python3 -c 'import re; re.match("a?"*25+"a"*25, "a"*25)'
real 0m7.273s
Change 25 with 100 and it won't terminate for a lifetime.
Here is how it looks on a DFA (grep):
$ time echo "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" |grep "a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
real 0m0.063s
There is a great discussion of the topic at http://swtch.com/~rsc/regexp/regexp1.html

Thomas Ahle
- 30,774
- 21
- 92
- 114
6
NFA.
See Friedl's Mastering Regular Expressions, 3rd edition, chapter 4 - table 4-1, page 145.
Google books has a preview to it.

Bart Kiers
- 166,582
- 36
- 299
- 288