
When answering this question (and having read this answer to a similar question), I thought that I knew how Python caches regexes.

But then I thought I'd test it, comparing two scenarios:

  1. a single compilation of a simple regex, then 10 applications of that compiled regex.
  2. 10 applications of an uncompiled regex (where I would have expected slightly worse performance because the regex would have to be compiled once, then cached, and then looked up in the cache 9 times).

However, the results were staggering (in Python 3.3):

>>> import timeit
>>> timeit.timeit(setup="import re", 
... stmt='r=re.compile(r"\w+")\nfor i in range(10):\n r.search("  jkdhf  ")')
18.547793477671938
>>> timeit.timeit(setup="import re", 
... stmt='for i in range(10):\n re.search(r"\w+","  jkdhf  ")')
106.47892003890324

That's over 5.7 times slower! In Python 2.7, there is still an increase by a factor of 2.5, which is also more than I would have expected.

Has caching of regexes changed between Python 2 and 3? The docs don't seem to suggest that.

Tim Pietzcker
  • Um, why are you using `timeit` like that? Why not `stmt='re.search(...)'`/`stmt='r.search(...)'` and add `re.compile` to the `setup`? –  Feb 07 '13 at 17:08
  • The fancy `functools.lru_cache` is the issue here; that is the caching change you are after. Also see http://bugs.python.org/issue16389 – mmgp Feb 07 '13 at 17:11
  • @delnan: I wanted the compilation of the regex to be part of the timing. – Tim Pietzcker Feb 07 '13 at 17:42
  • That's kind of unfair (in the technical sense), as the cache `re.search` uses will persist between runs. Plus, you can measure it separately and gain more information. I'm very wary of nontrivial parameters for `timeit`. –  Feb 07 '13 at 17:44
  • @delnan: I guess you're right, but that means that the performance penalty for not compiling regexes is even worse than what these results suggest... – Tim Pietzcker Feb 07 '13 at 18:44
  • In the first test, shouldn't `r=re.compile(r"\w+")` be part of `setup` rather than in the `stmt`? – martineau Feb 07 '13 at 19:16

1 Answer


The code has changed.

In Python 2.7, the cache is a simple dictionary; if more than `_MAXCACHE` items are stored in it, the whole cache is cleared before storing a new item. A cache lookup only requires building a simple key and testing the dictionary; see the 2.7 implementation of `_compile()`.
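In outline, the 2.7 behavior looks like the following sketch (the names `_MAXCACHE` and `_cache` mirror the real module, but this is illustrative code, not the actual `re` internals; in particular it delegates to `re.compile()` for brevity, which has its own cache):

```python
import re

_MAXCACHE = 100
_cache = {}

def cached_compile(pattern, flags=0):
    key = (type(pattern), pattern, flags)  # build a simple key
    try:
        return _cache[key]                 # plain dict lookup, no locking
    except KeyError:
        pass
    compiled = re.compile(pattern, flags)
    if len(_cache) >= _MAXCACHE:
        _cache.clear()                     # wipe everything; no eviction policy
    _cache[key] = compiled
    return compiled
```

The key point is how little work the hit path does: one tuple construction and one dict lookup.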

In Python 3.x, the cache has been replaced by the `@functools.lru_cache(maxsize=500, typed=True)` decorator. This decorator does much more work on every lookup, including taking a thread lock, updating the LRU queue, and maintaining cache statistics (accessible via `re._compile.cache_info()`). See the 3.3.0 implementation of `_compile()` and of `functools.lru_cache()`.

Others have noticed the same slowdown and filed issue 16389 in the Python bug tracker. I'd expect 3.4 to be a lot faster again; either the `lru_cache` implementation will be improved or the `re` module will move back to a custom cache.

Update: With revision 4b4dddd670d0 (hg) / 0f606a6 (git) the cache change has been reverted back to the simple version found in 3.1. Python versions 3.2.4 and 3.3.1 include that revision.

Since then, in Python 3.7 the pattern cache was replaced with a custom FIFO cache implementation built on a regular dict (which relies on insertion order and, unlike an LRU cache, does not take into account how recently a cached item was used when evicting).
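The FIFO idea can be sketched like this (illustrative only, not the actual 3.7 implementation; again using `re.compile()` for brevity):

```python
import re

_MAXCACHE = 512
_cache = {}

def fifo_compile(pattern, flags=0):
    key = (type(pattern), pattern, flags)
    try:
        return _cache[key]  # hit path: plain dict lookup, like 2.7
    except KeyError:
        pass
    compiled = re.compile(pattern, flags)
    if len(_cache) >= _MAXCACHE:
        # Dicts preserve insertion order in 3.7+, so the first key is the
        # oldest entry; evict it regardless of how recently it was used.
        del _cache[next(iter(_cache))]
    _cache[key] = compiled
    return compiled
```

This keeps the cheap hit path of the old dict cache while evicting one item at a time instead of wiping the whole cache.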

Martijn Pieters
  • huh. thanks. i wondered why the cache for regexps in strptime doesn't use lru_cache. i hope that lru_cache can be improved as in theory lru is better than wiping and re-starting. – andrew cooke Jun 26 '13 at 17:46
  • @andrewcooke: just a heads-up: as of Python 3.7 `re.compile()` uses a FIFO queue cache. It's a good compromise. – Martijn Pieters Aug 01 '19 at 10:31