I'm optimizing a project at work, and during profiling I found out that most of the time is spent compiling regexes on-the-fly and matching patterns.

I'm planning on pre-compiling regexes to save execution time, but I don't know where I should declare or store them. Here's a sample of the current code:

import re

def is_specific_stuff(line):
    expr = ".*(specific|work)_stuff.*"
    match = re.match(expr, line)
    return bool(match)

And the code I would like to write:

def is_specific_stuff(line):
    expr = re.compile('.*(specific|work)_stuff.*')
    return expr.match(line) is not None

However, I don't know where to put the expr = re.compile(...) line. Where can I store the compiled regex without making it a module-wide constant (we have several different regexes, and I would like to keep each one close to the code that needs it, for readability) and without recreating it on every call?

Thank you

Zeroji
  • Declare it as a constant at the top of the script? No need to make this more complicated than it has to be. – gold_cy Jun 27 '19 at 11:45
  • @aws_apprentice It's somewhere in a 1200-line file, which is why I'd rather have the definition close to where it's used. I also feel there's a "cleaner" way to do it that I missed entirely. – Zeroji Jun 27 '19 at 11:47
  • So? That's the point of IDEs: being able to find definitions quickly and easily. Plus, it is easier to maintain and debug than something crammed inside the code. – gold_cy Jun 27 '19 at 11:51

1 Answer

This is kind of a hack, but you can use a decorator to mimic static function variables.

Try this:

import re


def static_vars(**kwargs):
    def decorate(func):
        for k in kwargs:
            setattr(func, k, kwargs[k])
        return func

    return decorate


@static_vars(expr=re.compile('.*(specific|work)_stuff.*'))
def is_specific_stuff(line):
    return is_specific_stuff.expr.match(line) is not None


print(is_specific_stuff("hello"))  # False
print(is_specific_stuff("many important work_stuff to do"))  # True

The usage becomes a little verbose, but the compiled pattern stays scoped to the function and is defined right next to it, which helps readability.

NOTE: You could also use a parameter with a default value and then use it like any local name. Default values are evaluated once, when the function is defined, so this also compiles the regex only once. The drawback is that a caller can override that value when calling the function.
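The default-parameter variant from the note might look like the sketch below. The parameter name `_expr` is my own choice for illustration. Making it keyword-only and prefixing it with an underscore only discourages callers from overriding it; it does not prevent them.

```python
import re


# `_expr` is a hypothetical name; the default is built once, at definition time.
def is_specific_stuff(line, *, _expr=re.compile(r'.*(specific|work)_stuff.*')):
    # The compiled pattern is reused on every call instead of being rebuilt.
    return _expr.match(line) is not None


print(is_specific_stuff("hello"))  # False
print(is_specific_stuff("many important work_stuff to do"))  # True
```

Compared with the decorator, this avoids the `is_specific_stuff.expr` lookup inside the body, at the cost of exposing an extra parameter in the signature.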

Adam.Er8