I am trying to produce a better answer to the frequently asked question "How do I do function-local static variables in Python?" (1, 2, 3, ...). "Better" means completely encapsulated in a decorator that can be used in any context where a function definition may appear. In particular, it must do the right thing when applied to methods and nested functions; it must play nicely with other decorators applied to the same function (in any order); it must accept arbitrary initializers for the static variables; and it must not modify the formal parameter list of the decorated function. Basically, if this were proposed for inclusion in the standard library, nobody should be able to object on quality-of-implementation grounds.

Ideal surface syntax would be

@static_vars(a=0, b=[])
def test():
    b.append(a)
    a += 1
    sys.stdout.write(repr(b) + "\n")

I would also accept

@static_vars(a=0, b=[])
def test():
    static.b.append(static.a)
    static.a += 1
    sys.stdout.write(repr(static.b) + "\n")

or similar, as long as the namespace for the static variables is not the name of the function! (I intend to use this in functions that may have very long names.)

A slightly more motivated example involves precompiled regular expressions that are only relevant to one function:

@static_vars(encode_re = re.compile(
        br'[\x00-\x20\x7F-\xFF]|'
        br'%(?!(?:[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}))'))
def encode_nonascii_and_percents(segment):
    segment = segment.encode("utf-8", "surrogateescape")
    return encode_re.sub(
        lambda m: "%{:02X}".format(ord(m.group(0))).encode("ascii"),
        segment).decode("ascii")

Now, I already have a mostly-working implementation. The decorator rewrites each function definition as if it had read like so (using the first example):

def _wrap_test_():
    a = 0
    b = []
    def test():
        nonlocal a, b
        b.append(a)
        a += 1
        sys.stdout.write(repr(b) + "\n")
    return test
test = _wrap_test_()
del _wrap_test_
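For a method, the intended expansion would presumably be the same wrapper placed in the class body. Here is a minimal hand-written sketch of the semantics being aimed for; the names `Counter`, `bump`, and `count` are hypothetical and do not come from the decorator itself. The state lives in a closure cell shared by all instances, not on `self`.

```python
class Counter:
    # Hand-written equivalent of what the decorator would generate
    # for a method; nothing here is produced automatically.
    def _wrap_bump_():
        count = 0                 # the "static" variable
        def bump(self):
            nonlocal count
            count += 1
            return count
        return bump
    bump = _wrap_bump_()
    del _wrap_bump_

first, second = Counter(), Counter()
results = [first.bump(), second.bump(), first.bump()]
print(results)   # [1, 2, 3]
```

The wrapper is called while the class body is still executing, so `bump` ends up as an ordinary function attribute of the class and is bound to instances as usual.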

It seems that the only way to accomplish this is to munge the AST. I have code that works for simple cases (see below) but I strongly suspect it is wrong in more complicated cases. For instance, I think it will break if applied to a method definition, and of course it also breaks in any situation where inspect.getsource() fails.

So the question is, first, what should I do to make it work in more cases, and second, is there a better way to define a decorator with the same black-box effects?

Note 1: I only care about Python 3.

Note 2: Please assume that I have read all of the proposed solutions in all of the linked questions and found all of them inadequate.

#! /usr/bin/python3

import ast
import functools
import inspect
import textwrap

def function_skeleton(name, args):
    """Return the AST of a function definition for a function named NAME,
       which takes keyword-only args ARGS, and does nothing.  Its
       .body field is guaranteed to be an empty array.
    """

    fn = ast.parse("def foo(*, {}): pass".format(",".join(args)))

    # The return value of ast.parse, as used here, is a Module object.
    # We want the function definition that should be the Module's
    # sole descendant.
    assert isinstance(fn, ast.Module)
    assert len(fn.body) == 1
    assert isinstance(fn.body[0], ast.FunctionDef)
    fn = fn.body[0]

    # Remove the 'pass' statement.
    assert len(fn.body) == 1
    assert isinstance(fn.body[0], ast.Pass)
    fn.body.clear()

    fn.name = name
    return fn

class static_vars:
    """Decorator which provides functions with static variables.
       Usage:

           @static_vars(foo=1, bar=2, ...)
           def fun():
               foo += 1
               return foo + bar

       The variables are implemented as upvalues defined by a wrapper
       function.

       Uses introspection to recompile the decorated function with its
       context changed, and therefore may not work in all cases.
    """

    def __init__(self, **variables):
        self._variables = variables

    def __call__(self, func):
        if func.__name__ in self._variables:
            raise ValueError(
                "function name {} may not be the same as a "
                "static variable name".format(func.__name__))

        fname = inspect.getsourcefile(func)
        lines, first_lineno = inspect.getsourcelines(func)

        mod = ast.parse(textwrap.dedent("".join(lines)), filename=fname)

        # The return value of ast.parse, as used here, is a Module
        # object.  Save that Module for use later and extract the
        # function definition that should be its sole descendant.
        assert isinstance(mod, ast.Module)
        assert len(mod.body) == 1
        assert isinstance(mod.body[0], ast.FunctionDef)
        inner_fn = mod.body[0]
        mod.body.clear()

        # Don't apply decorators twice.
        inner_fn.decorator_list.clear()

        # Fix up line numbers.  The dedented source was parsed starting
        # at line 1, so shift every node by first_lineno - 1.  (Why the
        # hell doesn't ast.parse take a starting-line-number argument?)
        ast.increment_lineno(inner_fn, first_lineno - 1)

        # Inject a 'nonlocal' statement declaring the static variables.
        svars = sorted(self._variables.keys())
        inner_fn.body.insert(0, ast.Nonlocal(svars))

        # Synthesize the wrapper function, which will take the static
        # variables as arguments.
        outer_fn_name = ("_static_vars_wrapper_" +
                         inner_fn.name + "_" +
                         hex(id(self))[2:])
        outer_fn = function_skeleton(outer_fn_name, svars)
        outer_fn.body.append(inner_fn)
        outer_fn.body.append(
            ast.Return(value=ast.Name(id=inner_fn.name, ctx=ast.Load())))

        mod.body.append(outer_fn)
        ast.fix_missing_locations(mod)

        # The new function definition must be evaluated in the same context
        # as the original one.  FIXME: supply locals if appropriate.
        context = func.__globals__
        exec(compile(mod, filename="<static-vars>", mode="exec"),
             context)

        # extract the function we just defined
        outer_fn = context[outer_fn_name]
        del context[outer_fn_name]

        # and call it, supplying the static vars' initial values; this
        # returns the adjusted inner function
        adjusted_fn = outer_fn(**self._variables)
        functools.update_wrapper(adjusted_fn, func)
        return adjusted_fn

if __name__ == "__main__":
    import sys

    @static_vars(a=0, b=[])
    def test():
        b.append(a)
        a += 1
        sys.stdout.write(repr(b) + "\n")

    test()
    test()
    test()
    test()
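One case where the FIXME above matters is a nested function that refers to variables of its enclosing function. The following sketch shows the problem independently of the decorator; the names `outer`, `inner`, and `prefix` are hypothetical. The original function carries those variables in `__closure__`, but source re-executed against only a globals dictionary loses them.

```python
def outer():
    prefix = "item-"
    def inner(n):
        return prefix + str(n)
    return inner

inner = outer()
print(inner.__code__.co_freevars)   # ('prefix',)
print(inner(1))                     # item-1

# Re-executing only the inner definition against a globals dict, as the
# decorator does with func.__globals__, compiles `prefix` as a global.
source = "def inner(n):\n    return prefix + str(n)\n"
namespace = {}
exec(source, namespace)
try:
    namespace["inner"](1)
    outcome = "ok"
except NameError:
    outcome = "NameError"
print(outcome)                      # NameError
```

A decorator could at least detect this situation by checking whether `func.__code__.co_freevars` is non-empty and raising a clear error, although that would only diagnose the limitation and not remove it.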
zwol

1 Answer

Isn't this what classes are for?

import sys

class test_class:
    a=0
    b=[]

    def test(self):
        test_class.b.append(test_class.a)
        test_class.a += 1
        sys.stdout.write(repr(test_class.b) + "\n")

t = test_class()
t.test()
t.test()

[0]
[0, 1]

Here is a version of your regexp encoder:

import re

class encode:
    encode_re = re.compile(
        br'[\x00-\x20\x7F-\xFF]|'
        br'%(?!(?:[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}))')

    def encode_nonascii_and_percents(self, segment):
        segment = segment.encode("utf-8", "surrogateescape")
        return encode.encode_re.sub(
            lambda m: "%{:02X}".format(ord(m.group(0))).encode("ascii"),
            segment).decode("ascii")

e = encode()
print(e.encode_nonascii_and_percents('foo bar'))

foo%20bar

There is always the singleton class; see "Is there a simple, elegant way to define Singletons in Python?"

Brent Washburne
  • Does not satisfy a hard design constraint, stated in the question: *"Better" means: you don't have to qualify each use of a static variable.* – zwol Dec 07 '15 at 21:32
  • I follow the Zen of Python https://www.python.org/dev/peps/pep-0020/ which states "Explicit is better than implicit" (not black-box effects). But the interpretation of "better" is entirely up to you. – Brent Washburne Dec 07 '15 at 21:43
  • I thought about this some more and I have a better reason for not liking this: it can't be used as-is in an arbitrary context. You would have to jump through several extra hoops to apply this treatment to a method, for instance. (Yes, I do want this for methods -- again, think regexes used in only one method.) I'd accept an answer that used a singleton class under the hood, as long as the surface syntax was a decorator that could be applied to _any_ function definition, regardless of context. – zwol Dec 08 '15 at 00:29
  • I have revised the question to make it clearer what I'm looking for. – zwol Dec 08 '15 at 00:42
  • I think you should post your revision as an answer! – Brent Washburne Dec 08 '15 at 00:47
  • It doesn't fully solve the problem yet! The question I'm asking is immediately above the notes. – zwol Dec 08 '15 at 00:49
  • Well, at least it's a partial answer. I don't think I can help any more than this: we're both using a class to hold the variables, might as well keep it simple and Pythonic. – Brent Washburne Dec 08 '15 at 00:59