
EDIT: As Thierry Lathuille pointed out, PEP 567, which introduced ContextVar, was not designed to address generators (unlike the withdrawn PEP 550). Still, the main question remains: how do I write stateful context managers that behave correctly with multiple threads, generators and asyncio tasks?


I have a library with some functions that can work in different "modes", so their behavior can be altered by a local context. I am looking at the contextvars module to implement this reliably, so I can use it from different threads, asynchronous contexts, etc. However, I am having trouble getting a simple example working right. Consider this minimal setup:

from contextlib import contextmanager
from contextvars import ContextVar

MODE = ContextVar('mode', default=0)

@contextmanager
def use_mode(mode):
    t = MODE.set(mode)
    try:
        yield
    finally:
        MODE.reset(t)

def print_mode():
    print(f'Mode {MODE.get()}')

Here is a small test with a generator function:

def first():
    print('Start first')
    print_mode()
    with use_mode(1):
        print('In first: with use_mode(1)')
        print('In first: start second')
        it = second()
        next(it)
        print('In first: back from second')
        print_mode()
        print('In first: continue second')
        next(it, None)
        print('In first: finish')

def second():
    print('Start second')
    print_mode()
    with use_mode(2):
        print('In second: with use_mode(2)')
        print('In second: yield')
        yield
        print('In second: continue')
        print_mode()
        print('In second: finish')

first()

I get the following output:

Start first
Mode 0
In first: with use_mode(1)
In first: start second
Start second
Mode 1
In second: with use_mode(2)
In second: yield
In first: back from second
Mode 2
In first: continue second
In second: continue
Mode 2
In second: finish
In first: finish

In the section:

In first: back from second
Mode 2
In first: continue second

It should be Mode 1 instead of Mode 2, because this was printed from first, where the applying context should be, as I understand it, use_mode(1). However, it seems that the use_mode(2) of second is stacked over it until the generator finishes. Are generators not supported by contextvars? If so, is there any way to support stateful context managers reliably? By reliably, I mean it should behave consistently whether I use:

  • Multiple threads.
  • Generators.
  • asyncio tasks.
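For reference, the two cases that PEP 567 does cover can be checked with a short sketch like the following. It reuses the use_mode manager from above; task_worker, run_tasks and thread_worker are names made up for this illustration, not part of any library.

```python
import asyncio
import threading
from contextlib import contextmanager
from contextvars import ContextVar

MODE = ContextVar("mode", default=0)


@contextmanager
def use_mode(mode):
    token = MODE.set(mode)
    try:
        yield
    finally:
        MODE.reset(token)


async def task_worker(mode, seen):
    # Hypothetical helper: set a mode, let the other task run, then read it back.
    with use_mode(mode):
        await asyncio.sleep(0)  # yield to the event loop so the tasks interleave
        seen[mode] = MODE.get()


async def run_tasks():
    seen = {}
    # gather() wraps each coroutine in a Task; each Task runs in its own
    # copy of the current context, so the two set() calls do not collide.
    await asyncio.gather(task_worker(1, seen), task_worker(2, seen))
    return seen


def thread_worker(mode, seen):
    # Hypothetical helper: each thread has its own context as well.
    with use_mode(mode):
        seen[mode] = MODE.get()


task_seen = asyncio.run(run_tasks())

thread_seen = {}
threads = [
    threading.Thread(target=thread_worker, args=(m, thread_seen)) for m in (1, 2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(task_seen, thread_seen, MODE.get())
```

Each worker reads back the mode it set itself, and the main context still sees the default afterwards. Only the generator case interleaves two with blocks inside a single context.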
  • About generators and contextvars, have a look at [PEP 567 -- Context Variables](https://www.python.org/dev/peps/pep-0567/#abstract), which explicitly states that "this PEP is concerned only with solving the case for asynchronous tasks, not for generators." – Thierry Lathuille Dec 04 '18 at 11:23
  • @ThierryLathuille I see, thanks. I suppose that means `ContextVar` is not supposed to work for generators? Since the documentation just generally states "Context managers that have state should use Context Variables (...)", I think it may be misleading... It's also a bit disappointing that PEP 567 acknowledges the limitation for generators, which [PEP 550](https://www.python.org/dev/peps/pep-0550) did not have, without offering any workaround. So I guess the question remains: how to write stateful context managers that always work. – jdehesa Dec 04 '18 at 11:38
  • [PEP 568](https://www.python.org/dev/peps/pep-0568/) extends PEP 567's machinery to add generator context sensitivity. – rectalogic Jan 25 '20 at 15:38

1 Answer


What you have there is an "interlocked context": until the __exit__ part of the context manager in second runs, ContextVar will not restore the previous value, no matter what.

So I came up with something here. The best approach I could think of is a decorator that explicitly declares which callables get their own context. I created a ContextLocal class that works as a namespace, just like threading.local, and attributes in that namespace should behave as you expect.

I am just finishing the code, so I have not yet tested it with async or multi-threaded code, but it should work. If you can help me write a proper test, the solution below could become a Python package in itself.

(I had to resort to injecting an object into the locals dictionary of generator and coroutine frames in order to clean up the context registry once a generator or coroutine is over. PEP 558 formalizes the behavior of locals() for Python 3.8+, and I don't remember right now whether this injection is allowed. It works up to 3.8 beta 3, though, so I think this usage is valid.)

Anyway, here is the code (named as context_wrapper.py):

"""
Super context wrapper -

meant to be simpler to use and work in more scenarios than
Python's contextvars.

Usage:
Create one or more project-wide instances of "ContextLocal"
Decorate your functions, co-routines, worker-methods and generators
that should hold their own states with that instance's `context` method -

and use the instance as a namespace for private variables that will be local
and non-local until entering another callable decorated
with `instance.context` - that will create a new, separate scope
visible inside the decorated callable.


"""

import sys
from functools import wraps

__author__ = "João S. O. Bueno"
__license__ = "LGPL v. 3.0+"

class ContextError(AttributeError):
    pass


class ContextSentinel:
    def __init__(self, registry, key):
        self.registry = registry
        self.key = key

    def __del__(self):
        # pop() instead of del: the entry may already be gone by the time this runs
        self.registry.pop(self.key, None)


_sentinel = object()


class ContextLocal:

    def __init__(self):
        super().__setattr__("_registry", {})

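    def _introspect_registry(self, name=None):
        # Frame 0 is this method and frame 1 is __getattr__/__setattr__/__delattr__,
        # so frame 2 is the code that touched the attribute.
        f = sys._getframe(2)
        # Walk up the call stack to the nearest frame that owns a namespace
        # (and, when a name is given, one that also defines that name).
        while f:
            h = hash(f)
            if h in self._registry and (name is None or name in self._registry[h]):
                return self._registry[h]
            f = f.f_back
        if name:
            raise ContextError(f"{name!r} not defined in any previous context")
        raise ContextError("No previous context set")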


    def __getattr__(self, name):
        namespace = self._introspect_registry(name)
        return namespace[name]


    def __setattr__(self, name, value):
        namespace = self._introspect_registry()
        namespace[name] = value


    def __delattr__(self, name):
        namespace = self._introspect_registry(name)
        del namespace[name]

    def context(self, callable_):
        @wraps(callable_)
        def wrapper(*args, **kw):
            f = sys._getframe()
            self._registry[hash(f)] = {}
            result = _sentinel
            try:
                result = callable_(*args, **kw)
            finally:
                del self._registry[hash(f)]
                # Setup context for generator or coroutine if one was returned:
                if result is not _sentinel:
                    frame = getattr(result, "gi_frame", getattr(result, "cr_frame", None))
                    if frame:
                        self._registry[hash(frame)] = {}
                        frame.f_locals["$context_sentinel"] = ContextSentinel(self._registry, hash(frame))

            return result
        return wrapper

Here is the modified version of your example to use with it:

from contextlib import contextmanager

from context_wrapper import ContextLocal

ctx = ContextLocal()


@contextmanager
def use_mode(mode):
    ctx.MODE = mode
    print("entering use_mode")
    print_mode()
    try:
        yield
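    finally:
        # Nothing is restored on exit: the mode set above stays in effect
        # in the enclosing context after the with block ends.
        pass

def print_mode():
    print(f'Mode {ctx.MODE}')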


@ctx.context
def first():
    ctx.MODE = 0
    print('Start first')
    print_mode()
    with use_mode(1):
        print('In first: with use_mode(1)')
        print('In first: start second')
        it = second()
        next(it)
        print('In first: back from second')
        print_mode()
        print('In first: continue second')
        next(it, None)
        print('In first: finish')
        print_mode()
    print("at end")
    print_mode()

@ctx.context
def second():
    print('Start second')
    print_mode()
    with use_mode(2):
        print('In second: with use_mode(2)')
        print('In second: yield')
        yield
        print('In second: continue')
        print_mode()
        print('In second: finish')

first()

Here is the output of running that:

Start first
Mode 0
entering use_mode
Mode 1
In first: with use_mode(1)
In first: start second
Start second
Mode 1
entering use_mode
Mode 2
In second: with use_mode(2)
In second: yield
In first: back from second
Mode 1
In first: continue second
In second: continue
Mode 2
In second: finish
In first: finish
Mode 1
at end
Mode 1

(It will be slower than native contextvars by orders of magnitude, since those are implemented in native code in the Python runtime, but it seems easier to wrap one's mind around by the same amount.)
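For comparison, here is a minimal sketch of a lighter alternative that stays with plain contextvars: a decorator that drives every step of a generator inside a private copy of the context, so that whatever the generator sets is invisible to its caller between steps. The name isolated is made up for this sketch and is not part of contextvars. It only handles plain next() iteration (no send, throw or close forwarding), and it has not been checked against asyncio or threads.

```python
import functools
from contextlib import contextmanager
from contextvars import ContextVar, copy_context

MODE = ContextVar("mode", default=0)


@contextmanager
def use_mode(mode):
    token = MODE.set(mode)
    try:
        yield
    finally:
        MODE.reset(token)


def isolated(gen_func):
    """Hypothetical helper: run each step of a generator in its own Context."""
    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        ctx = copy_context()             # snapshot of the caller's context
        gen = gen_func(*args, **kwargs)  # the generator body has not started yet

        def driver():
            while True:
                try:
                    # Every resumption runs inside the same private Context,
                    # so set()/reset() pairs inside the generator stay matched.
                    value = ctx.run(next, gen)
                except StopIteration:
                    return
                yield value

        return driver()
    return wrapper


log = []


@isolated
def second():
    log.append(("second start", MODE.get()))
    with use_mode(2):
        yield
        log.append(("second resumed", MODE.get()))


def first():
    with use_mode(1):
        it = second()
        next(it)
        log.append(("first, second suspended", MODE.get()))
        next(it, None)
    log.append(("after first", MODE.get()))


first()
for entry in log:
    print(entry)
```

With this wrapper, first sees its own mode while second is suspended, which is the behavior the question asks for. The cost is that second only sees the values that were current when it was created, not later changes made by its caller.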

  • That is very impressive! It actually makes sense: the vars that you want to use at each point are associated with the stack frame that you are in. The additional decorators are not too much trouble, I think, as they would only need to be used by the writers of the library, not the users. – jdehesa Aug 12 '19 at 09:16
  • Is there any reason why you use frame hashes instead of the frames themselves? Would keeping references to frames be problematic, even if you clear them afterwards? – jdehesa Aug 12 '19 at 10:38
  • There is no "callback" after a frame is done with. In the case of plain functions, that happens on function return, OK, but if I'd keep the frame hashes around, they would never be dereferenced, and the code could not know about it. That is usually done through weak references, but frame objects cannot be weak-referenced; this mechanism is a "poor man's" weak reference. – jsbueno Aug 12 '19 at 12:54
  • I am about to publish this to Pypi - I just want to improve a bit further the docs - including how it provides support for "dynamic scoping" - https://en.wikipedia.org/wiki/Scope_(computer_science)#Dynamic_scoping – jsbueno Aug 12 '19 at 13:53
  • I was playing with this a bit. I made an alternative version with a few modifications. First, it has default values (in a default namespace). I also changed the usage to be similar to `contextlib.contextmanager`. There are a few kinks (it is not thread-safe, and it depends on `functools.wraps` and `contextlib.contextmanager`, so if those add additional intermediate calls it would break), but nothing unsolvable I think. It limits usage to generator-like context managers though. See what you think of it: [`extracontext_mod.py`](https://gist.github.com/javidcf/e4ee4c16ee9a66001d8f24d8d1dbd11f). – jdehesa Aug 12 '19 at 14:42
  • (note the nested `with_usemode` in the example does not work correctly in the first version) – jdehesa Aug 12 '19 at 14:44
  • I also added a `newcontext` thing so you can stack a new namespace on top of the current one without any additional context managers or decorators (see example). Not sure if it is alright to use `ContextLocal` both as a namespace (through `__getattr__`/`__setattr__`) and as an object with public named methods / properties, but well, that's API design to think about. – jdehesa Aug 12 '19 at 15:00
  • I was thinking of having a single namespace inside ContextLocal, so that methods like `pop` and `search` could be added. For now the only public name there is `context`. I'd also like to support things other than generators as context managers, but we could keep both APIs. (I had devised a way to have all methods in an instance share the same context, so that traditional context managers with `__enter__` and `__exit__` would work.) – jsbueno Aug 12 '19 at 15:30
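
The "poor man's weak reference" discussed in the comments above can be illustrated with a small sketch, assuming CPython. Sentinel, frame_supports_weakref and numbers are hypothetical names for this example. Unlike the answer's code, the guard here is an ordinary local variable of the generator rather than an object injected through f_locals.

```python
import gc
import sys
import weakref


class Sentinel:
    """Hypothetical stand-in for ContextSentinel: drops a registry key when collected."""

    def __init__(self, registry, key):
        self.registry = registry
        self.key = key

    def __del__(self):
        self.registry.pop(self.key, None)


def frame_supports_weakref():
    # Frame objects do not support weak references, so weakref.ref() raises TypeError.
    try:
        weakref.ref(sys._getframe())
    except TypeError:
        return False
    return True


def numbers(registry):
    # The guard lives exactly as long as this generator's frame keeps its locals.
    guard = Sentinel(registry, "numbers")
    yield 1
    yield 2


registry = {"numbers": {}}
it = numbers(registry)
next(it)
alive_while_suspended = "numbers" in registry

del it        # drop the generator; its frame and locals are released
gc.collect()  # not needed on CPython's refcounting, kept as a safety net
cleaned_up = "numbers" not in registry

print(frame_supports_weakref(), alive_while_suspended, cleaned_up)
```

The registry entry survives while the generator is merely suspended and disappears once the generator itself is dropped, without the registry ever holding a reference to the frame.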