
Working with Python 3, I had a requirement:

  • Perform some pre-work
  • Do the core work
  • Clean up the pre-work

Taking inspiration from fixtures in pytest, I came across this post and wrote some crazy code.

Though this crazy code works, I wish to understand the yield sorcery that makes it work :)

def db_connect_n_clean():
  db_connectors = []
  def _inner(db_obj):
    db_connectors.append(db_obj)
    print("Connect : ", db_obj)
  yield _inner
  for conn in db_connectors:
    print("Dispose : ", conn)

This is the driver code:

pre_worker = db_connect_n_clean()
freaky_function = next(pre_worker)
freaky_function("1")
freaky_function("2")
try:
  next(pre_worker)
except:
  pass

It produces this output:

Connect :  1
Connect :  2
Dispose :  1
Dispose :  2
Traceback (most recent call last):
  File "junk.py", line 81, in <module>
    next(pre_worker)
StopIteration

What confuses me in this code is that all the calls to freaky_function are maintaining a single list of db_connectors.

After the first yield, all the objects are disposed and I hit StopIteration.

I was thinking that calling freaky_func twice would maintain 2 separate lists and there would be 2 separate yields.

Update: The goal of this question is not to understand how to achieve this. As is evident from the comments, a context manager is the way to go. But my question is about understanding how this piece of code works, i.e. the Python side of it.
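
For reference only, the context-manager version the comments suggest would look roughly like the sketch below (same names as above, with contextlib.contextmanager doing the driving; just to show the shape, not the point of the question):

from contextlib import contextmanager

@contextmanager
def db_connect_n_clean():
  db_connectors = []
  def _inner(db_obj):
    db_connectors.append(db_obj)
    print("Connect : ", db_obj)
  try:
    yield _inner          # pauses here while the with-block body runs
  finally:
    for conn in db_connectors:
      print("Dispose : ", conn)

with db_connect_n_clean() as connect:
  connect("1")
  connect("2")
# leaving the with-block resumes the generator past the yield and runs the disposal loop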

  • Are you trying to re-invent [context managers](https://www.python.org/dev/peps/pep-0343/) (a.k.a. `with` blocks)? Your task description sure sounds that way. – Tomalak Dec 21 '21 at 13:16
  • @Tomalak I agree that using context managers is a cleaner way. But I am just trying to understand how this code is working. I mean, calling the function multiple times is appending to the same list and yields multiple times. But it resumes only once? – mittal Dec 21 '21 at 13:52
  • Does anyone have insights on this? – mittal Jan 19 '22 at 10:30
  • *"I was thinking that calling `freaky_func` twice would maintain 2 separate lists and there would be 2 separate `yield`s"* - what makes you think that? Nothing in `freaky_func` creates a new list. And nothing in `db_connect_n_clean` yields twice. There is exactly one `yield`, and it is hit exactly once, returning the `_inner` function (which you chose to call `freaky_function` outside) and which adds values to the same list. Trying `next()` once more moves the execution past the `yield`, where your "disposal" loop sits, and then it reaches the end of the function, throwing `StopIteration`. – Tomalak Jan 19 '22 at 10:57
  • There is nothing freaky or strange or magical about it. Think of `yield` as a bookmark. Before calling `next()` the very first time, `db_connect_n_clean` is paused, and the bookmark sits right in front of the first line of the function body. You call `next()`, the function is unpaused and runs up to the next `yield` from this position. After that, the function is paused again, and the bookmark sits right before the next line after the `yield`. You call `next()` again, the function unpauses and runs to the next `yield` (or to the end). – Tomalak Jan 19 '22 at 11:02
  • All that being said, use context managers. Your `db_connect_n_clean` function might be educational, but it's not headed into a place you want to be in. Virtually every database module in Python offers context managers which are designed for this kind of "set up and tear down when done" scenario. Writing your own context managers is not hard, either, no reason to re-invent them (example https://realpython.com/python-with-statement/#creating-custom-context-managers). – Tomalak Jan 19 '22 at 11:06
  • Note that what you have there is a coroutine (well, every generator is a coroutine – but you're actually using it as one) so you might [want to read up on ``async`` in Python](https://stackoverflow.com/questions/49005651/how-does-asyncio-actually-work), since that's the exact same *mechanism* with nicer syntax (for the full-blown coroutine case, that is). – MisterMiyagi Jan 19 '22 at 11:45

1 Answer


One of my favorite tools to visualize Python with is PythonTutor.

Basically, you can see that the first next(pre_worker) call runs the generator up to the yield and returns the _inner function. Since _inner is defined inside db_connect_n_clean, it can access all of that function's variables.

Internally, in Python, _inner contains a reference to db_connectors. You can see the reference under __closure__:

>>> gen = db_connect_n_clean()
>>> inner = next(gen)
>>> inner.__closure__
(<cell at 0x000001B73FE6A3E0: list object at 0x000001B73FE87240>,)
>>> inner.__closure__[0].cell_contents
[]

The name of the reference is the same as the variable:

>>> inner.__code__.co_freevars
('db_connectors',)

Every time this specific function, with this specific __closure__, tries to access db_connectors, it goes to the same list.

>>> inner(1)
Connect :  1
>>> inner(2)
Connect :  2
>>> inner.__closure__[0].cell_contents
[1, 2]
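
For contrast, a second call to db_connect_n_clean() creates a completely separate generator with its own frame and its own empty db_connectors list, so its _inner closes over a different cell (gen2 and inner2 below are just illustrative names, continuing the same session):

>>> gen2 = db_connect_n_clean()
>>> inner2 = next(gen2)
>>> inner2.__closure__[0].cell_contents
[]
>>> inner2.__closure__[0] is inner.__closure__[0]
False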

The original generator gen is still paused at the first yield:

>>> gen.gi_frame.f_lineno
6  # Gen is stopped at line #6
>>> gen.gi_frame.f_locals["db_connectors"]
[1, 2]

When you advance it again using next(), it continues on from the yield and closes everything:

>>> next(gen)
Dispose :  1
Dispose :  2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
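
The whole mechanism boils down to this: the generator body only runs between next() calls and pauses at each yield. A tiny stand-alone sketch of just that behaviour (nothing database-specific):

>>> def demo():
...     print("before yield")   # runs on the first next()
...     yield "bookmark"        # execution pauses right here
...     print("after yield")    # runs on the second next()
...
>>> g = demo()   # creating the generator runs no code yet
>>> next(g)      # runs up to the yield and hands back the yielded value
before yield
'bookmark'
>>> next(g)      # resumes after the yield and falls off the end
after yield
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration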

If you wish to understand how generators work in general, there are plenty of answers and articles on the subject. I wrote this one, for example.

If I didn't fully explain the situation, feel free to ask for clarification in the comments!

  • Thank you for the link to visual Python. I believe there is a minor typo in your reply: `next(a)` should be `next(gen)`. – mittal Jan 19 '22 at 13:32