Does the Python interpreter resolve variable references when a function is defined but not called?

Question

First of all, this post does NOT answer my question or give me any guide to answer my question at all.

My question is about the mechanism by which a function resolves non-local variables.

Code

# code block 1
def func():
    vals = [0, 0, 0]
    other_vals = [7, 8, 9]
    other = 12

    def func1():
        vals[1] += 1
        print(vals)

    def func2():
        vals[2] += 2
        print(vals)

    return (func1, func2)

f1, f2 = func()

Try to run f1, f2:

>>> f1()
[0, 1, 0]
>>> f2()
[0, 1, 2]

This shows that the object previously referred to by vals is shared by f1 and f2, and is not garbage collected after func finishes executing.

Will the objects referred to by other_vals and other be garbage collected? I think so. But how does Python decide not to garbage collect vals?

Assumption 1

The Python interpreter resolves the variable names within func1 and func2 to figure out which objects the functions reference, and increases the reference count of [0, 0, 0] by 1, preventing it from being garbage collected after the call to func returns.

But if I do

# code block 2
def outerfunc():
    def innerfunc():
        print(non_existent_variable)
f = outerfunc()

No error is reported. Furthermore,

# code block 3
def my_func():
    print(yet_to_define)
yet_to_define = "hello"

works.

Assumption 2

Variable names are resolved dynamically at run time. This makes the observations in code blocks 2 and 3 easy to explain, but how did the interpreter know it needs to increase the reference count of [0, 0, 0] in code block 1?

Which assumption is correct?

https://en.wikipedia.org/wiki/Garbage_collection_(computer_science) — spectras, Dec 19 '15 at 06:17
(also worth reading, though I don't know if it carries well to newer versions: https://docs.python.org/release/2.5.2/ext/refcounts.html ) — spectras, Dec 19 '15 at 06:18
I agree that the dupe target doesn't adequately answer your question. Hopefully, it'll get re-opened. — PM 2Ring, Dec 19 '15 at 07:39
If I understand the question, the linked dupe doesn't really answer it. I think what the OP is looking for is actually answered in Ned Batchelder's blog post [Facts and myths about Python names and values](http://nedbatchelder.com/text/names.html). I'm pretty sure this question is still a duplicate, though. Perhaps [this question](http://stackoverflow.com/questions/20246523/how-references-to-variables-are-resolved-in-python) would be a better dupe? — Daniel Pryden, Dec 19 '15 at 07:41
In the meantime, I'll give a quick summary here. Your 1st example creates a closure, so the interpreter stores a reference to `vals` in the returned function objects `func1` and `func2`, and that ref prevents `vals` from being garbage collected. See [What exactly is contained within a obj.__closure__?](http://stackoverflow.com/q/14413946/4014959). In your 2nd example `non_existent_variable` is presumably a global, and you _will_ get an error if you don't define it before calling `f`. — PM 2Ring, Dec 19 '15 at 07:41
http://stackoverflow.com/questions/13857/can-you-explain-closures-as-they-relate-to-python may also be of interest. Note the use of the "new" `nonlocal` keyword in J.F. Sebastian's answer, which allows reassignment to a closed-over name. — PM 2Ring, Dec 19 '15 at 07:45
@DanielPryden: That other question doesn't really discuss the garbage collection issue, but it _is_ full of excellent information. — PM 2Ring, Dec 19 '15 at 07:53

score 3 · Accepted Answer · edited May 23 '17 at 10:27

Your first example creates a closure; also see "Why aren't python nested functions called closures?", "Can you explain closures (as they relate to Python)?", and "What exactly is contained within a obj.__closure__?".

The closure mechanism ensures that the interpreter stores a reference to vals in the returned function objects func1 and func2. Your Assumption 1 is correct: that reference prevents vals from being garbage collected when func returns.
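As a rough illustration, the shared reference can be inspected through each function's __closure__ cells. This is a sketch that assumes CPython's attribute names, and the inner functions return vals instead of printing it (a change from the question's code) so the values are easy to check:

```python
def func():
    vals = [0, 0, 0]
    other_vals = [7, 8, 9]  # not used by any inner function, so nothing keeps it alive

    def func1():
        vals[1] += 1
        return vals  # returns instead of printing, unlike the question's code

    def func2():
        vals[2] += 2
        return vals

    return func1, func2


f1, f2 = func()

# Each returned function carries one closure cell, and both cells hold the same list.
list_in_f1 = f1.__closure__[0].cell_contents
list_in_f2 = f2.__closure__[0].cell_contents
shared = list_in_f1 is list_in_f2
print(len(f1.__closure__), shared)  # 1 True
```

Only vals appears in the closure; other_vals and other are ordinary locals of func and become unreachable once func returns.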

In your second example, the interpreter cannot see a binding for non_existent_variable in the enclosing scope(s), but that's OK because your Assumption 2 is also correct: a name that isn't bound in an enclosing function scope is looked up at run time. You're free to use names that haven't yet been bound to objects when the function is defined, so long as the name is bound in an accessible scope when you actually call the function.
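A small sketch of that run-time lookup, reusing the names from code block 3. Here my_func returns the value instead of printing it, and the outcome variable is only there for illustration:

```python
def my_func():
    # yet_to_define is looked up as a global each time my_func runs,
    # not when the def statement is executed.
    return yet_to_define


try:
    my_func()  # the name is not bound yet, so the lookup fails here
    outcome = "no error"
except NameError:
    outcome = "NameError"

yet_to_define = "hello"
result = my_func()  # the same function now finds the name

print(outcome, result)  # NameError hello
```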

The answer to "how did the interpreter know it needs to increase the reference count of [0, 0, 0] in code block 1?" is that the closure mechanism is an explicit step the interpreter performs when it executes a function definition, i.e., when it creates a function object from the function definition in your script.
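One way to see the difference between the two cases is to compare the code objects of the inner functions. The sketch below assumes CPython's code-object attributes (co_freevars, co_names) and adds a return statement to outerfunc, which the question's code block 2 does not have, so the inner function can be inspected:

```python
def func():
    vals = [0, 0, 0]

    def func1():
        vals[1] += 1

    return func1


def outerfunc():
    def innerfunc():
        print(non_existent_variable)

    return innerfunc  # added for inspection; not in the question's code


func1 = func()
innerfunc = outerfunc()

# vals is bound in the enclosing function, so it is recorded as a free variable
# and a closure cell is attached when the def statement runs.
print(func1.__code__.co_freevars)     # ('vals',)
print(func1.__closure__ is not None)  # True

# non_existent_variable is not bound in any enclosing function, so it is
# treated as a global name and no closure is created for it.
print(innerfunc.__code__.co_freevars)  # ()
print(innerfunc.__closure__)           # None
print("non_existent_variable" in innerfunc.__code__.co_names)  # True
```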

Every Python function object (both normal def-style functions and lambdas) has an attribute that stores this closure information: __closure__, which Python 2 also exposes under the older name func_closure. See the links at the start of this answer for details. I will mention here that Python 3 provides the nonlocal keyword, which works a bit like the global keyword: nonlocal lets you assign to closed-over simple variables. J.F. Sebastian's answer has a simple example illustrating the use of nonlocal.
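For reference, a minimal nonlocal sketch (Python 3 only). The names make_counter, count, and increment are made up for this illustration and are not taken from the linked answer:

```python
def make_counter():
    count = 0  # a simple (immutable) closed-over variable

    def increment():
        nonlocal count  # rebind count in make_counter's scope, not a new local
        count += 1
        return count

    return increment


counter = make_counter()
first = counter()
second = counter()
print(first, second)  # 1 2
```

Without the nonlocal declaration, count += 1 would make count a local variable of increment and the call would raise UnboundLocalError. The vals[1] += 1 in the question needs no such declaration because it mutates the list rather than rebinding the name.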

Note that with nested functions the inner function definitions are processed each time you call the outer function, which allows you to do things like:

def func(vals):
    def func1():
        vals[1] += 1
        print(vals)

    def func2():
        vals[2] += 2
        print(vals)

    return func1, func2

f1, f2 = func([0, 0, 0])
f1()
f2()

f1, f2 = func([10, 20, 30])
f1()
f2()

output

[0, 1, 0]
[0, 1, 2]
[10, 21, 30]
[10, 21, 32]
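To make the "processed each time" point concrete, here is a hedged sketch (CPython behaviour, with a trimmed-down func that returns only func1): every call to the outer function builds a new function object with its own closure cell, while the compiled code is reused.

```python
def func(vals):
    def func1():
        vals[1] += 1
        return vals

    return func1


a = func([0, 0, 0])
b = func([10, 20, 30])

same_function = a is b                # False: a new function object per call
same_code = a.__code__ is b.__code__  # True: the body is compiled only once
separate_lists = a.__closure__[0].cell_contents is not b.__closure__[0].cell_contents

print(same_function, same_code, separate_lists)  # False True True
```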

Learned a lot. Thank you! — Frozen Flame, Dec 19 '15 at 11:26
