
When I run the following Python3 code,

def f():
  x = 'foo'
  def g():
    return x
  def h(y):
    nonlocal x
    x = y
  return g, h
g, h = f()
print(g())
h('bar')
print(g())

I get

foo
bar

I had believed that in Python, all local variables are essentially pointers. So in my mind, x was a pointer allocated on the stack when f() is called, and when f() exits, the variable x should die. Since the string 'foo' was allocated on the heap, when g() is called, I thought "ok, I guess g() kept a copy of the pointer to 'foo' as well". But then when I call h('bar'), the value that g() returns gets updated too.

Where does the variable x live? Is my model of local variables in Python all wrong?

EDIT:

@chepner has pretty much answered my question in the comments. In one comment he says that local variables are stored in a dict-like object, and in another he links https://docs.python.org/3.4/reference/executionmodel.html#naming-and-binding to support that claim.

At this point I am pretty happy. If chepner were to write an answer rehashing his comments I would accept it as best answer. Otherwise, consider this question resolved.

math4tots
  • That is what a closure is: if a function closes over any objects, it maintains a reference to those objects. – Padraic Cunningham Jul 29 '14 at 19:11
  • _"so when `f()` exits, the variable `x` should die."_ Nah. Variables live past the end of the function they were created in, provided a reference to them still exists. `x` will live at least as long as `g` and `h`. – Kevin Jul 29 '14 at 19:15
  • @Kevin So does Python not allocate variables on a stack then? Because doing so would seem to force the order in which those variables die. – math4tots Jul 29 '14 at 19:16
  • [This post](http://stackoverflow.com/questions/14546178/does-python-have-a-stack-heap-and-how-is-memory-managed) seems to indicate that all variables are allocated on the heap (for the usual CPython implementation of Python, at least). – Kevin Jul 29 '14 at 19:19
  • If you `print(g.__closure__)` you will see a reference to x. – Padraic Cunningham Jul 29 '14 at 19:21
  • @PadraicCunningham So is some sort of `closure` object created that contains another pointer that doubles as the original `x`? Or are variables actually pointers to pointers so that the closure just collects the pointers before the pointers to pointers die? – math4tots Jul 29 '14 at 19:30
  • @Kevin From a cursory glance, that post seems to suggest to me that objects in Python are allocated on the heap, like in Java. However in the Java I'm familiar with, when you have a function inside a function (e.g. through anonymous classes), you can only capture final variables. In Java, even though objects are allocated on the heap, "variables" (the pointers to the objects on the heap) are still allocated on the stack. – math4tots Jul 29 '14 at 19:32
  • @metaperture That is somewhat how I originally thought of it. There is a pointer `a` allocated on the stack, and through assignment you get it to point to other things. But the question I have here is what happens when the function returns and that part of the stack goes away. – math4tots Jul 29 '14 at 19:38
  • @math4tots, as far as I understand, the local function "closes over" the objects it needs and prevents them from being garbage collected. It is the special `__closure__` attribute that keeps the objects alive and maintains the references to them. – Padraic Cunningham Jul 29 '14 at 19:39
  • Python doesn't use the stack the same way you are thinking of. Local variables aren't stored on the stack; they are stored in the heap. What you are thinking of as a stack frame is really a dict-like object that is *also* stored on the heap that contains references to the local variables. When `f` returns, its *reference* to that dict-like object goes away, but if another object (like `g`) maintains a reference to it, it remains on the heap as long as that reference exists. – chepner Jul 29 '14 at 19:41
  • @chepner Ah, I see. I guess that answers my question. Though, after having one model of execution debunked, I am somewhat skeptical; is there some documentation you could point me to that verifies this? I would rather not blindly dig through the source without at least some guidance on where to look... – math4tots Jul 29 '14 at 19:43
  • @math4tots Sorry for the confusion; my answer worked completely differently in py3 than in py2. – metaperture Jul 29 '14 at 19:46
  • https://docs.python.org/3.4/reference/executionmodel.html#naming-and-binding probably answers your question, but requires close attention to the definitions. – chepner Jul 29 '14 at 19:51

1 Answer

@chepner answered my question ages ago in the comments.

I would've accepted @chepner's answer if he had posted one. But it's been almost a year, and I think it's fair for me to just post the answer myself now and accept it.


I originally asked the question because I didn't understand what was stored in the stack frame in Python.

In Java, if you have code that looks like:

void f() {
  Integer x = 1;
  Integer y = 2;
}

Memory is allocated on the stack for two pointers.

However, in Python, if you have code like

def f():
  x = 1
  y = 2

No extra memory is allocated on the stack.

In pseudo-C++, the Python code above is more analogous to

void f() {
  PythonDict * scope = MakePythonDict();
  scope->set("x", MakeInt(1));
  scope->set("y", MakeInt(2));
}

So for my original question, the variable x stays alive because it lives in the dict pointed to by scope, which gets passed around in the closure.
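
The same idea can be mimicked in plain Python without relying on closures at all. This is only a sketch of the mental model, not what the interpreter actually does: `Box` is a hypothetical class I made up to stand in for the heap-allocated slot, and the `x=x` default arguments are just a trick to hand each inner function its own reference to that slot.

```python
class Box:
    """Hypothetical stand-in for the heap-allocated slot that holds x."""

    def __init__(self, value):
        self.value = value


def f():
    x = Box('foo')

    # Default arguments are evaluated once, when the def statement runs,
    # so g and h each keep a reference to the same Box. Inside g and h,
    # x is an ordinary parameter, so no closure is involved.
    def g(x=x):
        return x.value

    def h(y, x=x):
        x.value = y  # mutate the shared Box instead of rebinding a name

    return g, h


g, h = f()
first = g()
h('bar')
second = g()
print(first, second)
```

The local name x inside f goes away when f returns, but the Box it referred to survives because g and h still hold references to it.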

Notice that the scope pointer still dies when the function exits.
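
For what it's worth, CPython doesn't literally hand a dict to the inner functions. As far as I can tell, the shared slot shows up as a "cell" object in each function's `__closure__` tuple, and the dict picture above is best read as an analogy. A small sketch that pokes at the functions from my question (the names `cell`, `before` and `after` are just mine for illustration):

```python
def f():
    x = 'foo'

    def g():
        return x

    def h(y):
        nonlocal x
        x = y

    return g, h


g, h = f()

# Names that g looks up in an enclosing function scope.
print(g.__code__.co_freevars)

# g and h share one cell object, which is why h can change what g sees.
cell = g.__closure__[0]
print(cell is h.__closure__[0])

before = cell.cell_contents
h('bar')
after = cell.cell_contents
print(before, after)
```

The cell is an ordinary heap object, so it stays alive for as long as g or h does, even though f's frame is long gone.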

Ethan Furman
math4tots