How are variables and memory managed in Python? Does it have a stack and a heap, and what algorithm is used to manage memory? Given this knowledge, are there any recommendations on memory management for large number/data crunching?
-
1You might want to have a read of the following two: http://foobarnbaz.com/2012/07/08/understanding-python-variables/ http://docs.python.org/2/c-api/memory.html – user1690293 Jan 27 '13 at 09:53
-
1Is there some specific issue with Python var/memory management that you are having a problem with and is not trivially discovered by the Python documentation and/or Googling? – Martin James Jan 27 '13 at 09:54
2 Answers
How are variables and memory managed in Python?
Automagically! No, really, you just create an object and the Python Virtual Machine handles the memory needed and where it shall be placed in the memory layout.
Does it have a stack and a heap and what algorithm is used to manage memory?
When we are talking about CPython, it uses a private heap for storing objects. From the CPython C API documentation:
Memory management in Python involves a private heap containing all Python objects and data structures. The management of this private heap is ensured internally by the Python memory manager. The Python memory manager has different components which deal with various dynamic storage management aspects, like sharing, segmentation, preallocation or caching.
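You can watch allocations on that private heap from within a program using the standard tracemalloc module; a minimal sketch:
import tracemalloc

tracemalloc.start()

# Allocate a bunch of objects on CPython's private heap
data = [list(range(100)) for _ in range(1_000)]

current, peak = tracemalloc.get_traced_memory()
print(f"current: {current / 1024:.1f} KiB, peak: {peak / 1024:.1f} KiB")

tracemalloc.stop()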
Memory reclamation is mostly handled by reference counting. That is, the Python VM keeps an internal journal of how many references refer to an object, and automatically garbage collects it when there are no more references referring to it. In addition, there is a mechanism to break circular references (which reference counting can't handle) by detecting unreachable "islands" of objects, somewhat in reverse of traditional GC algorithms that try to find all the reachable objects.
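A minimal sketch of both mechanisms, using the standard sys and gc modules:
import sys
import gc

x = []
# getrefcount reports one extra reference (the temporary argument), so a fresh list shows 2
print(sys.getrefcount(x))

# Build a reference cycle that plain reference counting can never reclaim
a = {}
b = {"other": a}
a["other"] = b
del a, b  # the two dicts still reference each other, so their counts never reach zero

# The cycle detector finds such unreachable "islands" and frees them
print(gc.collect())  # number of unreachable objects found and collected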
NOTE: Please keep in mind that this information is CPython specific. Other Python implementations, such as PyPy, IronPython, Jython and others, may differ from one another and from CPython when it comes to their implementation specifics. To understand that better, it may help to know that there is a difference between Python the language (its semantics) and the underlying implementation.
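If it matters to your code, you can check at runtime which implementation you are running on, for example:
>>> import platform, sys
>>> platform.python_implementation()
'CPython'
>>> sys.implementation.name
'cpython'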
Given this knowledge are there any recommendations on memory management for large number/data crunching?
Now, I cannot speak about this from experience, but I am sure that NumPy (the most popular Python library for number crunching) has mechanisms that handle memory consumption gracefully.
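One common recommendation is to preallocate arrays and use in-place (or `out=`) operations, so that large temporaries are not repeatedly allocated and freed; a minimal sketch:
import numpy as np

n = 10_000_000
a = np.random.random(n)
b = np.random.random(n)

# Preallocate the result buffer once instead of creating a new array per operation
out = np.empty_like(a)

np.multiply(a, b, out=out)  # writes a * b into the existing buffer, no temporary array
out += 1.0                  # in-place add reuses the same memory

# A smaller dtype halves memory use when float32 precision is enough
a32 = a.astype(np.float32)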
If you would like to know more about Python's Internals take a look at these resources:
- Stepping through CPython (video)
- A presentation about the internals of the Python Virtual Machine
- In true hacker spirit, the CPython Object Allocator source code
-
3Note that *local variables* will have the actual variables stored in the equivalent of a stack frame. – Marcin Jul 20 '13 at 18:45
-
3Python isn't Java; it doesn't have a virtual machine; it has an interpreter. It may seem pedantic to point this out but they are two different paradigms and the difference has important implications for how code is compiled and run. https://stackoverflow.com/questions/441824/java-virtual-machine-vs-python-interpreter-parlance – Apollo2020 Mar 10 '19 at 21:11
-
@Apollo2020 The CPython interpreter contains its own virtual machine implementation, so yes, it is a VM; you're right that it isn't the JVM, but it is still a VM. At runtime, all your Python code is first "compiled" into bytecode native to the CPython VM; that is why `.pyc` files exist: they are bytecode for the CPython VM saved to a file to avoid re-compiling it. – ThisGuyCantEven May 31 '23 at 17:34
-
NumPy is simply a wrapper around native LAPACK implementations (e.g. OpenBLAS, MKL), which have their own memory management outside of CPython. This is 100% a requirement because the Python VM's reference-counting operations are not atomic and therefore not thread-safe (which is why the GIL exists). If NumPy used CPython's memory management under the hood, it would be terribly slow because it would be GIL-bound and could not parallelize numerical operations. – ThisGuyCantEven May 31 '23 at 17:41
-
@Apollo2020 Also, Python is not a runtime or interpreter, it is a language. The interpreters are just applications that parse your Python code into an AST, then use that AST to emit bytecode for whatever runtime is being used (`.pyc` for CPython, probably `.class` for Jython, because `.class` files are the bytecode of the JVM). Even PyPy eventually emits bytecode to run on LLVM (also using another intermediate representation in between). – ThisGuyCantEven May 31 '23 at 17:46
Python doesn't have any such thing.
Python is the language; it does not specify exactly how implementations must achieve the semantics it defines.
Every implementation (CPython, PyPy, IronPython, Stackless, Jython...) is free to do its own thing!
In CPython, all objects live on the heap:
Memory management in Python involves a private heap containing all Python objects and data structures.1
The CPython virtual machine is stack based:
>>> def g():
...     x = 1
...     y = 2
...     return f(x, y)
...
>>> import dis
>>> dis.dis(g)
  2           0 LOAD_CONST               1 (1)    # Push 1 onto the stack
              3 STORE_FAST               0 (x)    # Store top of stack into local var x
  3           6 LOAD_CONST               2 (2)    # Push 2 onto the stack
              9 STORE_FAST               1 (y)    # Store TOS into local var y
  4          12 LOAD_GLOBAL              0 (f)    # Push f onto the stack
             15 LOAD_FAST                0 (x)    # Push x onto the stack
             18 LOAD_FAST                1 (y)    # Push y onto the stack
             21 CALL_FUNCTION            2        # Call f with 2 arguments from the stack;
                                                  # f's return value is pushed onto the stack
             24 RETURN_VALUE                      # Return TOS to the caller (the result of f)
Keep in mind that this is CPython specific. The stack does not contain the actual values, though; it holds references to those objects.
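For instance, assigning a list to a second name only copies the reference, not the object:
>>> x = [1, 2, 3]
>>> y = x          # only the reference is copied; no new list is created
>>> y is x
True
>>> y.append(4)
>>> x              # both names point at the same heap object
[1, 2, 3, 4]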
1: Source