As for exec() and its behavior with locals, there is already an open discussion here: How does exec work with locals?.
Regarding the question, it seems practically impossible to test this by dynamically adding variables to the local namespace that backs the function's __code__.co_varnames, because that namespace is restricted to code that is byte-compiled together. Functions like exec and eval are bound by the same restriction in other situations, such as executing code that contains private variables:
In [154]: class Foo:
     ...:     def __init__(self):
     ...:         __private_var = 100
     ...:         exec("print(__private_var)")
In [155]: f = Foo()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-155-79a961337674> in <module>()
----> 1 f = Foo()
<ipython-input-154-278c481fbd6e> in __init__(self)
2 def __init__(self):
3 __private_var = 100
----> 4 exec("print(__private_var)")
5
6
<string> in <module>()
NameError: name '__private_var' is not defined
Read https://stackoverflow.com/a/49208472/2867928 for more details.
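To make the "byte-compiled together" point concrete, here is a minimal sketch (the names report and probe are mine, purely for illustration). It suggests two things: the private name is stored under its mangled form, which is why the separately compiled string cannot find it, and a name assigned through exec() never shows up in the function's co_varnames.

```python
class Foo:
    def report(self):
        __private_var = 100
        # Compiled together with the class body, so the name is mangled.
        return Foo.report.__code__.co_varnames


def probe():
    a = 1
    # The string is compiled separately; it cannot add a fast local.
    exec("b = 2")
    return probe.__code__.co_varnames


mangled_names = Foo().report()
probe_names = probe()
print(mangled_names)  # contains '_Foo__private_var', not '__private_var'
print(probe_names)    # contains 'a' but not 'b'
```

Since co_varnames is fixed when the function is compiled, the only way to grow it is to compile a new function.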
However, this doesn't mean that we can't work out the limit in theory, i.e. by analyzing how Python stores local variables in memory.
To do this, we can look at the bytecode of a function and see how the respective instructions are stored. The dis module is a great tool for disassembling Python code; for example, we can disassemble a simple function as follows:
>>> # VERSIONS BEFORE PYTHON-3.6
>>> import dis
>>>
>>> def foo():
...     a = 10
...
>>> dis.dis(foo)
  2           0 LOAD_CONST               1 (10)
              3 STORE_FAST               0 (a)
              6 LOAD_CONST               0 (None)
              9 RETURN_VALUE
Here the leftmost number is the source line number of the code. The column of numbers after it gives the offset of each instruction in the bytecode.
The STORE_FAST opcode stores TOS (top of stack) into the local co_varnames[var_num]. Since the difference between its offset and that of the next opcode is 3 (6 - 3), each STORE_FAST instruction occupies only 3 bytes. The first byte holds the operation (the opcode itself); the other two bytes hold its operand, which means there are 2^16 possible values.
Therefore, in one byte-compile, a function can theoretically have only 65536 local variables.
Since Python 3.6 the interpreter uses 16-bit wordcode instead of bytecode. This aligns every instruction to 2 bytes, rather than 1 or 3, by giving the argument only 1 byte.
So if you disassemble the function in a later version you'll get the following result, in which STORE_FAST occupies two bytes in total:
>>> dis.dis(foo)
  2           0 LOAD_CONST               1 (10)
              2 STORE_FAST               0 (a)
              4 LOAD_CONST               0 (None)
              6 RETURN_VALUE
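If you'd rather check the instruction width programmatically than read it off the listing, a rough sketch along these lines should work on CPython 3.6 or later. The exact opcodes printed differ between versions, so only the offsets are inspected here.

```python
import dis


def foo():
    a = 10


instructions = list(dis.get_instructions(foo))
for ins in instructions:
    # offset, opcode name and raw argument of each instruction
    print(ins.offset, ins.opname, ins.arg)

# With wordcode, every instruction starts on an even offset
# and the raw code string has an even length.
all_even = all(ins.offset % 2 == 0 for ins in instructions)
code_length = len(foo.__code__.co_code)
print(all_even, code_length)
```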
However, @Alex Hall showed in a comment that you can exec a whole function with more than 2^16 variables, which also makes them available in __code__.co_varnames. This still doesn't make it practically feasible to test the hypothesis, because with exponents above 20 the run becomes more and more time consuming. However, here is the code:
In [23]: code = '''
    ...: def foo():
    ...: %s
    ...:     print('sum:', sum(locals().values()))
    ...:     print('add:', var_100 + var_200)
    ...:
    ...: ''' % '\n'.join(f'    var_{i} = {i}'
    ...:                 for i in range(2**20))
    ...:
In [24]: exec(code)
In [25]: foo()
sum: 549755289600
add: 300
In [26]: len(foo.__code__.co_varnames)
Out[26]: 1048576
This means that although STORE_FAST has only a small fixed-width operand (two bytes before Python 3.6, one byte since) and "theoretically" can't address more than 2^16 different variables, there has to be some other mechanism that makes it possible to address more. As it turns out, that mechanism is EXTENDED_ARG, which, as the (pre-3.6) documentation says, prefixes any opcode whose argument is too big to fit into the default two bytes. With it the argument can be up to four bytes wide, so the limit is 2^(16 + 16) = 2^32. In Python 3.6+ the default argument is one byte and up to three EXTENDED_ARG prefixes can be chained, which again gives a four-byte argument.
EXTENDED_ARG(ext)
Prefixes any opcode which has an argument too big to fit into the default two bytes. ext holds two additional bytes which, taken together with the subsequent opcode’s argument, comprise a four-byte argument, ext being the two most-significant bytes.
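You can watch EXTENDED_ARG appear without generating a huge function. The sketch below assumes CPython 3.6+, where the operand is a single byte, so 300 locals are already enough to cross the boundary; the name many_locals is made up for the example.

```python
import dis

# 300 locals: just past the 256 values a one-byte operand can index.
body = "\n".join(f"    var_{i} = {i}" for i in range(300))
namespace = {}
exec(f"def many_locals():\n{body}\n", namespace)
func = namespace["many_locals"]

instructions = list(dis.get_instructions(func))
uses_extended_arg = any(ins.opname == "EXTENDED_ARG" for ins in instructions)

# dis folds the prefix into the following instruction's argument,
# so stores to the later locals report an argument above 255.
wide_stores = [ins for ins in instructions
               if ins.opname == "STORE_FAST"
               and ins.arg is not None and ins.arg > 255]

print(len(func.__code__.co_varnames))
print(uses_extended_arg)
print(len(wide_stores))
```

The same idea scales to the 2^16 boundary of older versions, only with a much longer compile time.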