From the documentation:
A Python program is constructed from code blocks. A block is a piece
of Python program text that is executed as a unit. The following are
blocks: a module, a function body, and a class definition. Each
command typed interactively is a block. A script file (a file given as
standard input to the interpreter or specified as a command line
argument to the interpreter) is a code block. A script command (a
command specified on the interpreter command line with the -c option)
is a code block. A module run as a top level script (as module
__main__) from the command line using a -m argument is also a code
block. The string argument passed to the built-in functions eval() and
exec() is a code block.
That's because your first code is inside a single code block (a module), which is compiled and executed as a unit. In the interactive shell, however, when you run the statements as two separate commands, they are in different code blocks.
Python can reuse the reference to some immutable objects, such as tuples, inside a code block. That's just an optimization, not a bug.
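As a rough sketch of that reuse, the snippet below passes both assignments to exec() as one string, so they form a single code block. The names one_block and ns are only illustrative, and the result reflects a CPython optimization rather than anything the language guarantees.

```python
# Both assignments live in ONE string, so exec() compiles them as one code block.
one_block = "a = (1, 2)\nb = (1, 2)\n"

ns = {}  # hypothetical namespace that receives the names a and b
exec(one_block, ns)

# On CPython the compiler stores equal constants of a code block only once,
# so both names are expected to refer to the same tuple object.
same_object = ns["a"] is ns["b"]
print("same object inside one block:", same_object)
```

Running the two assignments through two separate exec() calls is the closest equivalent of typing them as two commands in the REPL.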
Let's examine it with functions (remember, a function body is also a code block) and integers bigger than 256 this time:
# inside a python file
def fn1():
    a = 1000
    b = 1000
    print("id of 'a'", id(a))
    print("id of 'b'", id(b))

def fn2():
    c = 1000
    print("id of 'c'", id(c))

fn1()
fn2()
# id of 'a' 1738701965680
# id of 'b' 1738701965680
# id of 'c' 1738701965680
Now:
# inside interactive mode
>>> def fn1():
...     a = 1000
...     b = 1000
...     print("id of 'a'", id(a))
...     print("id of 'b'", id(b))
...
>>> def fn2():
...     c = 1000
...     print("id of 'c'", id(c))
...
>>> fn1()
id of 'a' 2441294813616
id of 'b' 2441294813616
>>> fn2()
id of 'c' 2441294813264
>>>
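When fn1 and fn2 are inside a module, they are compiled together as part of one code block (the module), so the constant 1000 can be shared between them. In the second case they are not in a module: each definition is typed as a separate command, so each is compiled and executed separately, and fn2 gets its own object. Within fn1, however, a and b still point to the same object.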
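The module case can be approximated without creating a file by compiling both definitions as one unit with compile(). This is only a sketch: SOURCE and namespace are made-up names, the functions return their values instead of printing ids, and whether a and c end up being the same object depends on the CPython version.

```python
# Both function definitions are compiled together, like the contents of one module.
SOURCE = """
def fn1():
    a = 1000
    b = 1000
    return a, b

def fn2():
    c = 1000
    return c
"""

namespace = {}  # hypothetical stand-in for the module's globals
exec(compile(SOURCE, "<one-unit>", "exec"), namespace)

a, b = namespace["fn1"]()
c = namespace["fn2"]()

# a and b come from the same function body, so CPython reuses one constant.
print("a is b:", a is b)
# a and c come from different function bodies compiled in the same unit;
# the result here is an implementation detail and may vary between versions.
print("a is c:", a is c)
```

Compiling the two definitions with two separate compile() calls mirrors what the interactive shell does with each command.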
Answer to a comment:
Why does Python use the memory addresses of immutable objects as their
IDs rather than hashing their content so that the same value always
yield the same ID?
Take a look at this answer, particularly where Martijn says: "3. The code object is not referenced by anything, reference count drops to 0 and the code object is deleted. As a consequence, so is the string object." So in the REPL, if you wanted the string to keep the same ID, the memory of the code object could never be freed. I think this is the main downside of that approach, and why the core developers didn't go that way.
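To make the difference between identity and value concrete, here is a small sketch (the names t1 and t2 are only for illustration). It builds two equal tuples at runtime, so no compile-time constant sharing is involved: their hashes match because hashing is based on content, while their ids differ because id() identifies the object itself.

```python
# Build the tuples at runtime so they are not compile-time constants.
t1 = tuple([1, 2])
t2 = tuple([1, 2])

equal_values = t1 == t2              # same content
equal_hashes = hash(t1) == hash(t2)  # equal objects must hash equally
same_identity = t1 is t2             # on CPython these are two separate objects

print("equal values:", equal_values)
print("equal hashes:", equal_hashes)
print("same identity:", same_identity)
print("ids:", id(t1), id(t2))
```

Two objects that are alive at the same time never share an id, which is why id() cannot simply be derived from the value.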