
While trying to explain here on Stack Overflow what the Python function id() does and how it can be used to reveal how Python works under the hood, I ran into the following strange behavior, which I am struggling to understand:

TERMINAL-PROMPT $ python3.9 -V
Python 3.9.13
TERMINAL-PROMPT $ cat       "list_and_int_1.py"
i = 12345
assert id(i) == id( 12345 )
TERMINAL-PROMPT $ python3.9 "list_and_int_1.py"
TERMINAL-PROMPT $ python3.9
Python 3.9.13 (main, May 20 2022, 21:21:14) 
[GCC 5.4.1 20160904] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> i = 12345
>>> assert id(i) == id( 12345 )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

So my question is: why is there an AssertionError when I run the code at the Python prompt, but not when I run the same code from a file?

Since my question was closed without attention to what it is actually about, I ask: PLEASE read the question.

I am fully aware of the issue with cached integers, so my question is NOT about that. My question is why the same code behaves differently at the Python prompt than when it is run from a file. The Python version is exactly the same. So why does the result depend on where the code is run from?

My question is definitely NOT a duplicate of: https://stackoverflow.com/questions/306313/is-operator-behaves-unexpectedly-with-integers .

And ... YES, I was aware of the possibility that assertion errors might simply not be shown when running the file. I checked this by changing the assert statement in the file so that it must raise an AssertionError. The error did show up in the console, which rules this out as a possible reason.

TERMINAL-PROMPT $ cat       "list_and_int.py"
i = 12345
assert id(i) == id( 12345 )
TERMINAL-PROMPT $ python3.9 "list_and_int.py"
TERMINAL-PROMPT $ cat       "list_and_int.py"
i = 12345
assert id(i) != id( 12345 )
TERMINAL-PROMPT $ python3.9 "list_and_int.py"
Traceback (most recent call last):
  File "/home/.../list_and_int.py", line 2, in <module>
    assert id(i) != id( 12345 )
AssertionError

P.S. I have seen the urge to close questions here on Stack Overflow many times, and have sometimes been unable to post an answer myself because a question was closed very quickly and for the wrong reason. I understand the time pressure of the modern world ... but ... that is no excuse for not giving a moment's thought to why the question is there and what it is really about.

P.P.S. Source: https://docs.python.org/3.10/library/functions.html

id(object)
Return the “identity” of an object. 
This is an integer which is guaranteed to be unique and constant
for this object during its lifetime. Two objects with 
non-overlapping lifetimes may have the same id() value.
    

CPython implementation detail: This is the address of the object in memory.

So is this documentation WRONG? Do different id() values not mean that the objects are stored in different memory locations? Can two objects stored at different positions in memory have the same id()?

Claudio
  • The answer you are looking for is here: ["is" operator behaves unexpectedly with integers](https://stackoverflow.com/questions/306313/is-operator-behaves-unexpectedly-with-integers) – Marco Bonelli Aug 31 '22 at 01:43
  • The details aren't relevant. The underlying point is that `is` should not be used to compare integers. – Karl Knechtel Aug 31 '22 at 01:55
  • No, that one is also a duplicate (thanks for highlighting it). That kind of fine distinction is not useful, and it herds future site-searchers away from the best quality information. The linked duplicate is canonical, and the answers cover the relevant material: the caching for small integers is an implementation detail; the behaviour is not defined in general; and there is *no good* reason to expect `is` on numerically equal integers to return either `True` or `False` without considerable, conscious legwork that is almost certainly a bad idea anyway. Just don't make these comparisons. – Karl Knechtel Aug 31 '22 at 02:02
  • Although we do have [What's with the integer cache maintained by the interpreter?](https://stackoverflow.com/questions/15171695/) which is perhaps a bit more precise. – Karl Knechtel Aug 31 '22 at 02:05
  • My plea to all: **PLEASE** read the question. I am fully aware of the cached integers, so my question is *NOT* about that. My question is why the same code behaves differently at the Python prompt than when it is run from a file. – Claudio Aug 31 '22 at 02:44
  • @Claudio do you have a `PYTHONOPTIMIZE` environment variable set? That would prevent assert statements from executing when executing python files through the terminal/command-line. – Paul M. Aug 31 '22 at 03:14
  • @PaulM. I was aware of that and have checked this out and changed the assert statement so that it will raise AssertionError to see if it shows up in the console and it did show up. – Claudio Aug 31 '22 at 03:17
  • @PaulM. printenv does not list a `PYTHONOPTIMIZE` environment variable. Does this answer your question, or should I use another tool/way to check? – Claudio Aug 31 '22 at 03:27
  • By the way: has any of you tried to reproduce the behavior I describe in the question? With what result? – Claudio Aug 31 '22 at 03:31
  • *The details aren't relevant. The underlying point is that is should not be used to compare integers. – Karl Knechtel* is an excellent example of missing the point. My question is not about comparing integers, but about the same code behaving differently depending on how it is run: from a file or from the Python prompt. What is *irrelevant* in this context is that `assert` and `id()` happen to be involved. It could possibly be other statements or functions, but I have only seen it with these. That's all. – Claudio Aug 31 '22 at 03:54
  • @KarlKnechtel This is not about the integer cache, but about code objects' `co_consts` being reused. See my answer... – AKX Sep 01 '22 at 14:15
  • "but about different behavior of same code depending on how it is run: from file or from Python prompt. " My point is that *there is no practical reason to care about this difference, because there is no practical reason for writing code that would care about the difference, and trying to do it is a good way to introduce bugs.* – Karl Knechtel Sep 01 '22 at 19:59
  • @KarlKnechtel: it all turned out not to be about Python's caching mechanisms. It turned out to be about compiler optimizations that are possible when a larger amount of code is compiled at once, but not when it is compiled piece by piece. You seem to be unaware that this has far-reaching consequences and side effects that matter when you are optimizing your application's memory usage. Two files compiled separately and run one after another might not fit into memory, whereas the same code compiled from one file would. – Claudio Sep 01 '22 at 20:28
  • "the optimizations used by the compiler which are possible for a large amount of code if compiled at once" - there is no such thing. What we have here is a very small, local optimization, of the same kind as with string interning. These are only implementation details, subject to change and in no way part of the definition of the language. If you were to have to rely on such details, well, Python probably isn't the language you need for that application. – Thierry Lathuille Sep 02 '22 at 07:43

1 Answer


The people in the comments are correct: don't use `is` (or compare `id()` values) unless you really know why you'd do that.

Anyway, the answer is that CPython merges equal constants when it compiles a code object, so a literal that appears several times in the same compiled unit is stored only once.

The disassembly for the module

a = 123456
b = 234567
c = 345678
d = 123456

is

  1           0 LOAD_CONST               0 (123456)
              2 STORE_NAME               0 (a)

  2           4 LOAD_CONST               1 (234567)
              6 STORE_NAME               1 (b)

  3           8 LOAD_CONST               2 (345678)
             10 STORE_NAME               2 (c)

  4          12 LOAD_CONST               0 (123456)
             14 STORE_NAME               3 (d)
             16 LOAD_CONST               3 (None)
             18 RETURN_VALUE

As you can see, the constant 123456 from "constant slot" 0 is used twice when assigning to a name.
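To reproduce this disassembly yourself, something along these lines should work. It is only a sketch: the `"<example>"` filename and the `int_loads` variable are made up for illustration, and the exact opcodes and offsets printed by `dis` differ between CPython versions.

```python
import dis

source = """
a = 123456
b = 234567
c = 345678
d = 123456
"""

# Compile the whole module at once, the way running a file does.
code = compile(source, "<example>", "exec")
dis.dis(code)

# For every LOAD_CONST of an integer, record (co_consts index, value).
int_loads = [
    (ins.arg, ins.argval)
    for ins in dis.get_instructions(code)
    if ins.opname == "LOAD_CONST" and isinstance(ins.argval, int)
]
print(int_loads)
```

The first and last entries of `int_loads` should carry the same index, which is the "slot 0 used twice" observation from the listing above.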

In the REPL, each line (well, each entry, since you know you can enter full suites in the REPL too) is compiled separately into its own code object, so a constant created for one entry is not reused by the next one.
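The difference can be mimicked without opening a REPL. This is a rough sketch, assuming that compiling each line on its own in `"single"` mode is a fair stand-in for what the interactive prompt does; the helper names `run_together` and `run_line_by_line` are hypothetical.

```python
def run_together(source):
    """Compile and run the whole source as one code object (like a file)."""
    namespace = {}
    exec(compile(source, "<file>", "exec"), namespace)
    return namespace


def run_line_by_line(source):
    """Compile and run each line as its own code object (roughly like the REPL)."""
    namespace = {}
    for line in source.splitlines():
        if line.strip():
            exec(compile(line, "<stdin>", "single"), namespace)
    return namespace


source = "i = 12345\nsame = id(i) == id(12345)\n"

together = run_together(source)["same"]
separate = run_line_by_line(source)["same"]

# On a CPython like the 3.9 from the question, the first value is expected
# to be True and the second False. This is an implementation detail and
# not something the language guarantees.
print("compiled together:  ", together)
print("compiled separately:", separate)
```

This also matches the observation that entering both statements as a single REPL entry (separated by `;`) makes the assertion pass, because then they share one code object.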

You can inspect the constants for a compiled code object via co_consts.

>>> source = """
... a = 123456
... b = 234567
... c = 345678
... d = 123456
... """
>>> code = compile(source, "<>", "exec")
>>> code.co_consts
(123456, 234567, 345678, None)
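The same attribute can be used to look at two separately compiled entries side by side. In this sketch, `find_const` is a hypothetical helper, and the two `compile` calls stand in for two consecutive REPL entries.

```python
# Two entries compiled on their own, as the interactive prompt would do.
first = compile("i = 12345", "<stdin>", "single")
second = compile("assert id(i) == id(12345)", "<stdin>", "single")


def find_const(code, value):
    """Return the int object stored in code.co_consts that equals value."""
    for const in code.co_consts:
        if type(const) is int and const == value:
            return const
    raise LookupError(value)


c1 = find_const(first, 12345)
c2 = find_const(second, 12345)

print(c1 == c2)  # equal values
print(c1 is c2)  # identity: typically False on CPython, as each code object owns its constant
```

Each code object carries its own `12345` in `co_consts`, which is why the `id()` comparison fails at the prompt even though the values are equal.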
AKX
  • Now, knowing the answer, I am wondering why I didn't have the idea to try in the REPL these two lines of code as `i = 12345 ; assert id(i) == id( 12345 )` which doesn't raise an `AssertionError`. In other words your answer dissolved my confusion. Thanks for the attention and effort to give an answer. – Claudio Sep 01 '22 at 14:46