If I use the below:

a = 1000
print(id(a))

myList = [a,2000,3000,4000]
print(id(myList[0]))

# prints the same IDs

I get the same id. This makes sense to me. I can understand how the memory manager could assign the same object to these variables, because I am directly referencing a in the list.

However, if I do this instead:

a = 1000
print(id(a))

myList = [1000,2000,3000,4000]
print(id(myList[0]))

# prints the same IDs

I STILL get the same id being output for both prints. How does Python know to use the same object for these assignments? Searching for pre-existence would surely be hugely inefficient so I am presuming something more clever is going on here.

My first thought was that the integer itself is somehow used to calculate the object's address, but the behaviour also holds true for strings:

a = "car"
print(id(a))

myList = ["car",2000,3000,4000]
print(id(myList[0]))

# prints the same IDs

The behaviour does NOT, however, hold true for list elements:

a = [1,2,3]
print(id(a))

myList = [[1,2,3],2000,3000,4000]
print(id(myList[0]))

# prints different IDs

Can someone explain the behaviour I am seeing?

EDIT - I have read that for small integers between -5 and 256, the same object may be reused. However, I still see the same object being used for huge values, and even for strings:

a = 1000000000000
myList = [1000000000000,1000,2000]
print(a is myList[0])
# outputs True!

My question is: how can Python work out that it is the same object in these cases without searching for pre-existence? Let's say CPython specifically.

EDIT - I am using Python 3.8.10.

SuperHanz98
  • Does this answer your question? ["is" operator behaves unexpectedly with integers](https://stackoverflow.com/questions/306313/is-operator-behaves-unexpectedly-with-integers) – Homer512 Jan 30 '23 at 13:56
  • @Homer512 Not particularly. The same object is used here even if the number is huge, not just between -5 and 256 – SuperHanz98 Jan 30 '23 at 13:58
  • For immutable objects, it stores them on the heap and maps the variable to that heap object. If another variable is set to the same value and that value is already on the heap, it hands back the same address, and thus the variables give the same id – sahasrara62 Jan 30 '23 at 14:03
  • Can you also update your question with the python version you are using? – Abdul Niyas P M Jan 30 '23 at 14:06
  • @sahasrara62 yes, I understand that... My question is HOW? How can it know the object already exists to give it the same address/id? – SuperHanz98 Jan 30 '23 at 14:08
  • hash heap or heap + hashing – sahasrara62 Jan 30 '23 at 14:10
  • Maybe https://stackoverflow.com/q/34147515/12671057 – Kelly Bundy Jan 30 '23 at 14:11
  • Yep, I was wrong. Kelly points to the right answer. Python simply caches constants when compiling to byte code. For strings this is the [intern](https://docs.python.org/3.8/library/sys.html#sys.intern) mechanism. If you generate those dynamically, the ID will be different – Homer512 Jan 30 '23 at 14:15
  • These comments are all interesting and I appreciate the input, but my question is HOW does it achieve this? @sahasrara62 mentioned hash mapping. It would be good if someone could elaborate on this further. – SuperHanz98 Jan 30 '23 at 14:21
  • I will edit to specify CPython specifically – SuperHanz98 Jan 30 '23 at 14:22
  • Did you see the answers of the question I linked? Do you want more details than that? – Kelly Bundy Jan 30 '23 at 14:30
  • @chepner *"a list display always creates a new list object"* - [Almost always](https://tio.run/##K6gsycjPM7YoKPr/PzO3IL@oRCEls5iLC0joAbGGuoFCZp5CtKGOkY5xrLrm//8A). – Kelly Bundy Jan 30 '23 at 14:35
  • "How can Python work out that it is the same object in these cases without searching for pre-existence?" It **does** search for pre-existence. Python has a compile phase, translating the text to bytecode. It deduplicates strings, ints, and other compile time constants while doing so. And yes, it will almost certainly use a hash map for that, why would it use anything else? – Homer512 Jan 30 '23 at 21:21
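The compile-time deduplication described in the last comment can be observed directly. This is only a rough sketch, assuming CPython; `source`, `namespace`, and the other names are made up for the illustration. It compiles two assignments together, inspects the constants the compiler kept in `co_consts`, and compares against an equal integer built at run time.

```python
# Both statements are compiled together, as they would be in one script file.
source = "a = 1000000000000\nmy_list = [1000000000000, 2000]\n"
code = compile(source, "<demo>", "exec")

# The compiler keeps one entry per distinct constant in co_consts.
big_constants = [c for c in code.co_consts if type(c) is int and c == 10**12]

namespace = {}
exec(code, namespace)
# Both literals were loaded from the same co_consts entry.
same_unit = namespace["a"] is namespace["my_list"][0]

# An equal value built at run time never went through that constant table.
runtime_value = int("1000000000000")
same_as_runtime = namespace["a"] is runtime_value

print(len(big_constants), same_unit, same_as_runtime)
```

On CPython this should print `1 True False`: one shared constant for the two literals, and a separate object for the value parsed at run time. Other implementations are free to behave differently.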

1 Answer

In CPython, immutable values such as integers and strings that appear as literals in the same piece of code can be stored only once in memory, which saves space and speeds up the program. (A value produced at run time by an operator usually gets a new object instead.) For strings this reuse is called "interning"; the same idea applies to numeric constants when the source is compiled to bytecode. So when you write the same literal several times, every occurrence can refer to the same object and therefore report the same id. Lists and other mutable types are not shared this way: a list display such as [1,2,3] normally creates a new object each time it is evaluated, giving it a different id.
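As a small illustration of interning, here is a sketch using `sys.intern` from the standard library (the variable names are just for the example). Two equal strings are built at run time, so they start out as separate objects; interning both maps them to one shared object.

```python
import sys

# Two equal strings assembled at run time, so neither comes from a literal.
first = "".join(["c", "ar"])
second = "".join(["ca", "r"])

equal_values = first == second            # same contents
distinct_objects = first is not second    # but two separate objects

# sys.intern returns the one canonical object for a given string value.
shared = sys.intern(first) is sys.intern(second)

print(equal_values, distinct_objects, shared)
```

The lookup behind `sys.intern` is a dictionary keyed on the string's value, which is the kind of "search for pre-existence" the question asks about. It is a hash lookup rather than a scan over every live object.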

  • @KellyBundy When you change the value of a variable using an operation, such as y = -x, a new object is created to store the new value, even if it's the same as the original value. This new object will have a different memory address (id) from the original variable, so x and y have different ids even though their values are the same. – Alireza Saffarian Jan 30 '23 at 14:06
  • @KellyBundy This is because the operation `--x` creates a new object in memory, so even though it has the same value as `x`, it is stored in a different memory location and has a different id. In Python, when an operator is applied to an object, a new instance of that object may be created, resulting in a different id. – Alireza Saffarian Jan 30 '23 at 14:23
  • As I said, using operators ALWAYS makes a new object, although they have the same value. I changed my answer with more clarification. Thanks. – Alireza Saffarian Jan 30 '23 at 14:32
  • Interning is not universal or defined by Python. It's something that *CPython* (and other implementations) *can* do. – chepner Jan 30 '23 at 14:48
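To put the comments about operators and the -5 to 256 range side by side, here is a hedged sketch of what CPython tends to do with integers computed at run time. This is an implementation detail, not something the language guarantees, and the names are invented for the example.

```python
# Small result: CPython hands out a preallocated object for ints in -5..256,
# even when the value comes out of an arithmetic operation.
small_a = int("256")
small_b = int("255") + 1
small_shared = small_a is small_b

# Large result: each computation allocates a fresh int object.
large_a = int("1000000000000")
large_b = int("999999999999") + 1
large_shared = large_a is large_b

print(small_shared, large_shared)
```

On CPython I would expect `True False` here. So an operator does not always produce a new object, and large equal values are only shared when they come from literals compiled together.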