Weird behaviour of id function in cpython

Question

I did the following:

>>> a=10
>>> id(a)
31817408L
>>>
>>> id(10)
31817408L

So we can see that id(a) equals id(10).

Now I do:

>>> a = 'what is this'
>>> id(a)
35412416L
>>> id('what is this')
31951968L

Why is id(a) not equal to id('what is this') in this case? What is actually happening behind the scenes?

Other related question: [Types for which “is” keyword may be equivalent to equality operator in Python](http://stackoverflow.com/q/3218308/364696) — ShadowRanger, Sep 07 '16 at 04:03

score 2 · Answer 1 · answered Sep 07 '16 at 03:25

Different ids mean different addresses in memory (in CPython, id() returns the object's address), so your two 'what is this' strings are truly two separate string objects, even though they hold the same value. On the other hand, Python optimizes the frequently-used integers (in CPython, the small integers from -5 through 256) so that all occurrences point to the same object in memory. That sharing is safe because the object is immutable, so you can't say 10=9. If you choose an infrequently-used integer, you can see what's going on:

>>> a=555555
>>> id(a)
44506456L
>>> id(555555)
44506528L
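The same effect can be reproduced in a script. This is a minimal sketch, assuming CPython 3 (the sessions above are Python 2, hence the L suffix). The values are built at run time with int() and str.join() because the compiler may merge equal literals in one file into a single constant, which would hide the effect. All variable names are made up for illustration.

```python
import sys

# Build the values at run time so equal literals cannot be merged
# into one shared constant by the compiler.
small_a, small_b = int("10"), int("10")
large_a, large_b = int("555555"), int("555555")

# CPython-specific: 10 lies in the preallocated small-integer range.
small_shared = small_a is small_b
# CPython-specific: 555555 lies outside it, so each call makes a new object.
large_shared = large_a is large_b

# Strings assembled at run time are separate objects too ...
text_a = " ".join(["what", "is", "this"])
text_b = " ".join(["what", "is", "this"])
text_shared = text_a is text_b
# ... unless they are interned explicitly with sys.intern().
interned_shared = sys.intern(text_a) is sys.intern(text_b)

print(small_shared, large_shared, text_shared, interned_shared)
```

On CPython this is expected to print True False False True. None of it is guaranteed by the language, so compare values with == and keep `is` for identity checks such as `x is None`.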

D-Von

416
2
5

Please also explain this in your answer: https://paste.fedoraproject.org/423197/ – Nehal J Wani Sep 07 '16 at 03:33
@NehalJWani, I don't follow you. That's exactly the same phenomenon, but with frequently-used integer 123 instead of frequently-used integer 10. – D-Von Sep 07 '16 at 03:34
I am using strings in the paste (not numbers). In the first case, the id is the same for `a` and `b`, but in the second case, it is not. – Nehal J Wani Sep 07 '16 at 03:37
So my question is, why does Python think that `'123'` is a frequently used string, but `'what is this'` is not? – Nehal J Wani Sep 07 '16 at 03:39
One more query: if I do a=555555 and b=555555, I was expecting both to point to the memory address of 555555, thereby increasing the refcount for 555555 by 1, so the total refcount for 555555 would be 2. Is that correct? But then how come the memory addresses in the above examples are different? – fsociety Sep 07 '16 at 03:43
@NehalJWani: Sorry, I missed the fact that you were using strings. I don't know what Python's criteria are for deciding which strings (or integers) are frequently used. You could look it up, or you could experiment and see. But it doesn't really matter. As long as the objects in question are immutable, Python is entitled to optimize the references to them in any way it wants. It can change its scheme across versions of Python or across phases of the moon or whatever. – D-Von Sep 07 '16 at 03:48
@fsociety: As far as I know, Python has only two optimization (or lack-of-optimization) patterns: If the integer belongs to a hardwired set of frequently-used integers, Python optimizes. If not, it doesn't. Apparently, 555555 is not in that set. So I would expect that every single instance of it would be stored at a different memory address. In principle, Python could notice that 555555 was suddenly being used a lot, and add 555555 to the set, but I doubt it does that. To explore a little more, try `a=555555; b=a; id(a), id(b)`. – D-Von Sep 07 '16 at 03:52
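The experiment suggested in the last comment can be written out as follows. This is a rough sketch, assuming CPython 3; sys.getrefcount() is CPython-specific and its absolute value includes temporary references, so only the difference between the two readings is meaningful. The variable names are illustrative.

```python
import sys

a = int("555555")            # a fresh int object, built at run time
before = sys.getrefcount(a)

b = a                        # binds a second name to the same object; no copy
after = sys.getrefcount(a)

same_object = id(a) == id(b)
extra_refs = after - before  # references added by the assignment b = a

print(same_object, extra_refs)
```

Assigning `b = a` shares one object and is expected to raise its reference count by one, whereas two separately created 555555 objects each carry their own count, which is why their ids differ.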
