
Python documentation for id() function states the following:

This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.

CPython implementation detail: This is the address of the object in memory.

However, the snippet below shows ids being repeated. Since I didn't explicitly del the objects, I presume they are all alive and unique (I do not know what non-overlapping means).

>>> g = [0, 1, 0]
>>> for h in g:
...     print(h, id(h))
... 
0 10915712
1 10915744
0 10915712
>>> a=0
>>> b=1
>>> c=0
>>> d=[a, b,c]
>>> for e in d:
...     print(e, id(e))
... 
0 10915712
1 10915744
0 10915712
>>> id(a)
10915712
>>> id(b)
10915744
>>> id(c)
10915712
>>>

How can the id values for different objects be the same? Is it because the value 0 (an object of class int) is a constant and the interpreter/C compiler optimizes it?

If I were to do a = c, then I understand c to have the same id as a since c would just be a reference to a (alias). I expected the objects a and c to have different id values otherwise, but, as shown above, they have the same values.

What's happening? Or am I looking at this the wrong way?

I would expect the id's for user-defined class' objects to ALWAYS be unique even if they have the exact same member values.

Could someone explain this behavior? (I looked at the other questions that ask uses of id(), but they steer in other directions)

EDIT (09/30/2019):

To extend what I already wrote: I ran Python interpreters in separate terminals and checked the id of 0 in each of them. Multiple instances of the same interpreter reported exactly the same id for 0. Python 2 and Python 3 reported different values from each other, but every instance of the same Python 2 interpreter reported the same value.

I am asking because the documentation for id() doesn't mention any such optimizations, which seems misleading (I don't expect every quirk to be noted, but some note alongside the CPython note would be nice)...

EDIT 2 (09/30/2019):

The question stems from wanting to understand this behavior and to know whether there are any hooks to optimize user-defined classes in a similar way (by modifying the __eq__ method to identify whether two objects are the same; perhaps they would then point to the same address in memory, i.e. have the same id? Or by using some metaclass properties).

rite2hhh
  • 1
    Some basic objects like the integers from (if I remember correctly) -1 to 100 are created once at startup and reused where they are assigned. This means that you use _the same object_ in multiple places. – Michael Butscher Oct 01 '19 at 03:39
  • 1
    "I would expect the `id`'s for user-defined class' objects to **ALWAYS** be unique even if they have the exact same member values." You're using numbers. You can't get any further from user-defined classes than numbers. – Joseph Sible-Reinstate Monica Oct 01 '19 at 03:41
  • Possible duplicate of [What is the id( ) function used for?](https://stackoverflow.com/questions/15667189/what-is-the-id-function-used-for) – lmiguelvargasf Oct 01 '19 at 03:41
  • 2
    Possible duplicate of ["is" operator behaves unexpectedly with integers](https://stackoverflow.com/questions/306313/is-operator-behaves-unexpectedly-with-integers) – zvone Oct 01 '19 at 03:43
  • I'm curious if there's a fuller answer somewhere, e.g. that mentions pypy quirks. – o11c Oct 01 '19 at 03:51
  • 1
    @MichaelButscher I think it was -5 to 256. But (as an implementation detail) this can change without notice. – gilch Oct 01 '19 at 03:52
  • Compiler optimisations are impressive. For example `"foobar" is "foo" + "bar"` will also return `True`. – Selcuk Oct 01 '19 at 04:11
  • @Selcuk see https://en.wikipedia.org/wiki/Constant_folding – gilch Oct 01 '19 at 04:13
  • @gilch Thanks for the link. Also see [string interning](https://en.wikipedia.org/wiki/String_interning). – Selcuk Oct 01 '19 at 04:14
  • @Selcuk your example used both actually. Folding for the `+`, and interning for the `is`. – gilch Oct 01 '19 at 04:16
  • @lmiguelvargasf and zvone I saw those answers, but they talk more about uses; I'm interested in how/what happens. Others, thanks for the help – rite2hhh Oct 01 '19 at 04:42
  • @JosephSible, what I meant was, if I have the following definition `class MyClass: pass`, and create objects for that class, I would expect each instance to be unique – rite2hhh Oct 01 '19 at 04:56

1 Answer

Ids are guaranteed to be unique for the lifetime of the object. If an object gets deleted, a new object can acquire the same id. CPython will delete items immediately when their refcount drops to zero. The garbage collector is only needed to break up reference cycles.
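As a minimal sketch of the refcounting behavior described above (CPython-specific; `Thing` and `dies_immediately` are hypothetical names used only for illustration), a weak reference can show that an object is gone as soon as its last strong reference is dropped:

```python
import weakref


class Thing:
    """Hypothetical placeholder class; a bare object() cannot be weakly referenced."""


def dies_immediately():
    obj = Thing()
    ref = weakref.ref(obj)        # a weak reference does not keep obj alive
    alive_before = ref() is obj   # True while a strong reference exists
    del obj                       # last strong reference gone; CPython frees it here
    alive_after = ref() is not None
    return alive_before, alive_after


print(dies_immediately())  # (True, False) on CPython
```

Once the object has been freed, its address is available again, so a later object may report the same id. Whether that actually happens depends on the allocator, so the sketch does not try to show it.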

CPython may also cache and re-use certain immutable objects like small integers and strings defined by literals that are valid identifiers. This is an implementation detail that you should not rely upon. It is generally considered improper to use is checks on such objects.

There are certain exceptions to this rule, for example, using an is check on possibly-interned strings as an optimization before comparing them with the normal == operator is fine. The dict builtin uses this strategy for lookups to make them faster for identifiers.

a is b or a == b  # This is OK

If the string happens to be interned, then the above can return true with a simple id comparison instead of a slower character-by-character comparison, but it still returns true if and only if a == b (because if a is b then a == b must also be true). However, a good implementation of .__eq__() would already do an is check internally, so at best you would only avoid the overhead of calling .__eq__().
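To make the identity-then-equality shortcut concrete, here is a small sketch. The helper name `same_text` is made up for illustration; `sys.intern` is the standard-library way to request the canonical copy of a string.

```python
import sys


def same_text(a, b):
    # Identity first (cheap), then fall back to value equality.
    return a is b or a == b


# Built at runtime, so these are not constants the compiler could merge.
a = "".join(["foo", "bar"])
b = "".join(["foo", "bar"])

print(a == b)            # True: same characters
print(same_text(a, b))   # True whether or not a and b are the same object

# sys.intern returns the single canonical copy for a given string value.
ia, ib = sys.intern(a), sys.intern(b)
print(ia is ib)          # True
```

The result of `same_text` never depends on whether the strings happen to be interned; interning only changes which branch produces the answer.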


Thanks for the answer, would you elaborate around the uniqueness for user-defined objects, are they always unique?

The id of any object (be it user-defined or not) is unique for the lifetime of the object. It's important to distinguish objects from variables. It's possible to have two or more variables refer to the same object.

>>> a = object()
>>> b = a
>>> c = object()
>>> a is b
True
>>> a is c
False

Caching optimizations mean that you are not always guaranteed to get a new object in cases where one might naively think one should, but this does not in any way violate the uniqueness guarantee of IDs. Builtin types like int and str may have some caching optimizations, but they follow exactly the same rules: If they are live at the same time, and their IDs are the same, then they are the same object.
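A short sketch of that rule, using a hypothetical `Point` class: two live instances with identical member values compare equal but have different ids, while two names bound to the same object share one id. The int lines at the end rely on CPython's small-integer cache, which is an implementation detail.

```python
class Point:
    """Hypothetical user-defined class, for illustration only."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
q = Point(1, 2)   # same member values, separate object
r = p             # a second name for the same object

print(p == q, p is q, id(p) == id(q))   # True False False
print(p is r, id(p) == id(r))           # True True

# CPython caches small ints, so both names end up bound to one shared
# object, which is why the ids in the question match.
a = int("0")
c = int("0")
print(a is c)   # True on CPython; not guaranteed by the language
```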

Caching is not unique to builtin types. You can implement caching for your own objects.

>>> def the_one(it=object()):
...     return it
...
>>> the_one() is the_one()
True

Even user-defined classes can cache instances. For example, this class only makes one instance of itself.

>>> class TheOne:
...     _the_one = None
...     def __new__(cls):
...         if cls._the_one is None:
...             cls._the_one = super().__new__(cls)
...         return cls._the_one
...
>>> TheOne() is TheOne()  # There can be only one TheOne.
True
>>> id(TheOne()) == id(TheOne())  # This is what an is-check does.
True

Note that each construction expression evaluates to an object with the same id as the other. But this id is unique to the object. Both expressions reference the same object, so of course they have the same id.

The above class only keeps one instance, but you could also cache some other number. Perhaps recently used instances, or those configured in a way you expect to be common (as ints do), etc.
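As one possible sketch of caching more than one instance, the hypothetical `Color` class below keeps one instance per name in a class-level dict. This is an illustration of the idea under those assumptions, not a recommended pattern for every case.

```python
class Color:
    """Hypothetical class that keeps one instance per name."""

    _cache = {}

    def __new__(cls, name):
        try:
            return cls._cache[name]      # reuse the existing instance
        except KeyError:
            self = super().__new__(cls)
            self.name = name             # set state here, not in __init__
            cls._cache[name] = self
            return self


print(Color("red") is Color("red"))    # True
print(Color("red") is Color("blue"))   # False
```

State is set in `__new__` because an `__init__` would run again on every call, including calls that return a cached instance. The plain dict holds strong references, so cached instances are never freed; `weakref.WeakValueDictionary` is an alternative if that matters.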

gilch