10

In the following code, why does useless_func have the same id when it belongs to two different objects?

class parent(object):
    @classmethod
    def a_class_method(cls):
        print "in class method %s" % cls

    @staticmethod
    def a_static_method():
        print "static method"

    def useless_func(self):
        pass

p1, p2 = parent(), parent()

id(p1) == id(p2)  # False

id(p1.useless_func) == id(p2.useless_func)  # True
smci
iamkhush

2 Answers

13

This is a very interesting question!

Under your conditions, they do appear the same:

Python 2.7.2 (default, Oct 11 2012, 20:14:37) 
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class Foo(object):
...   def method(self): pass
... 
>>> a, b = Foo(), Foo()
>>> a.method == b.method
False
>>> id(a.method), id(b.method)
(4547151904, 4547151904)

However, notice that once you do anything with them, they become different:

>>> a_m = a.method
>>> b_m = b.method
>>> id(a_m), id(b_m)
(4547151904, 4547151504)

And then, when tested again, they have changed again!

>>> id(b.method)
4547304416
>>> id(a.method)
4547304416

When a method on an instance is accessed, an instance of "bound method" is returned. A bound method stores a reference to both the instance and to the method's function object:

>>> a_m
<bound method Foo.method of <__main__.Foo object at 0x10f0e9a90>>
>>> a_m.im_func is Foo.__dict__['method']
True
>>> a_m.im_self is a
True

(note that I need to use Foo.__dict__['method'], not Foo.method, because Foo.method will yield an "unbound method"… the purpose of which is left as an exercise to the reader)

The purpose of this "bound method" object is to make methods "behave sensibly" when they are passed around like functions. For example, when I call function a_m(), that is identical to calling a.method(), even though we don't have an explicit reference to a any more. Contrast this behaviour with JavaScript (for example), where var method = foo.method; method() does not produce the same result as foo.method().
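A minimal sketch of that behaviour, written in Python 3 syntax (an assumption: the session above is Python 2, where the attributes are im_self and im_func; Python 3 calls them __self__ and __func__). The Greeter class and the call_later helper are made up for illustration.

```python
class Greeter(object):
    def __init__(self, name):
        self.name = name

    def greet(self):
        return "hello from %s" % self.name


def call_later(callback):
    # Hypothetical helper: it receives only the callable, never the instance.
    return callback()


g = Greeter("a")
bound = g.greet  # bound method: remembers both g and the greet function

assert bound.__self__ is g                          # the instance
assert bound.__func__ is Greeter.__dict__["greet"]  # the plain function

result = call_later(bound)
print(result)
```

Because the bound method carries its instance with it, call_later(bound) gives the same result as calling g.greet() directly.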

SO! This brings us back to the initial question: why does it seem that id(a.method) yields the same value as id(b.method)? I believe that Asad is correct: it has to do with Python's reference-counting garbage collector*: when the expression id(a.method) is evaluated, a bound method is allocated, the ID is computed, and the bound method is deallocated. When the next bound method — for b.method — is allocated, it is allocated to exactly the same location in memory, since there haven't been any (or have been a balanced number of) allocations since the bound method for a.method was allocated. This means that a.method appears to have the same memory location as b.method.
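The allocate, measure, free cycle described above can be sketched as follows (Python 3 syntax; Foo mirrors the class in the session). id() is only guaranteed to be unique among objects that are alive at the same time, so the only portable check is the one made while both bound methods are held. Whether the two temporaries share an id is a CPython implementation detail, so it is printed rather than asserted.

```python
class Foo(object):
    def method(self):
        pass


a, b = Foo(), Foo()

# Each bound method here is a temporary: created, passed to id(), then freed.
# On CPython the second one often reuses the memory of the first.
temporaries_match = id(a.method) == id(b.method)
print("temporaries share an id:", temporaries_match)

# Holding references keeps both objects alive at once, so they cannot
# occupy the same address and their ids must differ.
a_m, b_m = a.method, b.method
held_ids_differ = id(a_m) != id(b_m)
print("held bound methods have different ids:", held_ids_differ)
```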

Finally, this explains why the memory locations appear to change the second time they are checked: the other allocations which have taken place between the first and the second check mean that, the second time, they are allocated at a different location (note: a new bound method is allocated on every access, because CPython does not cache bound methods†. Accessing the same method twice returns two distinct but equal objects: a_m0 = a.method; a_m1 = a.method; a_m0 is a_m1 => False, while a_m0 == a_m1 => True).

*: pedants note: actually, this has nothing to do with the actual garbage collector, which only exists to deal with circular references… but… that's a story for another day.
†: this is CPython behaviour, seen in both 2.6 and 2.7; the language presumably does not specify whether repeated accesses return the same object.
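Whether two accesses hand back the same bound method object is easy to check directly. This is a small sketch in Python 3 syntax, assuming CPython (Foo mirrors the class above). Both results are kept alive in variables, so comparing them with `is` is meaningful, unlike comparing the ids of temporaries.

```python
class Foo(object):
    def method(self):
        pass


a = Foo()
a_m0 = a.method
a_m1 = a.method

same_object = a_m0 is a_m1  # identity: is it literally one object?
equal = a_m0 == a_m1        # equality: same instance and same function
print("same object:", same_object)
print("equal:", equal)
```

The identity result is an implementation detail, so it is worth running this on the interpreter you actually care about.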

David Wolever
  • I believe in your second example, you've got two different references, but a.method and b.method still have the same id. – Hamish May 24 '13 at 04:19
  • @Hamish That's not how id works: `a = []; b = a; id(a) == id(b)  # True` – Patashu May 24 '13 at 04:20
  • You were talking about bound methods earlier — so why not mention them now, since that’s what half of this is? – Ry- May 24 '13 at 04:22
  • @David Wolever Can you please explain why the stuff in your answer happens? I think it is because that assigning a method to a variable name actually binds and creates a copy of the method at a new location, so that when called it will know what object it belonged to. Is that correct? – Patashu May 24 '13 at 04:27
  • Ah, I see you beat me to the idea of checking the same methods after scrambling up the memory a bit. – Asad Saeeduddin May 24 '13 at 04:31
  • Ok! I think that's a pretty decent explanation. – David Wolever May 24 '13 at 04:39
  • `lst = [1,2,3]; len({id(lst.append) for _ in range(1000000)})` actually yields different results for different runs (usually between 1 and 3)... I should look at the source to see exactly how it's handled... – root May 24 '13 at 04:57
  • @root it is consistently 1 for me (py 2.7)… Which version of Python are you using? And can you recreate that when running from a script as opposed to running from the interactive prompt? As for the nondeterminism: my first guess is that would be caused by the garbage collector, which runs (IIRC) after every N function calls. – David Wolever May 24 '13 at 05:31
  • This only appears to happen in the interactive interpreter (where I tested it at first)... (running Python 2.7 on Ubuntu). As for the diagnosis, you are probably right; it would be nice to take a look at what magic is going on under the hood (it seems to be doing something more advanced than simply counting the function calls, though) :) – root May 24 '13 at 05:46
  • Actually, it only seems to be inconsistent when running on the IPython console... (which I think is making some other calls behind the scenes?) – root May 24 '13 at 06:00
  • @DavidWolever "Bound methods are cached, so accessing the same method twice will return the same instance `a_m0 = a.method; a_m1 = a.method; a_m0 is a_m1 => True`." This is not necessarily true. I am on 2.6.5 (r265:79063, Apr 16 2010, 13:57:41) [GCC 4.4.3]. – Ankur Agarwal Aug 14 '13 at 05:42
9

Here is what I think is happening:

  1. When you dereference p1.useless_func, a new bound method object (a "copy" of the method) is created in memory. id returns this memory location.
  2. Since there are no references to the object just created, it gets reclaimed by the GC (reference counting, in CPython), and the memory address is available again.
  3. When you dereference p2.useless_func, another bound method object is created at the same memory address (it was available), which you retrieve using id again.
  4. The second object is reclaimed as well.
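Steps 2 and 4 can be observed with a weak reference, which does not keep its target alive. This is a sketch in Python 3 syntax and relies on CPython's reference counting; on Jython or PyPy the temporary may survive until a later collection. The class mirrors the one in the question.

```python
import weakref


class parent(object):
    def useless_func(self):
        pass


p1 = parent()

# The bound method is created, wrapped in a weak reference, and then
# dropped because nothing else refers to it.
temp_ref = weakref.ref(p1.useless_func)
temp_reclaimed = temp_ref() is None
print("temporary already reclaimed:", temp_reclaimed)

# A strong reference keeps the bound method alive.
kept = p1.useless_func
kept_ref = weakref.ref(kept)
kept_alive = kept_ref() is kept
print("held bound method still alive:", kept_alive)
```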

If you were to run a bunch of other code and check the ids of the instance methods again, I'll bet the ids would be identical to each other, but different from the original run.

Additionally, you might notice that in David Wolever's example, as soon as a lasting reference to the method copy is obtained, the ids become different.

To confirm this theory, here is a shell session using Jython (same result with PyPy), which does not utilize CPython's reference counting garbage collection:

Jython 2.5.2 (Debian:hg/91332231a448, Jun 3 2012, 09:02:34) 
[OpenJDK Server VM (Oracle Corporation)] on java1.7.0_21
Type "help", "copyright", "credits" or "license" for more information.
>>> class parent(object):
...     def m(self):
...             pass
... 
>>> p1, p2 = parent(), parent()
>>> id(p1.m) == id(p2.m)
False
jamylak
Asad Saeeduddin