
I'm working in Python 2.7 and I found an issue that is puzzling me.

Here is the simplest example:

>>> class A(object):
...     def __del__(self):
...         print("DEL")
...     def a(self):
...         pass

>>> a = A()
>>> del a
DEL

That works as expected. Now I'm trying to change the a() method of object a, and what happens is that after changing it I can't delete a any more:

>>> a = A()
>>> a.a = a.a
>>> del a

Just to check, I printed the a.a reference before and after the assignment:

>>> a = A()
>>> print a.a
<bound method A.a of <__main__.A object at 0xe86110>>
>>> a.a = a.a
>>> print a.a
<bound method A.a of <__main__.A object at 0xe86110>>

Finally, I used the objgraph module to try to understand why the object is not released:

>>> b = A()
>>> import objgraph
>>> objgraph.show_backrefs([b], filename='pre-backref-graph.png')

pre-backref-graph.png

>>> b.a = b.a
>>> objgraph.show_backrefs([b], filename='post-backref-graph.png')

post-backref-graph.png

As you can see in the post-backref-graph.png image, there is a __self__ reference pointing to b that makes no sense to me, because the self reference of an instance method should be ignored (as it was before the assignment).

Can somebody explain this behaviour and how I can work around it?

simonzack
Michele d'Amico

3 Answers


When you write a.a, it effectively runs:

A.a.__get__(a, A)

because you are not accessing a pre-bound method but the class' method that is being bound at runtime.
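To make the runtime binding visible, here is a small sketch. The class is a stripped-down stand-in for the one in the question, and the names func, manual, first and second are only illustrative. It looks up the plain function in the class __dict__, binds it by hand, and checks that every attribute access produces a fresh bound method object.

```python
class A(object):
    def a(self):
        return "called"

obj = A()

# The plain function lives in the class __dict__; the instance stores nothing.
func = A.__dict__["a"]
instance_is_empty = "a" not in vars(obj)

# Binding by hand through the descriptor protocol is equivalent to obj.a.
manual = func.__get__(obj, A)
manual_result = manual()

# Each attribute access builds a new bound method object.
first = obj.a
second = obj.a
distinct_objects = first is not second   # two separate objects...
equal_methods = first == second          # ...that compare equal
```

Both bound methods wrap the same function and the same instance, which is why they compare equal even though they are different objects.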

When you do

a.a = a.a

you effectively "cache" the act of binding the method. As the bound method has a reference to the object (obviously, as it has to pass self to the function) this creates a circular reference.
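A rough way to see that cycle, using a stripped-down class without __del__ (the names obj and stored are only for illustration): follow the references from the instance to its __dict__, to the cached bound method, and back to the instance.

```python
class A(object):
    def a(self):
        pass

obj = A()
empty_before = vars(obj) == {}      # nothing is stored on the instance yet

obj.a = obj.a                       # cache the bound method on the instance

stored = vars(obj)["a"]             # instance -> __dict__ -> bound method
points_back = stored.__self__ is obj            # bound method -> instance
same_function = stored.__func__ is A.__dict__["a"]
lookup_is_cached = obj.a is stored  # lookups now return the cached object
```

Before the assignment nothing on the instance refers to a bound method, so no cycle exists. Afterwards the instance attribute shadows the function on the class.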


So I'm modelling your problem like:

class A(object):
    def __del__(self):
        print("DEL")
    def a(self):
        pass

def log_all_calls(function):
    def inner(*args, **kwargs):
        print("Calling {}".format(function))

        try:
            return function(*args, **kwargs)
        finally:
            print("Called {}".format(function))

    return inner

a = A()
a.a = log_all_calls(a.a)

a.a()

You can use weak references to bind on demand inside log_all_calls like:

import weakref

class A(object):
    def __del__(self):
        print("DEL")
    def a(self):
        pass

def log_all_calls_weakmethod(method):
    cls = method.im_class
    func = method.im_func
    instance_ref = weakref.ref(method.im_self)
    del method

    def inner(*args, **kwargs):
        instance = instance_ref()

        if instance is None:
            raise ValueError("Cannot call weak decorator with dead instance")

        function = func.__get__(instance, cls)

        print("Calling {}".format(function))

        try:
            return function(*args, **kwargs)
        finally:
            print("Called {}".format(function))

    return inner

a = A()
a.a = log_all_calls_weakmethod(a.a)

a.a()

This is really ugly, so I would rather extract it out to make a weakmethod decorator:

import weakref

def weakmethod(method):
    cls = method.im_class
    func = method.im_func
    instance_ref = weakref.ref(method.im_self)
    del method

    def inner(*args, **kwargs):
        instance = instance_ref()

        if instance is None:
            raise ValueError("Cannot call weak method with dead instance")

        return func.__get__(instance, cls)(*args, **kwargs)

    return inner

class A(object):
    def __del__(self):
        print("DEL")
    def a(self):
        pass

def log_all_calls(function):
    def inner(*args, **kwargs):
        print("Calling {}".format(function))

        try:
            return function(*args, **kwargs)
        finally:
            print("Called {}".format(function))

    return inner

a = A()
a.a = log_all_calls(weakmethod(a.a))

a.a()

Done!


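FWIW, Python 3.4 and later don't suffer from this leak (they can garbage-collect reference cycles that contain objects with finalizers, see PEP 442), and they also ship weakref.WeakMethod pre-built for you.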
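A minimal sketch of weakref.WeakMethod, assuming Python 3.4 or later and CPython's reference counting (so the instance is freed as soon as the last strong reference goes away). The class and the variable names are only illustrative.

```python
import weakref

class A(object):
    def a(self):
        return "called"

obj = A()
weak_a = weakref.WeakMethod(obj.a)  # keeps neither the method nor obj alive

bound = weak_a()                    # re-creates the bound method on demand
result = bound()
del bound                           # drop the temporary strong reference

del obj                             # last strong reference to the instance
dead = weak_a()                     # None once the instance is gone
```

Because only a weak reference is stored, caching weak_a on the instance would not create a strong cycle.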

Veedrac
  • OK... Is there a way to avoid that? I need to cache some methods and restore them later: is that possible? – Michele d'Amico Oct 02 '14 at 09:49
  • It depends what you're trying to do. – Veedrac Oct 02 '14 at 09:49
  • OK I found the solution: a.a = types.MethodType(A.a,a,A) – Michele d'Amico Oct 02 '14 at 09:54
  • I need to cache a method while I'm testing the class and then recover it when the test is done. – Michele d'Amico Oct 02 '14 at 11:02
  • I don't understand what you mean. – Veedrac Oct 02 '14 at 11:04
  • When I'm writing unit tests, I sometimes need to trace calls to some methods. To do that I use a functor() that decorates the function and logs every call. But instead of applying the functor as a decorator in the class, I use it on the object as a function: `a.a = functor(a.a)`. When the test is done I would like to replace the functor with the original method. In other words, I would like to recover the original a object and not the decorated one. – Michele d'Amico Oct 02 '14 at 11:36
  • Can't you just assign to the class' method (`A.a = f(A.a)`)? – Veedrac Oct 02 '14 at 11:38
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/62331/discussion-between-michele-damico-and-veedrac). – Michele d'Amico Oct 02 '14 at 11:41
  • Late to the party here, but I found this post very helpful and I have one question for @Veedrac, who i hope is still around to answer it... In what sense does "Python 3.4 not have these issues"? I tried the a.a = a.a thing on my system (python 3.7.3) and I do observe that it creates a reference cycle. – Michael Carilli Oct 15 '19 at 17:03
  • @MichaelCarilli Python 3.4 and upwards can garbage collect circular references containing objects with finalizers. https://www.python.org/dev/peps/pep-0442/. It's still a reference cycle, but it doesn't cause a leak. – Veedrac Oct 15 '19 at 17:34
  • @Veedrac Thank you for the quick response! – Michael Carilli Oct 16 '19 at 16:26
  • @Veedrac I have one more question, since this is what I really need to know for my application: Post 3.4, will such objects with reference cycles be automatically garbage collected, or must one call gc.collect() manually? – Michael Carilli Oct 16 '19 at 16:43
  • @MichaelCarilli Automatically, at some unspecified future time. – Veedrac Oct 16 '19 at 23:07

Veedrac's answer about the bound method keeping a reference to the instance is only part of the answer. CPython's garbage collector knows how to detect and handle cyclic references - except when some object that's part of the cycle has a __del__ method, as mentioned here https://docs.python.org/2/library/gc.html#gc.garbage :

Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily in the cycle but reachable only from it. Python doesn’t collect such cycles automatically because, in general, it isn’t possible for Python to guess a safe order in which to run the __del__() methods. (...) It’s generally better to avoid the issue by not creating cycles containing objects with __del__() methods, and garbage can be examined in that case to verify that no such cycles are being created.

IOW: remove your __del__ method and you should be fine.
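As a hedged illustration of that point, the sketch below uses a class without __del__ and a weak reference (the name probe is hypothetical) to watch the instance. Automatic collection is switched off only to make the demo deterministic. On CPython, reference counting alone leaves the cycle alive, and an explicit gc.collect() then reclaims it.

```python
import gc
import weakref

class A(object):
    def a(self):
        pass

gc.disable()                  # keep automatic collection out of the demo
try:
    obj = A()
    probe = weakref.ref(obj)  # lets us observe whether the instance is alive
    obj.a = obj.a             # create the instance <-> bound method cycle
    del obj
    alive_before = probe() is not None  # refcounting cannot free the cycle
    gc.collect()
    alive_after = probe() is not None   # the cycle collector reclaims it
finally:
    gc.enable()
```

With a __del__ method on the class, Python 2 would instead leave the objects in gc.garbage, as the quoted documentation describes.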

EDIT: regarding your comment:

I use it on the object as function a.a = functor(a.a). When the test is done I would like replace the functor by the original method.

Then the solution is plain and simple:

a = A()
a.a = functor(a.a)
test(a)
del a.a

Until you explicitly bind it, a has no 'a' instance attribute, so the name is looked up on the class and a new method instance is returned (cf https://wiki.python.org/moin/FromFunctionToMethod for more on this). This method instance is then called, and (usually) discarded.
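If you want the cleanup to happen even when the test raises, one possible way to package the assign/test/delete sequence is a small context manager. This is only a sketch: patched_method and the functor shown here are hypothetical helpers, not part of any library, and it assumes the name is not already an instance attribute before patching.

```python
from contextlib import contextmanager

@contextmanager
def patched_method(obj, name, wrapper):
    """Temporarily shadow obj.<name> with wrapper(obj.<name>)."""
    setattr(obj, name, wrapper(getattr(obj, name)))
    try:
        yield obj
    finally:
        # Deleting the instance attribute breaks the cycle and lets
        # lookups fall back to the function stored on the class.
        delattr(obj, name)

class A(object):
    def a(self):
        return "original"

calls = []

def functor(method):
    # Hypothetical tracing wrapper: records each call, then delegates.
    def inner(*args, **kwargs):
        calls.append(method.__name__)
        return method(*args, **kwargs)
    return inner

obj = A()
with patched_method(obj, "a", functor) as patched:
    inside = patched.a()      # goes through the tracing wrapper
outside = obj.a()             # plain class lookup again, not traced
```

After the with block the instance __dict__ no longer holds the wrapper, so nothing on the instance refers back to it.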

bruno desthuilliers
  • That's not a solution... I need the __del__ method (the reasons are out of scope here), and this issue is the only one I don't know how to solve; for the others, weak references work well. – Michele d'Amico Oct 02 '14 at 10:51
  • Well, it actually answers your question "Somebody can explain why that behaviour and how can I work around it?" - you didn't mention you needed `__del__` and your snippet doesn't imply it's of any real use ;) Now if you look at the link I pointed you to, there's a bit more on the topic... – bruno desthuilliers Oct 02 '14 at 11:03
  • As written in the link you posted, the best way is not to create cycles, and what I asked was "I don't understand why that cycle arises and whether there is a way to avoid creating that kind of cycle"... but removing __del__ doesn't remove the cycle, just the cycle's side effects :) – Michele d'Amico Oct 02 '14 at 11:30
  • Uhu, right - but the cycle is not a problem in itself, only when it cannot be garbage-collected - and _this_ comes from having a `__del__` method. Without that method you don't have to care much about avoiding cyclic references... But anyway: I mostly posted this because the first answer was incomplete IMHO. – bruno desthuilliers Oct 02 '14 at 12:04
  • Cycles can be a problem from a performance point of view (as Alex Martelli pointed out in [that post](http://stackoverflow.com/questions/1507566/how-and-when-to-appropriately-use-weakref-in-python)). Weak references can be used to improve performance even when there is no __del__ override. If you use weakref or have no cycles, you don't need the garbage collector to free the resources. – Michele d'Amico Oct 02 '14 at 12:39
  • OK... that's good... but I would like to recover the original `a.a` method, use it, and not have any memory leak problems. That is the issue: assigning a method to an object creates a circular reference that you must break with something like `del a.a` or `a.a=None`. – Michele d'Amico Oct 02 '14 at 13:03
  • There's _no_ "original `a.a` method" cf https://wiki.python.org/moin/FromFunctionToMethod – bruno desthuilliers Oct 02 '14 at 14:00

As to why Python does this: technically, all objects contain circular references if they have methods. However, garbage collection would take much longer if the garbage collector had to do explicit checks on an object's methods to make sure freeing the object wouldn't cause a problem. As such, Python stores the methods separately from an object's __dict__. So when you write a.a = a.a, you are shadowing the method with itself in the a field on the object. And thus there is an explicit reference to the method, which prevents the object from being freed properly.

The solution to your problem is to not bother keeping a "cache" of the original method and just delete the shadowing attribute when you're done with it. This will unshadow the method and make it available again.

>>> class A(object):
...     def __del__(self):
...         print("del")
...     def method(self):
...         print("method")
>>> a = A()
>>> vars(a)
{}
>>> "method" in dir(a)
True
>>> a.method = a.method
>>> vars(a)
{'method': <bound method A.method of <__main__.A object at 0x0000000001F07940>>}
>>> "method" in dir(a)
True
>>> a.method()
method
>>> del a.method
>>> vars(a)
{}
>>> "method" in dir(a)
True
>>> a.method()
method
>>> del a
del

Here vars shows what's in the __dict__ attribute of an object. Note how __dict__ doesn't contain a reference to itself even though a.__dict__ is valid. dir produces a list of all the attributes reachable from the given object. Here we can see all the attributes and methods of an object, plus all the methods and attributes of its class and that class's bases. This shows that the bound method of a is stored in a place separate from where a's attributes are stored.

Dunes
  • Thanks for your answer, but the real question isn't "why was the object not destroyed?" but why `a.a=a.a` creates a circular reference. If you remove the `__del__` override and plot the graphs before and after the assignment, you will find that the graphs are different and the second one has a circular reference. – Michele d'Amico Oct 03 '14 at 13:50
  • Misunderstood the question. I've completely changed my answer, with a new recommendation and learnt something myself in the process. – Dunes Oct 03 '14 at 16:13
  • THX: that is exactly what I meant by "Why does setting a bound method on a Python object create a circular reference?". Veedrac gave me a way to work around some issues related to that, but with your answer I really understand. – Michele d'Amico Oct 03 '14 at 20:36
  • @Dunes: "As such Python stores the methods separately from an object's __dict__ (...) This shows that the bound method of a is stored in place separate": it's even simpler than that: Python _does not_ store methods. At all. What it stores is the function (in the class's `__dict__`); how the (bound or unbound) method object is created _at lookup time_ is explained here: https://wiki.python.org/moin/FromFunctionToMethod – bruno desthuilliers Oct 06 '14 at 11:26
  • @brunodesthuilliers That was very informative, thanks. I feel that I should maybe remove my answer as it seems to be a less informed duplicate of yours. – Dunes Oct 08 '14 at 10:06