3

I am trying to figure out when exactly a python object is a candidate for garbage collection. I have read through a few documents/posts and have been unable to find a definite answer.

Take for example the following line. This is the last reference to foo. When is the object pointed to by foo available for garbage collection?

ret = func(['xyz: ' + foo.name])

Breaking it down to the (possible) individual steps:

  1. temporary reference to name is created.
  2. 'xyz: ' is concatenated with name and value is returned.
  3. list is created with the new string.
  4. function is called with new array.
  5. function returns.
  6. result is assigned to ret.
  7. next instruction...

Between which two steps is the object first eligible to be collected? When is the reference count to the object decremented?

If the list of steps is incomplete/incorrect please let me know as well. I only attempted to enumerate them to give a common starting place for the potential answers to reference.

Dillon
  • 167
  • 1
  • 8
  • 1
    There are no arrays here. There is a list. – Gareth Latty Dec 03 '12 at 19:28
  • 1
    You might find this PyCon talk useful - he talks about the PyCon garbage collection model and how to debug and profile the guts of Python: http://www.youtube.com/watch?v=6jD34p8PokU – Rachel Sanders Dec 03 '12 at 20:09
  • Thank you @RachelSanders! This video looks to answer the question. If you would like to provide an answer I will mark it. Here is the link to the specific time. http://www.youtube.com/watch?feature=player_detailpage&v=6jD34p8PokU#t=1020s – Dillon Dec 03 '12 at 22:54
  • But the reference counter won't hit 0 as `foo` is not being deleted or going out of scope. – Gareth Latty Dec 03 '12 at 23:15
  • Aw sweet, I'm glad it helped. I'll put it in as an answer. – Rachel Sanders Dec 04 '12 at 00:33

2 Answers2

5

As in other garbage collected languages, the rule of thumb is: When it's unreachable. That means, is must not be gc'd as long as the program (any part of it) could still access it. After that, it's fair game. When (even whether) it is actually reclaimed is completely up to the implementation.

In your example, the name foo keeps the object alive, as the program could still use it in later statements. (Theoretically, an implementation may detect if a variable is no longer used, and remove that reference - possibly making the object unreachable sooner. Practically, this is impossible in Python, except perhaps in some traces compiled by a JIT compiler.) The list, in contrast, may become unreachable after the execution of func, or even during it, if func does not store references to it in some reachable locations (e.g. attributes of a reachable object, or a global variable).

Note that you mix up foo and the object it refers to, which is deadly when reasoning about garbage collection. There's no object foo, there's just an object which foo refers to at a given point in time (and thereby, a set of objects it refers to at distinct points in time). This matters:

  • foo may be changed to refer to another object and the object it originally referred to may become unreachable. You can't talk about such situations unless you differentiate between the two.
  • There may be many other references to the object foo points to (actually rather likely). There may keep the object alive long after foo went out of scope, was fed to del, or changed to refer to another object.
3

The variable will be eligible for garbage collection as soon as all references to it go out of scope or are manually deleted (del x).

In your example, foo must exist before this line (otherwise it's a NameError), and therefore will never be garbage collected in your example block of code, as the reference will still exist after this. Even if one were to call del foo after this, we would have to presume there were no references to the object anywhere else for it to be garbage collected.

Gareth Latty
  • 86,389
  • 17
  • 178
  • 183
  • An object can be shown eligible for garbage collection (or not) without saying anything about any specific garbage collector or whether it will actually be collected. –  Dec 03 '12 at 19:28
  • The reason I am asking this question is because I have seen objects being collected (__del__() being called) before step 7. This was rather unexpected and is why I am questioning when it is first eligible for collection. To be clear, I do not care when it is actually collected, but when it is first a candidate. – Dillon Dec 03 '12 at 20:11
  • 1
    @Dillon When have you seen that? It's definitely not possible in this case, so I think you were misinterpreting what was happening. – Gareth Latty Dec 03 '12 at 20:27
  • I am comparing 3 print statements that are printed in the following order. 1. print from __del__(self). 2. print from within func. 3. print at step 7. We are using CPython 2.6.8 – Dillon Dec 03 '12 at 22:00
  • It seems like the video @rachel-sanders posted has the best answer to this. In CPython, when the reference counter goes to 0, the object is immediately garbage collected. Link to that point in the video: http://www.youtube.com/watch?feature=player_detailpage&v=6jD34p8PokU#t=1020s – Dillon Dec 03 '12 at 22:23
  • 1
    Yes, I don't argue that - the issue is that the reference counter will not hit 0 here. – Gareth Latty Dec 03 '12 at 23:16
  • I apologize, I was misreading the logs. You are correct. The object was not being collected and would have only been done when the reference leaves the scope, is overwritten, or if explicitly deleted with del. – Dillon Dec 03 '12 at 23:39