
If my understanding is correct, CPython deletes an object as soon as its reference count reaches zero. Reference counting cannot reclaim reference cycles that have become unreachable, but the interpreter occasionally looks for such cycles and deletes them (and you can trigger this manually by calling gc.collect()).
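
A minimal sketch of the behavior described above, assuming CPython. The Node class is a hypothetical stand-in, and the weak references are only there to observe when each object actually goes away; automatic collection is switched off so that it cannot interfere with the demonstration.

```python
import gc
import weakref


class Node:
    """Hypothetical placeholder object, used only for this illustration."""


gc.disable()  # keep automatic cycle collection out of the way for the demo

# Case 1: no cycle. Dropping the last reference frees the object at once.
a = Node()
a_ref = weakref.ref(a)
del a
freed_without_cycle = a_ref() is None

# Case 2: the object refers to itself, so its refcount never reaches zero.
b = Node()
b.me = b
b_ref = weakref.ref(b)
del b
alive_before_collect = b_ref() is not None

gc.collect()  # explicit cycle collection reclaims the unreachable cycle
freed_after_collect = b_ref() is None

gc.enable()
print(freed_without_cycle, alive_before_collect, freed_after_collect)
```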

My question is, when do these interpreter-triggered cycle collection steps happen? What kind of events trigger them?

I am more interested in the CPython case, but would love to hear how this differs in PyPy or other Python implementations.

Martijn Pieters
toth
  • You may be interested in this link: http://stackoverflow.com/questions/4484167/details-how-python-garbage-collection-works?rq=1 – Apr 17 '14 at 18:29
  • Thanks, somehow missed it when looking for answers on this topic. – toth Apr 17 '14 at 18:35

1 Answer


The cycle collector is triggered by allocation activity rather than by time: it runs once the number of object allocations minus the number of deallocations since the last GC run exceeds a threshold.

See the gc.set_threshold() function:

In order to decide when to run, the collector keeps track of the number of object allocations and deallocations since the last collection. When the number of allocations minus the number of deallocations exceeds threshold0, collection starts.
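
As a rough sketch of how these thresholds can be inspected and changed, assuming CPython: the factor of 10 below is an arbitrary value chosen for illustration, not a recommendation, and the original values are restored at the end.

```python
import gc

# Current thresholds as a tuple: (threshold0, threshold1, threshold2).
original = gc.get_threshold()
print("current thresholds:", original)

# Raise threshold0 so the youngest generation is collected less often.
# The multiplier is an assumption made purely for this example.
gc.set_threshold(original[0] * 10, original[1], original[2])
raised = gc.get_threshold()
print("raised thresholds:", raised)

# Put the original settings back.
gc.set_threshold(*original)
restored = gc.get_threshold()
```

Whether a change like this helps depends on the workload, so it is worth measuring before and after.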

You can access the current counts with gc.get_count(); this returns a tuple of the 3 counts the GC tracks. The first is the allocation count compared against threshold0; the other 2 determine when to run the deeper scans.
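
A small sketch of watching the first counter move, assuming CPython. Automatic collection is disabled for the duration so that a collection cannot reset the counter mid-demo; the list of 100 empty lists is just an arbitrary way to allocate some tracked objects.

```python
import gc

gc.disable()   # avoid an automatic run resetting the counters mid-demo
gc.collect()   # start from a freshly collected state

before = gc.get_count()

# Allocate some container objects and keep them alive.
keep = [[] for _ in range(100)]

after = gc.get_count()
gc.enable()

print("before:", before)
print("after: ", after)
```

The first element of the tuple should be larger after the allocations, since none of the new lists has been deallocated yet.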

The PyPy garbage collector operates entirely differently, as the GC process in PyPy is responsible for all deallocations, not just cyclic references. Moreover, the PyPy garbage collector is pluggable, meaning that how often it runs depends on which GC option you have picked. The default Minimark strategy, for example, does not run at all while memory use is below a threshold.

See the RPython toolchain Garbage Collector documentation for some details on their strategies, and the Minimark configuration options for more hints on what can be tweaked.

The same goes for Jython and IronPython: these implementations rely on the host runtime (Java and .NET, respectively) to handle garbage collection for them.

Martijn Pieters
  • Thank you, that's exactly what I was looking for. Are you aware if it is ever worthwhile to tune those numbers? – toth Apr 17 '14 at 18:34
  • @toth: Yes, if your application frequently creates and destroys a lot of objects with a very low circular reference incidence, you could lower the thresholds significantly to reduce the chances the GC takes unnecessary CPU time away from your app with too-frequent garbage collection runs. – Martijn Pieters Apr 17 '14 at 18:36
  • And by *lower* I mean increase them so they are not triggered quite as often. Sorry, that may have been confusing. – Martijn Pieters Apr 17 '14 at 18:49