
I once read somewhere about a common thought amongst idealistic yet 'lazy' programmers trying their hand at implementing programming languages. It was as follows:-

'I know, I'll do an easy-to-implement and quick-to-write reference counting GCer, and then re-implement it as a real GCer when I get the time.'

Naturally, this reimplementation never occurs.

However, I question why such a reimplementation is needed. Why are mark-and-sweep collectors of the incremental and concurrent variety seen as superior to the allegedly antiquated approach taken by languages like Perl 5 and Python? (Yes, I'm aware Python augments this approach with a mark-and-sweep collector.)

Circular references are the first topic to come up in such a discussion. Yes, they can be a pain (see recursive coderefs in Perl, and the fix for them involving multiple assignments and reference weakening.) Yes, it's not as elegant when a coder has to constantly watch for references of that ilk.
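
To make the trade-off concrete, here's a minimal sketch in Python (CPython assumed, with the cycle detector switched off so that only reference counting is in play); the Node class and the variable names are purely illustrative:

    import gc
    import weakref

    class Node:
        def __init__(self, name):
            self.name = name
            self.other = None
        def __del__(self):
            print("collected", self.name)

    gc.disable()               # lean on reference counting alone, as Perl 5 does

    a, b = Node("a"), Node("b")
    a.other, b.other = b, a    # strong cycle: neither refcount can reach zero
    del a, b                   # nothing is printed here; the pair is stuck until a tracing pass runs

    c, d = Node("c"), Node("d")
    c.other = d
    d.other = weakref.ref(c)   # weakened back-reference, so there is no cycle of strong refs
    del c, d                   # both destructors run immediately

That is exactly the monitoring burden in question: someone has to decide, for every potential cycle, which link is the weak one.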

Is the alternative any better though? We can discuss fine-grained implementation details for eternity, but the fact is, most mark-and-sweep GC implementations have the following problems:-

  • Non-deterministic destruction of resources, leading to code that's hard to reason about and too verbose (see IDisposable in .NET or the try/finally idiom in many other languages; there's a sketch of this after the list.)
  • Additional complexity from different categories of garbage (short-lived, long-lived, and everything in between), with such complexity seemingly required for reasonable performance.
  • Either another thread is required, or the execution of the program needs to be periodically halted to perform the collection.
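
To flesh out the first bullet, here's a rough Python sketch of what cleanup looks like when you cannot rely on a finalizer running promptly under a tracing collector; the LogFile class and the file name are invented for the example, and __enter__/__exit__ play roughly the role of IDisposable/using in .NET:

    class LogFile:
        def __init__(self, path):
            self.handle = open(path, "a")
        def close(self):
            self.handle.close()
        # Context-manager protocol, the rough analogue of IDisposable.
        def __enter__(self):
            return self
        def __exit__(self, exc_type, exc, tb):
            self.close()

    # The explicit, verbose form:
    log = LogFile("app.log")
    try:
        log.handle.write("started\n")
    finally:
        log.close()

    # The with-statement trims the syntax, but the release still has to be
    # requested by the caller rather than simply falling out of scope:
    with LogFile("app.log") as log:
        log.handle.write("restarted\n")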

Are the downsides of mark-and-sweep justified as a fix for the problems of reference counting, when those problems can be mitigated with weak references?

Louis
  • "I once read somewhere" - where? Link please. – Oded Jan 20 '12 at 10:41
  • Can't remember. If it was making a precise technical point that would be a problem, but it's making a vague cultural point that I go on to elaborate, so it isn't particularly relevant IMO. – Louis Jan 20 '12 at 10:43
  • "Non deterministic destruction of resources". Reference counting is also non-deterministic in multithreaded programs as threads can race to decrement to zero. – J D Jun 19 '12 at 10:43
  • "such complexity seemingly required for reasonable performance". Even the most naive tracing collector with no such complexity is *much* faster than naive reference counting. To make reference counting faster to have to add complexity too, and the state-of-the-art reference counting collectors are significantly slower than the state-of-the-art tracing collectors. – J D Jun 19 '12 at 10:44
  • "the execution of the program needs to be periodically halted to perform the collection". That is incorrect. Collection can be fully incremental with no pauses. – J D Jun 19 '12 at 10:45

0 Answers