I once read somewhere about a common thought amongst idealistic yet 'lazy' programmers trying their hand at implementing programming languages. It was as follows:-
'I know, I'll do an easy-to-implement and quick-to-write reference counting GCer, and then re-implement it as a real GCer when I get the time.'
Naturally, this reimplementation never occurs.
However, I question why such a reimplementation is needed. Why are mark-and-sweep collectors of the incremental and concurrent variety seen as superior to the allegedly antiquated approach taken by languages like Perl 5 and Python? (Yes, I'm aware Python augments this approach with a cycle-detecting collector.)
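To make that augmentation concrete, here is a minimal Python sketch (the Box class is just an illustrative name of my own) of the kind of garbage that pure reference counting can never reclaim, and that CPython's supplementary collector exists to catch:

```python
import gc

class Box:
    pass

a, b = Box(), Box()
a.other, b.other = b, a   # a <-> b: each object holds a reference to the other
del a, b                  # the names are gone, but the cycle keeps both
                          # objects' reference counts above zero forever

# CPython's backup collector walks container objects, detects the unreachable
# cycle and frees it; collect() returns how many unreachable objects it found.
print(gc.collect() > 0)   # True
```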
Circular references are the first topic to come up in such a discussion. Yes, they can be a pain (see recursive coderefs in Perl, and the fix involving multiple assignments and reference weakening). Yes, it's not as elegant when a coder has to constantly watch for references of that ilk.
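The Perl fix alluded to, Scalar::Util::weaken, has a direct analogue in Python's weakref module. The recursive-coderef case is rather Perl-specific, so this sketch uses a parent/child cycle instead (the Node class and its fields are mine, purely for illustration), but the technique of weakening one edge of the cycle is the same:

```python
import weakref

class Node:
    def __init__(self, value):
        self.value = value
        self.parent = None    # will hold a weak reference, not a strong one
        self.children = []

    def add_child(self, child):
        # Weakening the back-reference means parent -> child -> parent
        # never forms a cycle that reference counting alone can't free.
        child.parent = weakref.ref(self)
        self.children.append(child)

root = Node("root")
leaf = Node("leaf")
root.add_child(leaf)
print(leaf.parent().value)   # "root": calling the weakref yields its target

del root                     # last strong reference gone: freed immediately
print(leaf.parent())         # None: the weak reference observed the death
```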
Is the alternative any better though? We can discuss fine-grained implementation details for eternity, but the fact is, most mark-and-sweep GC implementations have the following problems:-
- Non-deterministic destruction of resources, leading to code that's hard to reason about and too verbose (see IDisposable in .NET, or try/finally and its replacements in many other languages); see the sketch after this list.
- Additional complexity from dividing garbage into generations of short-lived, long-lived, and everything in between, with such complexity seemingly required for reasonable performance.
- Either another thread is required, or the execution of the program needs to be periodically halted to perform the collection.
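The first point is easiest to see with file handles. Under CPython's reference counting, the temporary file object below is closed the moment its count hits zero; under a tracing collector that close happens at some arbitrary later time, which is exactly why explicit protocols such as .NET's IDisposable/using or Python's with statement exist. A rough sketch (read_config is a made-up function name of my own):

```python
# Relying on prompt finalisation: fine under reference counting, where the
# temporary file object dies (and is closed) as soon as read() returns...
def read_config(path):
    return open(path).read()

# ...but unsafe under a tracing collector, which may not finalise the file
# for a long time. Hence the extra, explicit ceremony:
def read_config_portable(path):
    with open(path) as fh:   # deterministic cleanup spelled out by hand
        return fh.read()
```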
Are the downsides of mark-and-sweep justifiable as a fix for issues of reference counting that can already be mitigated with weak references?