Why is garbage collection necessary?

Question

Suppose that an object on the heap goes out of scope. Why can't the program free the memory right after the scope ends? Or, if we have a pointer to an object that is replaced by the address to a new object, why can't the program deallocate the old one before assigning the new one? I'm guessing that it's faster not to free it immediately and instead have the freeing be done asynchronously at a later point in time, but I'm not really sure.

You're only considering the simplest, hierarchical allocate-use-free scenarios here, but most dynamic objects live their lives as part of complex data structures like trees, hash tables, or graphs of various kinds where each object's lifetime is not in sync with the sequential order of events in the program. — 500 - Internal Server Error, May 25 '20 at 10:04

Stephen C · Answer 1 · 2020-05-27T23:02:29.260

Why is garbage collection necessary?

It is not strictly necessary. Given enough time and effort you can always translate a program that depends on garbage collection to one that doesn't.

In general, garbage collection involves a trade-off.

On the one hand, garbage collection allows you to write an application without worrying about the details of memory allocation and deallocation. (And the pain of debugging crashes and memory leaks caused by getting the deallocation logic wrong.)

The downside of garbage collection is that you need more memory. A typical garbage collector is not efficient if it doesn't have plenty of spare space¹.

By contrast, if you do manual memory management, you can code your application to free up heap objects as soon as they are no longer used. Furthermore, you don't get awkward "pauses" while the GC is doing its thing.

The downside of manual memory management is that you have to write the code that decides when to call free, and you have to get it correct. Furthermore, if you try to manage memory by reference counting:

you have the cost of incrementing and decrementing ref counts whenever pointers are assign or variables go out of scope,
you have to deal with cycles in your data structures, and
it is worse when your application is multi-threaded and you have to deal with memory caches, synchronization, etc.

For what it is worth, if you use a decent garbage collector and tune it appropriately (e.g. give it enough memory, etc) then the CPU costs of GC and manual storage management are comparable when you apply them to a large application.

Reference:

"The measured cost of conservative garbage collection" by Benjamin Zorn

^{1 - This is because the main cost of a modern collector is in traversing and dealing with the non-garbage objects. If there is not a lot of garbage because you are being miserly with the heap space, the GC does a lot of work for little return. See https://stackoverflow.com/a/2414621/139985 for an analysis.}

Why can't the compiler do the manual memory management? You avoid the downsides of garbage collection, and there is no risk of getting the memory management wrong since the compiler theoretically shouldn't make any mistakes. — Matthew Inbox, May 25 '20 at 10:14
Because the compiler can't do it for most programming languages. It involves runtime computation, for all but the most simple uses of dynamic storage. (And there is also the previously mentioned problem of reference cycles.) But take a look at the way that Rust handles this: https://doc.rust-lang.org/1.22.0/book/first-edition/the-stack-and-the-heap.html, http://willcrichton.net/notes/rust-memory-safety/ — Stephen C, May 25 '20 at 10:21
Very interesting link. I'm surprised, because some people (well, millenial programmers probably) think backwards and claim that GC'ed languages are design mistakes. Any other insight to the historical aspect than this paper? — mavavilj, Dec 04 '22 at 20:47

score 0 · Answer 2 · answered May 25 '20 at 11:08

It's more complicated, but

1) what if there is memory pressure before the scope is over? Scope is only a language notion, not related to reachability. So an object can be "freed" before it goes out of scope ( java GCs do that on regular basis). Also, if you free objects after each scope is done, you might be doing too little work too often

2) As far as the references go, you are not considering that the reference might have hierarchies and when you change one, there has to be code that traverses those. It might not be the right time to do it when that happens.

In general, there is nothing wrong with such a proposal that you describer, as a matter of fact this is almost exactly how Rust programming language works, from a high level point of view.

Why is garbage collection necessary?

2 Answers2