The garbage collector works in three basic steps:
- Mark all objects that are still alive.
- Collect the objects that are not marked as alive.
- Compact the memory.
Your concern is step 1: How does the GC figure out that it shouldn't collect objects behind ref
and out
params?
When the GC performs a collection, it starts with a state where none of the objects is considered alive. It then goes from the root references and marks all those objects as alive. Root references are all references on the stack and in static fields. Then the GC goes recursively into the marked objects and marks all objects as alive that are referenced from them. This is repeated until no objects are found that are not already marked as alive. The result of this operation is an object graph.
A ref
or out
parameter has a reference on the stack, and so the GC will mark the respective object as alive, because the stack is a root for the object graph.
At the end of the process, the objects with only internal references are not marked, because there is no path from the root references that would reach them. This takes care of all circular references, too. These objects are considered dead and will be collected in the next step (that includes calling the finalizer, even though there is no guarantee for that).
At the end, the GC will move all alive objects to a continuous area of memory at the beginning of the heap. The rest of the memory will filled with zeroes. That simplifies the process of creating new objects, since their memory can always be allocated at the end of the heap and all fields already have the default values.
It is true that the GC needs some time to do all of this, but it still does it reasonably fast, due to some optimizations. One of the optimizations is to separate the heap into generations. All newly allocated objects are generation 0. All objects surviving the first collection are generation 1 and so forth. Higher generations are only collected if collecting lower generations does not free up enough memory. So, no, the GC does not always have to scan the entire heap.
You have to consider that, while the collection takes some time, allocating new objects (which happens much more often than a garbage collection) is much faster than in other implementations, where the heap looks more like a swiss cheese and you need some time to find a hole big enough for the new object (which you still need to initialize).