I apologize if the answer to this question is trivial, but I still cannot figure this out by myself.

How does the garbage collector in .NET identify what objects on the heap are garbage and what objects are not?

Let's say a .NET application is running and at a certain point in time garbage collection occurs (let's leave out the generations and the finalization queue for simplicity's sake).

Now the application may have:

  1. Stack variables pointing to objects on the heap.
  2. Registers containing addresses of objects on the heap.
  3. Static variables pointing to objects on the heap.

This is how I ASSUME the GC works.

  1. It de-references each such address and ends up at the object on the heap.
  2. It marks the object as not garbage (by using the sync block index) since some variable is still pointing to it.
  3. It does this for all such addresses (referred to as roots, for some reason, in most articles).
  4. Now since the .NET runtime has information about the TYPE of each object, it can calculate the size of each object and hence the block of heap memory it occupies. For all the marked objects, it leaves the occupied block of memory as it is.
  5. The rest of the memory is freed and compacted, and if necessary the surviving objects are relocated (and the addresses pointing to them updated).

Am I correct in my understanding?

Prashanth

1 Answer

You are right in parts. The GC treats the heap pessimistically - i.e. it starts out assuming that everything (in Generation 0) will be collected.

It starts with a first phase called "marking", in which it works out which objects are still being referenced. It begins at the roots and, since the objects are all reference types and some reference others, it navigates the references recursively. Don't worry - an object that has already been marked is skipped, so circular references do not cause an infinite loop!

If it finds that an object is referenced, it marks it by setting a flag kept with the object (a bit in the sync block index field, as described in "CLR via C#"). Anything still unmarked when this phase ends is garbage.
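To make the marking idea concrete, here is a minimal sketch in Python. It is a toy model, not the CLR's actual implementation: the `Obj` class, its `marked` field (standing in for the mark bit), and the `mark` function are all hypothetical names made up for illustration. It assumes marking starts from the roots and follows references, skipping anything already marked.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Obj:
    """Hypothetical heap object: a name, outgoing references, and a mark flag."""
    name: str
    refs: list = field(default_factory=list)
    marked: bool = False  # stands in for the mark bit


def mark(roots):
    """Mark every object reachable from the roots (iterative depth-first walk)."""
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj.marked:
            continue  # already visited, so cycles cannot loop forever
        obj.marked = True
        pending.extend(obj.refs)


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)
b.refs.append(a)  # a and b form a cycle that a root can reach
c.refs.append(d)
d.refs.append(c)  # c and d reference each other, but no root reaches them

heap = [a, b, c, d]
mark(roots=[a])

survivors = [o.name for o in heap if o.marked]
garbage = [o.name for o in heap if not o.marked]
print("survivors:", survivors)
print("garbage:", garbage)
```

Note that `c` and `d` are collected even though they reference each other: what matters is reachability from a root, not whether anything at all points to the object.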

After marking, it begins a process called "compacting", in which it shifts all of the surviving objects together into one contiguous area of memory, leaving the memory above them clear. It keeps objects of the same generation together, as they are statistically more likely to become unreachable at the same time.
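Compaction can be sketched the same way. This is again a toy model under stated assumptions, not how the CLR lays out memory: the heap is a hypothetical list of dictionaries with made-up `addr`, `size`, `marked` and `refs` fields, and `compact` is an illustrative name. It slides marked objects down to the lowest addresses, then rewrites every reference and root to the new addresses.

```python
def compact(heap, roots):
    """Slide marked objects to the bottom of the heap and patch all references."""
    forwarding = {}  # old address -> new address
    next_free = 0
    survivors = []
    for obj in sorted(heap, key=lambda o: o["addr"]):
        if obj["marked"]:
            forwarding[obj["addr"]] = next_free
            next_free += obj["size"]
            survivors.append(obj)

    # Marked objects only refer to other marked objects, so every lookup succeeds.
    for obj in survivors:
        obj["refs"] = [forwarding[r] for r in obj["refs"]]
        obj["addr"] = forwarding[obj["addr"]]

    new_roots = [forwarding[r] for r in roots]
    return survivors, new_roots, next_free


heap = [
    {"addr": 0, "size": 16, "marked": True, "refs": [40]},
    {"addr": 16, "size": 24, "marked": False, "refs": []},
    {"addr": 40, "size": 8, "marked": True, "refs": []},
    {"addr": 48, "size": 32, "marked": False, "refs": [16]},
]

survivors, new_roots, next_free = compact(heap, roots=[0, 40])
print([(o["addr"], o["refs"]) for o in survivors])
print("roots:", new_roots, "next free address:", next_free)
```

In this sketch the object that lived at address 40 moves to address 16, so both the reference held by the first object and the root that pointed at it are updated. Everything from `next_free` upwards is free space for new allocations.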

This in turn reduces the memory needed.

Garbage Collection doesn't necessarily speed up your program, but does allow it to re-use the space occupied by unused objects.

There are many, many articles on the subject. I personally like "CLR via C#" by Jeffrey Richter, which has an excellent chapter on how it works.

Dominic Zukiewicz