The Boehm GC is a conservative collector, which means it assumes everything (on pointer boundaries) is a pointer. This means that it can find false positive references, like an integer which coincidentally has the value of an address in the heap. As a result, some blocks may stay in memory longer than they would with a non-conservative collector.
Here's a description from Boehm's page:
The garbage collector uses a modified
mark-sweep algorithm. Conceptually it
operates roughly in four phases, which
are performed occasionally as part of
a memory allocation:
- Preparation Each object has an associated mark bit. Clear all mark
bits, indicating that all objects are
potentially unreachable.
- Mark phase Marks all objects that can be reachable via chains of
pointers from variables. Often the
collector has no real information
about the location of pointer
variables in the heap, so it views all
static data areas, stacks and
registers as potentially containing
pointers. Any bit patterns that
represent addresses inside heap
objects managed by the collector are
viewed as pointers. Unless the client
program has made heap object layout
information available to the
collector, any heap objects found to
be reachable from variables are again
scanned similarly.
- Sweep phase Scans the heap for inaccessible, and hence unmarked,
objects, and returns them to an
appropriate free list for reuse. This
is not really a separate phase; even
in non incremental mode this is
operation is usually performed on
demand during an allocation that
discovers an empty free list. Thus the
sweep phase is very unlikely to touch
a page that would not have been
touched shortly thereafter anyway.
- Finalization phase Unreachable objects which had been registered for
finalization are enqueued for
finalization outside the collector.
You should also know that the Boehm GC needs to be given a set of "roots", which are starting points for the mark-and-sweep algorithm. The stack and registers are automatically roots. You need to explicitly add global pointers as roots.
EDIT: In comments, some concerns were pointed out about conservative collectors in general. It is true that integers that look like heap pointers to the collector can cause memory not to be released. This is not as big of a problem as you might think. Most scalar integers in a program are used for counts and sizes and are fairly small (so they would not look like heap pointers). You would mostly run into problems with arrays containing bitmaps, strings, floating point data, or anything of that sort. Boehm GC lets you allocate a block with GC_MALLOC_ATOMIC
which indicates to the collector that the block will not contain any pointers. If you look in gc_typed.h, you will also find ways to specify what parts of a block may contain pointers.
That said, a fundamental limitation of a conservative collector is that it cannot safely move memory around during collection since pointer rewriting is not safe. This means you won't get any of the benefits of compaction like lowered fragmentation and improved cache performance.