How to schedule collection cycles for custom mark-sweep collector?

Question

I've written a simple garbage collector for a Postscript virtual machine, and I'm having difficulty designing a decent set of rules for when to do a collection (when the free list is too short?) and when to allocate new space (when there's a lot of space to use?).

I've written bottom-up so far, but this question involves top-level design. So I feel I'm on shaky ground. All objects are managed and access is only through operator functions, so this is a collector in C, not for C.

The primary allocator function is called gballoc:

unsigned gballoc(mfile *mem, unsigned sz) {
    unsigned z = adrent(mem, FREE);
    unsigned e;
    memcpy(&e, mem->base+z, sizeof(e));
    while (e) {
        if (szent(mem,e) >= sz) {
            memcpy(mem->base+z, mem->base+adrent(mem,e), sizeof(unsigned));
            return e;
        }
        z = adrent(mem,e);
        memcpy(&e, mem->base+z, sizeof(e));
    }
    return mtalloc(mem, 0, sz);
}

I'm sure it's gibberish without knowing what all the types and functions mean, so here's pseudocode of the same function:

gballoc
    load free list head into ptr
    while ptr is not NULL
        if free element size is large enough
            return element, removed from list
        next ptr
    fallback to allocating new space

So it's a simple "first-fit" algorithm with no carving (but allocations retain their size; so a large space reused for a small object can be reused for a large object again, later).

But when should I call collect()?

Edit: The rest of the code and related modules have been posted in comp.lang.postscript, in the thread: http://groups.google.com/group/comp.lang.postscript/browse_thread/thread/56c1734709ee33f1#

score 2 · Accepted Answer · answered Jun 16 '11 at 06:16

2

There are several applicable philosophies:

Do garbage collection as last-ditch avoidance of expanding the heap during an allocation. This is probably the most common strategy.
Do garbage collection periodically, like every hundredth allocation or deallocation. In some situations, this might decrease the overall effort of garbage collection by not letting fragmentation get out of hand.
Don't do any garbage collection. Always a possible strategy, especially for short-lived or simple programs.

As a developer of garbage collection, it might be desirable to give the choice of strategy to the application since it might know which will be most effective. Of course, if it doesn't have a preference, you should choose a default.

answered Jun 16 '11 at 06:16

wallyk

56,922
16
83
148

1

Yes, there is a user-level operator `vmreclaim` that (en/dis)ables automatic collection, or requests immediate collection. So, your third option is covered already in the spec. Periodical seems like a reasonable choice for now. I worry about start-up thrashing with the last-ditch strategy unless there's a generous initial pool. – luser droog Jun 16 '11 at 06:32
I was hoping to scoop up a few more answers to choose from; but this gives what I need for now, and fodder for future thought. So I accept. Thank You. – luser droog Jun 17 '11 at 01:04

score 1 · Answer 2 · answered Jun 17 '11 at 01:09

Here is the periodic collection strategy incorporated into the original code:

enum { PERIOD = 10 };

unsigned gballoc(mfile *mem, unsigned sz) {
    unsigned z = adrent(mem, FREE);
    unsigned e;
    static period = PERIOD;
    memcpy(&e, mem->base+z, sizeof(e));
try_again:
    while (e) {
        if (szent(mem,e) >= sz) {
            memcpy(mem->base+z, mem->base+adrent(mem,e), sizeof(unsigned));
            return e;
        }
        z = adrent(mem,e);
        memcpy(&e, mem->base+z, sizeof(e));
    }
    if (--period == 0) {
        period = PERIOD;
        collect(mem, 0);
        goto try_again;
    }
    return mtalloc(mem, 0, sz);
}

Don't tell anybody, but this doesn't actually work. The second argument to collect should be the address of a stack structure containing the root set. But I'm going to have to make that a range of special entities pointing to several stacks. And different sets for different memory files. But at least the call is in place. — luser droog, Jun 21 '11 at 06:25

How to schedule collection cycles for custom mark-sweep collector?

2 Answers2

Linked