39

I'm experimenting with writing some bindings to the Boehm GC for Rust.

Some background: Rust is designed to be a highly concurrent language, and one result of this design is the ability to statically restrict GC pointers to the threads in which they were allocated (that is, a GC pointer allocated in thread x can never be kept alive, or even referenced at all, by another thread).
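To make the static restriction concrete, here is a minimal sketch of a thread-confined handle in current Rust. The `Gc` type and the `sum_in_worker` helper are hypothetical names used only for illustration, and `Box::into_raw` is a stand-in for a real collector allocation (nothing is ever freed here). The point is only that a type holding a raw pointer is neither `Send` nor `Sync`, so the compiler rejects any attempt to move it to another thread.

```rust
use std::ops::Deref;
use std::thread;

/// Hypothetical GC handle. The raw pointer field makes this type
/// neither `Send` nor `Sync`, so it cannot cross a thread boundary.
pub struct Gc<T> {
    ptr: *const T,
}

impl<T> Gc<T> {
    /// Stand-in for a collector allocation (assumption: a real binding
    /// would call into the GC here). The Box is deliberately leaked.
    pub fn new(value: T) -> Gc<T> {
        Gc { ptr: Box::into_raw(Box::new(value)) as *const T }
    }
}

impl<T> Clone for Gc<T> {
    fn clone(&self) -> Self {
        Gc { ptr: self.ptr }
    }
}

// Handles are plain pointers, so copying them within a thread is cheap.
impl<T> Copy for Gc<T> {}

impl<T> Deref for Gc<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // Sound in this sketch because the allocation is never freed.
        unsafe { &*self.ptr }
    }
}

/// Allocates two handles inside a worker thread and returns their sum.
/// Only plain `i32` values cross the thread boundary.
pub fn sum_in_worker(a: i32, b: i32) -> i32 {
    thread::spawn(move || {
        let x = Gc::new(a);
        let y = Gc::new(b);
        *x + *y
    })
    .join()
    .unwrap()
}

// By contrast, this would fail to compile, because `Gc<i32>` is not `Send`:
//
//     let g = Gc::new(1);
//     thread::spawn(move || *g);
```

Calling `sum_in_worker(2, 3)` allocates and reads both handles entirely inside the worker thread.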

Hence, I want Boehm to capitalise on this for performance as much as possible. Ideally it would be:

  1. thread-safe, so I can allocate and collect from multiple threads
  2. stop-as-little-as-possible during collections (i.e. pause just the current thread); other threads can keep running because they cannot interfere with any GC pointers outside of themselves
  3. preferably, entirely thread-local, with no synchronisation between the GC "instances" of different threads

Requirement 1 is easy, but I can't find any facility for 2 and 3. The most important parts are 1 and 2, because I want to be able to have threads running in the background, independently of what the other threads are doing (even if they are all allocating and garbage-collecting gigabytes of memory).

(I do know about THREAD_LOCAL_ALLOC and gc_thread_local.h, but they don't fully satisfy 3. They make allocation more efficient, but it is still valid to transfer thread-locally allocated pointers between threads, and I don't need that guarantee.)

rogerdpack
huon
  • 3
    Are you committed to using Boehm, or are you willing to consider other open-source conservative GCs on the market? E.g. you might be able to hack Tamarin's MMgc into something suitable for your needs here. (My memory is that MMgc allows for per-thread GC objects, each with its own roots and object graph.) – pnkfelix Jan 06 '14 at 09:43
  • 3
    (As a follow-up on MMgc: there is still global cross-thread state in a GCHeap class and also a global pagemap; I'm not sure how far you intend your third criterion to go. Also, there's the problem that Adobe is unlikely to provide much support for this project.) – pnkfelix Jan 06 '14 at 09:51
  • 2
    @pnkfelix I'm not committed to anything, I'm just experimenting with "easy" GCs (and I might as well do it in the context of Rust, even though people like you know far far more than I :) ). That looks like it may be feasible, but I'm not very interested in writing a C interface to use via FFI (though, I'll definitely keep it in mind as a possibility to investigate, thanks). In any case, I'm now hacking up a pure Rust GC; easier to get all the requirements above and more fun anyway: but much harder to get to be anywhere near as fast, so I'm still interested in any answers to this question. – huon Jan 07 '14 at 00:54
  • In case it's new since the question, just for followers: `-DPARALLEL_MARK` https://github.com/ivmai/bdwgc/blob/master/doc/scale.md there's also incremental with time limit (at the expense of parallel mark) https://github.com/ivmai/bdwgc/commit/3c571c7ad66a90e33e4701afe3dc4d2113c60adc – rogerdpack Sep 10 '19 at 16:14

2 Answers

8

I don't have an answer about how to do this with Boehm. However, here are two GCs which seem to have enough control and encapsulation to have a totally independent GC context per-thread.

rogerdpack
David Jeske
0

Facility 3 seems to be implemented in a fork of the Boehm GC, which declares each of the collector's global variables as a thread-local one: https://github.com/Samsung/gcutil/commit/0cc277fb0cef82d515cc4ff4a439e50568474e16
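The general idea of turning collector globals into thread-locals can be illustrated with a toy Rust sketch. This is not the fork's code and not Boehm's API: `HeapState`, `alloc`, `collect`, `live_count` and `demo` are all hypothetical names, and the "collector" simply frees everything instead of tracing roots. It only shows that once the heap state lives in `thread_local!` storage, each thread gets its own independent instance and a collection in one thread needs no synchronisation with any other.

```rust
use std::cell::RefCell;
use std::thread;

/// Toy stand-in for state a collector would normally keep in globals.
#[derive(Default)]
struct HeapState {
    live_objects: Vec<Box<[u8]>>,
}

thread_local! {
    // One independent heap per thread; no locks or atomics involved.
    static HEAP: RefCell<HeapState> = RefCell::new(HeapState::default());
}

/// Allocates `size` zeroed bytes in the calling thread's heap.
pub fn alloc(size: usize) {
    HEAP.with(|h| {
        h.borrow_mut()
            .live_objects
            .push(vec![0u8; size].into_boxed_slice())
    });
}

/// "Collects" the calling thread's heap and returns how many objects
/// were freed. A real collector would trace roots; this toy frees all.
pub fn collect() -> usize {
    HEAP.with(|h| {
        let mut heap = h.borrow_mut();
        let freed = heap.live_objects.len();
        heap.live_objects.clear();
        freed
    })
}

/// Number of objects currently held by the calling thread's heap.
pub fn live_count() -> usize {
    HEAP.with(|h| h.borrow().live_objects.len())
}

/// Allocates twice on the calling thread, then lets a worker allocate
/// once and collect. Returns (objects freed by the worker, objects the
/// calling thread gained and still holds afterwards).
pub fn demo() -> (usize, usize) {
    let before = live_count();
    alloc(16);
    alloc(32);
    let worker_freed = thread::spawn(|| {
        alloc(8);
        collect()
    })
    .join()
    .unwrap();
    (worker_freed, live_count() - before)
}
```

The worker's `collect()` only ever sees the single object it allocated itself, and the two objects allocated on the calling thread survive it untouched.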

Ivan Maidanski