I want to develop a multi-threaded C++ application (where eventually most of the C++ code would be generated by the application itself, which could be viewed as a high-level domain specific language) on Linux/AMD64/Debian with GCC 4.6 (and probably the latest C++11 standard).

I really want to use Boehm's conservative garbage collector for all my heap allocations, because I want to allocate with new(GC) and never bother about delete. I am assuming that Boehm's GC is working well enough.

The main motivation for using C++ (instead of C) is all the algorithms and collections std::map ... std::vector provided by the C++ standard library.

Boehm's GC provides a gc_allocator<T> template (in its file gc/gc_allocator.h).

Should I redefine operator ::new as Boehm's one?

Or should I use all the collection templates with an explicit allocator template argument set to some gc_allocator? I don't understand exactly the role of the second template argument (the allocator) of std::vector. Is it used to allocate the vector's internal data, or to allocate each individual element?

And what about std::string-s? How do I make their data GC-allocated? Should I have my own string type, using the basic_string template with gc_allocator? Is there some way to get the internal arrays of char allocated with GC_malloc_atomic rather than GC_malloc?

Or do you advise not using Boehm GC with an application compiled by g++ ?

Regards.

Talia
Basile Starynkevitch
  • 8
    It all depends on how familiar with and good at C++ you are. Can you write a decent C++ program without the use of `delete`, and do you realize that `new` should only be said in very, very special circumstances, and that pointers should not be needed for the most part? If you understand all this and conclude that you require a garbage collector, then by all means go ahead. On the other hand if you don't, you might discover that idiomatic, modern C++ is pretty good at user-friendly, deterministic memory management. – Kerrek SB Nov 04 '11 at 23:37
  • I do know garbage collection quite well, and I believe it is important. The application might become a translator (from some high-level language) to generated C++. And I am not familiar enough with C++ to be able to write code without pointers. I am at ease with generating C code, since this is one of the main features of my GCC MELT work. – Basile Starynkevitch Nov 04 '11 at 23:40
  • 5
    Hehe -- I may be out of line here, but it seems to me that if you *do* know garbage collection but *don't* know C++ very well, then this is a classic "I have a hammer" situation... Why not post a typical snippet of code, and we can see how to do it the C++ way? – Kerrek SB Nov 04 '11 at 23:41
  • The typical code would be like GCC MELT. It has a million lines of generated C. (In other words, the typical code would be generated C++). – Basile Starynkevitch Nov 04 '11 at 23:43
  • 1
    I've never heard of MELT. But if it is something written in or for pure C, then the need for a garbage collector there would be much larger, since C doesn't offer any of the RAII tools that C++ does. So it's understandable that a project in C would be able to take advantage of a GC. But if you want to write genuine, modern C++, that's an entirely different battlefield. – Kerrek SB Nov 04 '11 at 23:46
  • MELT (http://gcc-melt.org/) is a Lispy high-level domain specific language to extend GCC, and translated to C. It is bootstrapped, MELT translator is coded in MELT. Look into http://gcc.gnu.org/viewcvs/branches/melt-branch/gcc/melt/ for some example of MELT source code, and into http://gcc.gnu.org/viewcvs/branches/melt-branch/gcc/melt/generated/ for example of their translated C equivalent. And I don't want to code a lot in C++, I want to have the C++ generated (my interest is in experimenting DSL-s translated to C or C++). – Basile Starynkevitch Nov 04 '11 at 23:51
  • 1
    I see. That's different from your initial claim, "I want to develop a multi-threaded C++ application", though! Now it sounds like you're mainly interested in C programming...? I cannot speak to that, but I imagine that BGC is a good choice for that. – Kerrek SB Nov 04 '11 at 23:57
  • No, I am interested in generating C or C++ code from some higher level domain specific language. But I am considering generating C++ ... – Basile Starynkevitch Nov 05 '11 at 00:00
  • 2
    Hm. You really have to pick one. They're different languages. When you generate C++ code and do it right, I'd say you don't need a garbage collector, and it's on *you* to prove why you *would* need one. (Your compiler may well use one, but I'm talking about the resulting code now.) – Kerrek SB Nov 05 '11 at 00:02
  • Because I am not able to do a fully compile-time garbage collector. – Basile Starynkevitch Nov 05 '11 at 00:05
  • The comments above are interesting, but they don't answer the (sub) question about the role of the allocator template argument in STL collections. – Basile Starynkevitch Nov 05 '11 at 00:10
  • 3
    Frankly, I think there are too many questions. It's best to be focused and specific. To be brutally honest, if you're not happily at home with C++ allocator mechanics, it's probably premature for you to design a C++ code generation tool, and you should have a thorough grasp of the language you're translating into before approaching such a venture. That's just my personal impression, and I do wish you best of luck and courage for this -- I just think that learning more C++ for a while would be a win-win-win for everyone. – Kerrek SB Nov 05 '11 at 00:13
  • 5
    Simply put, there's no reason at all to use `new` directly, and *never* a reason to `delete` anything. Judging by your post, I am left with the distinct impression that you have absolutely no idea how to write C++ code at all, nor use the constructs provided as Standard, and are using GC simply because it's the paradigm you're used to. – Puppy Nov 05 '11 at 00:58
  • 1
    Show me an example of an interpreter coded in C++ for a high-order language (with recursive lambda & letrec, i.e. a micro Scheme) without any GC. – Basile Starynkevitch Nov 05 '11 at 07:13
  • 2
    I don't understand the reasoning why standard library container would need GC allocators, they can manage their own memory just fine. Likewise do smart pointer. In idiomatic, modern C++, there is very rarely a reason to explicitly (de)allocate memory via `new`/`delete`. – Xeo Nov 05 '11 at 08:54
  • Simply because one can need GC for one's own classes (again, a good example is coding a high-order functional toy language, like a micro Scheme or a toy ML), and unless you implement your own GC which calls the C++ destructors, you cannot use STL containers & allocators. In practice, if I define a garbage collected Value class [hierarchy] containing map-s and vector-s of Value pointers, I cannot ensure that they will be destroyed. Again, please show me an example of an interpreter (of a language with recursive anonymous lambda-s & letrec-s) coded in C++ which doesn't have any kind of GC. – Basile Starynkevitch Nov 05 '11 at 09:21
  • 1
    @BasileStarynkevitch A few years late, but cpython implements all of that without using garbage collection within the C++ code (it does, of course, implement a garbage collector on the python code). – mbrig Feb 22 '18 at 22:16

1 Answer

To partly answer my own question, the following code

// file myvec.cc
#include <gc/gc.h>
#include <gc/gc_cpp.h>
#include <gc/gc_allocator.h>
#include <vector>

// Thin wrapper around a vector whose internal storage comes from gc_allocator.
class Myvec {
  std::vector<int,gc_allocator<int> > _vec;
public:
  Myvec(size_t sz=0) : _vec(sz) {}
  Myvec(const Myvec& v) : _vec(v._vec) {}
  // Return a non-const reference, as assignment operators conventionally do.
  Myvec& operator=(const Myvec &rhs)
    { if (this != &rhs) _vec = rhs._vec; return *this; }
  void resize (size_t sz=0) { _vec.resize(sz); }
  int& operator [] (size_t ix) { return _vec[ix]; }
  const int& operator [] (size_t ix) const { return _vec[ix]; }
  ~Myvec () {}
};

// C-callable entry points; the Myvec object itself is allocated with new(GC).
extern "C" Myvec* myvec_make(size_t sz=0) { return new(GC) Myvec(sz); }
extern "C" void myvec_resize(Myvec*vec, size_t sz) { vec->resize(sz); }
extern "C" int myvec_get(Myvec*vec, size_t ix) { return (*vec)[ix]; }
extern "C" void myvec_put(Myvec*vec, size_t ix, int v) { (*vec)[ix] = v; }

when compiled with g++ -O3 -Wall -c myvec.cc, produces an object file with the following symbols:

 % nm -C myvec.o
                 U GC_free
                 U GC_malloc
                 U GC_malloc_atomic
                 U _Unwind_Resume
0000000000000000 W std::vector<int, gc_allocator<int> >::_M_fill_insert(__gnu_cxx::__normal_iterator<int*, std::vector<int, gc_allocator<int> > >, unsigned long, int const&)
                 U std::__throw_length_error(char const*)
                 U __gxx_personality_v0
                 U memmove
00000000000000b0 T myvec_get
0000000000000000 T myvec_make
00000000000000c0 T myvec_put
00000000000000d0 T myvec_resize

So there is no reference to plain malloc or ::operator new in the generated code.

So by using gc_allocator and new(GC), I can apparently be sure that plain ::operator new or malloc is not used without my knowledge, and I don't need to redefine ::operator new.
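On the sub-question about the role of the allocator template argument: the sketch below is one way to check it without libgc. It assumes a C++11 standard library; CountingAllocator, AllocStats, vector_stats and string_stats are hypothetical names invented for this illustration, not part of Boehm's GC or of the standard library. The allocator only counts requests and forwards to ::operator new. It suggests that the container asks the allocator for its internal storage in blocks, not once per element, and that basic_string obtains its character buffer the same way.

```cpp
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// Global counters updated by the hypothetical allocator below.
static std::size_t g_calls = 0;  // number of allocate() calls
static std::size_t g_bytes = 0;  // total bytes requested

// Hypothetical counting allocator (illustration only, not gc_allocator):
// it records each request and forwards to the global ::operator new.
template <class T>
struct CountingAllocator {
  typedef T value_type;
  template <class U> struct rebind { typedef CountingAllocator<U> other; };

  CountingAllocator() {}
  template <class U> CountingAllocator(const CountingAllocator<U>&) {}

  T* allocate(std::size_t n) {
    ++g_calls;
    g_bytes += n * sizeof(T);
    return static_cast<T*>(::operator new(n * sizeof(T)));
  }
  void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

// All instances are interchangeable, so they always compare equal.
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) {
  return true;
}
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) {
  return false;
}

struct AllocStats {
  std::size_t calls;
  std::size_t bytes;
};

// Build a vector of n ints and report what the allocator was asked for.
AllocStats vector_stats(std::size_t n) {
  g_calls = 0;
  g_bytes = 0;
  std::vector<int, CountingAllocator<int> > v(n);
  AllocStats s = { g_calls, g_bytes };
  return s;
}

// Build a string of n characters and report what the allocator was asked for.
AllocStats string_stats(std::size_t n) {
  g_calls = 0;
  g_bytes = 0;
  std::basic_string<char, std::char_traits<char>, CountingAllocator<char> >
      str(n, 'x');
  AllocStats s = { g_calls, g_bytes };
  return s;
}
```

Calling vector_stats(1000) should report far fewer than 1000 allocate() calls, with at least 1000 * sizeof(int) bytes requested. gc_allocator fills the same role but obtains the block from the collector, which is consistent with the GC_malloc_atomic symbol in the nm output above for the pointer-free int elements.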


Addendum (January 2017)

For future reference (thanks to Sergey Zubkov for mentioning it in a comment on Quora), see also n2670 and the garbage collection support in <memory> (std::declare_reachable, std::declare_no_pointers, std::pointer_safety, etc.). However, current GCC and Clang, at least, implement it only in the trivial but acceptable way of making it a no-op.

Basile Starynkevitch