Boehm's conservative garbage collector is quite useful (e.g. Bigloo uses it, Guile uses something similar, etc.), notably on Linux (which is the only OS I care about; I'm using Debian/Sid/x86-64 if that matters, and the libgc-dev package is version 1:7.4.2-8, so the Boehm GC is 7.4.2).

However, Boehm's GC needs to be aware of every thread using it. Its gc_pthread_redirects.h (more or less internal) header file redefines pthread_create as

# define pthread_create GC_pthread_create

Actually, what Boehm's GC needs is for GC_register_my_thread to be called early in the new thread's call stack (and GC_pthread_create does that).
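
To make the "called early in the new thread's call stack" point concrete, here is a minimal sketch of the general trampoline pattern that a wrapper like GC_pthread_create can use. It is not Boehm's actual code: registering_pthread_create, register_current_thread and unregister_current_thread are hypothetical names, and the two hooks only count threads. They stand in for the places where GC_register_my_thread and GC_unregister_my_thread would be called.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the collector's per-thread bookkeeping.
   A real program would call GC_register_my_thread() and
   GC_unregister_my_thread() at these two points. */
static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;
static int registered_threads; /* threads currently registered */

static void register_current_thread(void) {
    pthread_mutex_lock(&registry_lock);
    registered_threads++;
    pthread_mutex_unlock(&registry_lock);
}

static void unregister_current_thread(void) {
    pthread_mutex_lock(&registry_lock);
    registered_threads--;
    pthread_mutex_unlock(&registry_lock);
}

/* What the trampoline needs to know about the caller's start routine. */
struct start_info {
    void *(*start)(void *);
    void *arg;
};

/* Runs first in the new thread, so registration happens near the
   bottom of that thread's call stack, before any user code. */
static void *trampoline(void *p) {
    struct start_info info = *(struct start_info *)p;
    free(p);
    register_current_thread();
    void *result = info.start(info.arg);
    unregister_current_thread();
    return result;
}

/* Same signature as pthread_create, but the new thread registers itself. */
int registering_pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                               void *(*start)(void *), void *arg) {
    struct start_info *info = malloc(sizeof *info);
    if (info == NULL)
        return EAGAIN;
    info->start = start;
    info->arg = arg;
    int rc = pthread_create(thread, attr, trampoline, info);
    if (rc != 0)
        free(info); /* the thread never started, so nobody else frees it */
    return rc;
}

/* Example start routine: reports how many threads are registered
   while it is running. */
static void *report_registered(void *arg) {
    int *seen = arg;
    pthread_mutex_lock(&registry_lock);
    *seen = registered_threads;
    pthread_mutex_unlock(&registry_lock);
    return NULL;
}

/* Returns the registration count observed inside the new thread. */
int demo_registered_count(void) {
    pthread_t t;
    int seen = -1;
    int rc = registering_pthread_create(&t, NULL, report_registered, &seen);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    return seen;
}
```

Calling demo_registered_count() returns 1, because the start routine already sees its own thread as registered. A thread that leaves through pthread_exit or cancellation would skip the unregister step in this sketch. The problem discussed here is that threads created inside GTK or GLib never go through such a wrapper at all.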

In the past, GLib (2.46) provided a way to redefine memory allocation using struct GMemVTable, which is now deprecated and can no longer be used (my Debian libglib2.0-dev package is version 2.50.3-2). There is a g_mem_gc_friendly global boolean, but judging from the GLib source code, it simply clears memory zones before freeing them.

Recent GTK3 versions (my libgtk-3-dev package is version 3.22.11-1) create threads (for something probably related to D-Bus, and perhaps also to GtkTextView...) by calling pthread_create indirectly through GLib thread functions. There is no way (short of patching the source code) to be notified of that thread creation. I'm afraid that any GTK callback I install (e.g. using g_signal_connect) might be called from these threads, or that if I subclass a GTK widget with methods that use (or access) some GC_malloc-ed buffer, there could be a disaster.

On the other hand, there is a strong coding rule in GTK that all GTK operations should happen only in the main thread. To quote the GDK 3 Threads page:

GTK+, however, is not thread safe. You should only use GTK+ and GDK from the thread gtk_init() and gtk_main() were called on. This is usually referred to as the “main thread”.

If I follow this rule myself, can I be sure that no internal GTK code will ever call my callbacks (using Boehm GC) from some non-main thread?

My intuition is that if GC_malloc is ever called from outside the main thread by GTK internals (not directly by my code), a disaster would happen: these GTK-internal threads have not been started with GC_pthread_create, yet they might call some of my code (e.g. because I am subclassing an existing GTK widget, or because I connected some GTK signal), even if I never use GTK and Boehm GC outside of the main thread myself.

The point is that Boehm's GC needs to scan every stack in every thread possibly using it.

FWIW, I reported a possible bug#780815 on GTK Bugzilla.

A typical example is gtk+-3.22.11/examples/application9/ from the GTK-3.22.11 tarball. pthread_create is called very indirectly by g_application_run via g_bus_get_sync.

Jonathan Leffler
Basile Starynkevitch
  • Since GTK+ is not thread safe in general (that is, the GTK+ functions are not thread safe, and assume the objects they refer to and access are accessed in a single thread only), the only way I can see that GTK+ callbacks can be safe, is if they are executed in the "main thread". Do you disagree? – Nominal Animal Apr 01 '17 at 08:48
  • Exactly how (could they run mutex-locked in another thread)? Because the GTK+ functions are not thread-safe, and there are no limitations on what objects a GTK+ signal handler can access (AFAIK), the thread running the GTK+ signal handler must be the "main thread"; otherwise such a handler is limited to accessing only the mutex-protected objects. I have not seen such limitations documented. So, what basis do you have to assume the *"might run mutex-locked"*? (Object access is not limited to GTK+ functions, after all; a global GTK+ mutex would only protect against GTK+ functions' access?) – Nominal Animal Apr 01 '17 at 15:04
  • I forgot the details. But I did run gdb on some GtkTextView code a few years ago, and was surprised by several threads. – Basile Starynkevitch Apr 01 '17 at 15:05
  • I don't doubt that. I just mean that if GTK+ used several threads to deliver GTK+ signals, it would have to use a global mutex that is taken whenever *any* application code is executed. Since that lock might stay taken for long periods, I don't see that approach working well. (I haven't looked at how GTK+ signal delivery is implemented.) – Nominal Animal Apr 01 '17 at 15:58
  • My question was pointless, anyway: there is no reason to assume the extra threads do not do memory management related stuff. In fact, that is probably exactly what they are intended to do: transfer data to/from GTK+ structures, asynchronously. Which means you do need these extra threads to be created via `GC_pthread_create()` anyway. Unhappy situation. – Nominal Animal Apr 01 '17 at 16:03
  • …well, I understand that what follows is a rather ugly hack, but: why not just [hook the required functions](https://github.com/kubo/plthook) and replace them with proper trampolines that call whatever Boehm needs? Thread creation is expensive anyway, so I doubt that'd affect performance. – hidefromkgb Jul 20 '17 at 16:24
  • @hidefromkgb: The project you linked to is a monstrosity of undefined behavior that one cannot rely on to work as intended. Of course Boehm GC is such a beast too... – R.. GitHub STOP HELPING ICE Dec 01 '20 at 20:40
  • Inkscape 1.2 seems to use gtk 3.24 and bdw gc, not sure if there was magic sauce in there? https://wiki.inkscape.org/wiki/Tracking_Dependencies – rogerdpack Nov 06 '21 at 19:46

2 Answers

Gtk does not call any signal handler from a non-main thread. Any worker thread you find in a Gtk application interacts with the main thread via a message queue. You can see that in the GLib source. For example, see the implementation of g_dbus_connection_signal_subscribe() and schedule_callbacks() in gdbusconnection.c. (A worker thread calls g_source_attach(..., subscriber->context), where the second argument is what g_main_context_ref_thread_default() returned.)
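
The hand-off described above can be illustrated without GLib. The following is a minimal sketch in plain pthreads, not GLib's implementation: a worker thread never invokes the callback itself but posts it to a one-slot mailbox, and the thread running the loop picks it up and calls it. All names here (mailbox, mailbox_post, mailbox_dispatch_one, on_event and so on) are hypothetical; in GLib the corresponding roles are played by GSource and GMainContext.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* One pending callback; a single-slot "queue" keeps the sketch short. */
struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t ready;
    void (*callback)(void *);
    void *data;
};

/* Worker side: hand the callback to the loop thread instead of calling it. */
static void mailbox_post(struct mailbox *mb, void (*cb)(void *), void *data) {
    pthread_mutex_lock(&mb->lock);
    mb->callback = cb;
    mb->data = data;
    pthread_cond_signal(&mb->ready);
    pthread_mutex_unlock(&mb->lock);
}

/* Loop side: wait for one callback and run it on the calling thread. */
static void mailbox_dispatch_one(struct mailbox *mb) {
    pthread_mutex_lock(&mb->lock);
    while (mb->callback == NULL)
        pthread_cond_wait(&mb->ready, &mb->lock);
    void (*cb)(void *) = mb->callback;
    void *data = mb->data;
    mb->callback = NULL;
    pthread_mutex_unlock(&mb->lock);
    cb(data); /* runs outside the lock, on the dispatching thread */
}

/* Records whether the callback ran on the thread that owns the loop. */
struct probe {
    pthread_t loop_thread;
    int ran_on_loop_thread;
};

static void on_event(void *data) {
    struct probe *p = data;
    p->ran_on_loop_thread = pthread_equal(pthread_self(), p->loop_thread) != 0;
}

struct worker_args {
    struct mailbox *mb;
    struct probe *probe;
};

/* The worker only posts; it never calls on_event directly. */
static void *worker(void *arg) {
    struct worker_args *wa = arg;
    mailbox_post(wa->mb, on_event, wa->probe);
    return NULL;
}

/* Returns 1 if the callback was executed by the loop (calling) thread. */
int demo_callback_runs_on_loop_thread(void) {
    struct mailbox mb = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER,
                          NULL, NULL };
    struct probe probe = { pthread_self(), 0 };
    struct worker_args wa = { &mb, &probe };
    pthread_t t;
    int rc = pthread_create(&t, NULL, worker, &wa);
    assert(rc == 0);
    mailbox_dispatch_one(&mb); /* stands in for one main-loop iteration */
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    return probe.ran_on_loop_thread;
}
```

Here demo_callback_runs_on_loop_thread() returns 1: the event originates in the worker, but the callback body executes on the thread that dispatches the mailbox.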

So you don't need to override the memory allocation routines with g_mem_set_vtable(). If you did that before, it was a very poor design: you replaced the perfect manual memory management in GLib/Gtk with an automatic but imperfect (and unstable) memory management scheme, Boehm GC.

relent95

Is recent GTK 3.22 still Boehm GC friendly (thread issue)?

TL;DR: not particularly, nor was it ever.


As far as I am aware, GTK was never particularly friendly to the Boehm GC. Recent versions definitely seem not to be.

All of the GTK shared libraries in my GTK2 and GTK3 installations are dynamically linked against libpthread, which tells me that although they may not all make direct calls to pthreads functions, they all at least depend on a library that does. There is therefore every reason to think that GTK or a closely associated library such as GLib will start internal threads under some circumstances. At the same time, none of the libraries are dynamically linked against libgc, so we can be pretty confident that internal threads started by GTK do not make any effort to register themselves with the GC.

I have no insight specifically into what any internal threads may do, but there is good reason to think that they sometimes will store and access pointers to objects provided by the client application, in memory that is not monitored by the GC. This opens a door for premature collection of such objects if they were allocated via GC_malloc(). (So now-deprecated support for injecting custom allocators was never sufficient to make a GTK application GC-safe.) That's essentially the same as your observation:

The point is that Boehm's GC needs to scan every stack in every thread possibly using it.

That's also consistent with the GC docs:

It is usually best not to mix garbage-collected allocation with the system malloc-free. If you do, you need to be careful not to store pointers to the garbage-collected heap in memory allocated with the system malloc.

Moreover, that means that this question is moot:

If I [avoid calling gtk and gdk functions other than from the main thread], I am sure that no internal GTK code will ever call my callbacks (using Boehm GC) from some non-main thread?

Even if we assume (reasonably) that internal threads never call your application callbacks, that does not imply that it is safe for the GC to ignore their stacks, thread-local storage, or per-thread memory-allocation arenas.

But to answer the question, it is part of GTK's contract with the programmer that callbacks may call GTK functions. Since that's supposed to be done only in the main thread, I expect that callbacks will be called only in the main thread. I don't find that officially documented, but it is asserted also in the comments thread of the issue you raised against GTK.

My intuition is that if ever GC_malloc is called from outside the main thread by GTK internals (not directly by my code) a disaster would happen because these GTK-internal threads have not been started with GC_pthread_create [...].

That's plausible, but I think it's safe to assume that the situation will not arise from GTK calling your callbacks. As already noted, however, that's not sufficient to make GTK GC-friendly.

On some systems, you could probably force GTK to use GC_pthread_create, GC_malloc, GC_realloc, and GC_free in place of their standard counterparts by suitable dynamic linker tactics. Some systems may also provide special facilities for substituting the allocation functions at runtime. I think these approaches could make it safe to use GC in conjunction with Gtk / GLib, though those components will get no benefit from it. But I wouldn't call that "friendly" in either direction.

John Bollinger