
I'm making a simple plugin framework in which I'd like to be able to dlopen() a shared library (i.e. a plugin), inspect and use whatever factory functions it provides, and eventually dlclose() it, leaving no trace.

My factory system is trivial, with a single exported function that returns a pointer to a common Base class. To check that the plugin has been unloaded properly, the plugin has a static object whose destructor sets a bool owned by the main program.

Here's the main program:

// dltest.cpp follows. Compile with g++ -std=c++0x dltest.cpp -o dltest -ldl
#include <dlfcn.h>
#include <iostream>
using namespace std;
int main(int argc, char** argv)
{
    if (argc > 1)
    {
        void* h = dlopen(argv[1], RTLD_NOW|RTLD_LOCAL);
        if (!h)
        {
            cerr << "ERROR: " << dlerror() << endl;
            return 1;
        }
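        bool isFinilized = false;
        // Look up the plugin's global pointer and aim it at our flag.
        void* sym = dlsym(h, "g_finilized");
        if (!sym)
        {
            cerr << "ERROR: " << dlerror() << endl;
            return 3;
        }
        *(bool**)sym = &isFinilized;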
        cout << boolalpha << isFinilized << endl;
        if (dlclose(h))
        {
            cerr << "ERROR: " << dlerror() << endl;
            return 2;
        }
        cout << boolalpha << isFinilized << endl;
    }
    return 0;
}

And the plugin's code is:

// libempty.cpp follows. Compile with g++ -std=c++0x libempty.cpp -o libempty.so -fPIC -shared
#include <iostream>
#include <vector>
using namespace std;
bool* g_finilized = nullptr;
struct Finilizer
{
    ~Finilizer()
    {
        cout << "~Finilizer()" << endl;
        if (g_finilized) *g_finilized = true;
    }
} g_finilizer;
class Base
{
public:
    virtual void init() = 0;
};
class Foo: public Base
{
    virtual void init()
    {
        static const vector<float> ns = { 0.f, 0.75f, 0.67f, 0.87f };
    }
};
extern "C" __attribute__ ((visibility ("default"))) Base* newBase() { return new Foo; }

If executed, the output is:

false
false
~Finilizer()

This shows the call to dlclose() doesn't work as expected and the library was not unloaded until the program's exit.
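
One way to confirm this independently of the destructor flag is to ask the dynamic loader whether the library is still mapped. The sketch below assumes glibc, where RTLD_NOLOAD makes dlopen() return a handle only if the object is already loaded. The helper name isStillLoaded is made up for illustration.

```cpp
#include <dlfcn.h>

// Hypothetical helper: returns true if the shared object at `path` is
// currently loaded in this process. RTLD_NOLOAD (a glibc extension) tells
// dlopen() not to load anything new, so a null result means "not loaded".
bool isStillLoaded(const char* path)
{
    void* probe = dlopen(path, RTLD_NOLOAD | RTLD_LAZY);
    if (!probe)
        return false;
    dlclose(probe); // drop the extra reference the probe just took
    return true;
}
```

Calling isStillLoaded(argv[1]) right after the dlclose(h) in dltest.cpp would be expected to return true for this version of the plugin, matching the late ~Finilizer() output.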

However, if we move the vector to outside the function, so the last 8 lines read:

class Foo: public Base
{
    virtual void init()
    {
    }
};
static const vector<float> ns = { 0.f, 0.75f, 0.67f, 0.87f };
extern "C" __attribute__ ((visibility ("default"))) Base* newBase() { return new Foo; }

Then dlclose() works properly and the output is:

false
~Finilizer()
true

The same results are generated if the vector is left in the function but no factory is exported:

class Foo: public Base
{
    virtual void init()
    {
        static const vector<float> ns = { 0.f, 0.75f, 0.67f, 0.87f };
    }
};
//extern "C" __attribute__ ((visibility ("default"))) Base* newBase() { return new Foo; }

Positive results are found if the vector is substituted with a C array:

class Foo: public Base
{
    virtual void init()
    {
        static const float ns[] = { 0.f, 0.75f, 0.67f, 0.87f };
    }
};
extern "C" __attribute__ ((visibility ("default"))) Base* newBase() { return new Foo; }

Is this a bug in GCC/Linux? Is there any workaround so that complex objects can be declared static inside a member function of a factory-created class?

ildjarn
gavwould
  • Where did you get the idea that one behavior is "correct" and the other one isn't? I don't think POSIX.1-2001 makes any guarantees one way or the other. – Nikolai Fetissov Jun 15 '12 at 16:23
  • The assembler code generated in both cases differs significantly. When `ns` is defined as a static local variable a lot of additional handlers are registered with `__cxa_atexit()` and tracing `ld-linux.so` behaviour shows that additional symbols are being resolved. Unfortunately Intel C++ does not support initialiser lists and it's hard to test if this behaviour is specific to GCC only. – Hristo Iliev Jun 15 '12 at 16:48
  • @NikolaiNFetissov: According to the man page: "The function dlclose() decrements the reference count on the dynamic library handle handle. If the reference count drops to zero and no other loaded libraries use symbols in it, then the dynamic library is unloaded." I see no obvious reason why it should be unloaded in some test cases and not in others. – gavwould Jun 15 '12 at 16:49
  • @HristoIliev: Note the initializer list isn't (or at least shouldn't be) required for the fail-case - a simple constructor (or perhaps even the default constructor) is probably enough to cause it to fail. – gavwould Jun 15 '12 at 16:52
  • OK, removed the initialisation and recompiled the module library with Intel C++ Compiler (`icpc`) - with `g++` behaviour is still different in each case while with `icpc` the library gets unloaded at `dlclose()` time in both cases. So yes, the behaviour is specific to GCC but I cannot assert that this is a bug. – Hristo Iliev Jun 15 '12 at 16:56
  • @HristoIliev: Note that according to the Open Group's standard, dlclose returns a non-zero when it can't properly close the library and zero when it can. Since it isn't being closed properly, then surely (if nothing else) the zero return is erroneous? – gavwould Jun 15 '12 at 17:04
  • The manual basically tells you not to rely on `dlclose(3)` to invoke any specific cleanup functions. If you want to be sure - do it yourself. – Nikolai Fetissov Jun 15 '12 at 18:11
  • @NikolaiNFetissov: Could you give a specific, definitive and authoritative reference about the unreliability of dlclose()? I find none on my system's man page. (Running Ubuntu 12.05.) – gavwould Jun 15 '12 at 22:53
  • That's my point - there's none. Same as there is no requirement for it to work as you expect. – Nikolai Fetissov Jun 16 '12 at 03:17
  • @NikolaiNFetissov: The man page is quite clear and unequivocal over what dlclose does (see my previous comment for the quote). Unreliable behaviour, if to be expected, should surely be documented. – gavwould Jun 16 '12 at 07:54
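
The "do it yourself" suggestion from the comments could look roughly like the sketch below: the plugin exports explicit init and shutdown entry points, and the host calls the shutdown function before dlclose() instead of relying on static destructors. The names pluginInit, pluginShutdown and pluginIsInitialized are hypothetical and not part of the original code.

```cpp
#include <memory>
#include <vector>

namespace
{
    // Plugin state held at namespace scope and released explicitly,
    // rather than in a function-local static with a destructor.
    std::unique_ptr<std::vector<float>> g_ns;
}

// Hypothetical entry point: build the plugin's state on request.
extern "C" void pluginInit()
{
    if (!g_ns)
        g_ns.reset(new std::vector<float>{ 0.f, 0.75f, 0.67f, 0.87f });
}

// Hypothetical entry point: the host calls this just before dlclose().
extern "C" void pluginShutdown()
{
    g_ns.reset();
}

// Hypothetical query so the host can verify that cleanup happened.
extern "C" int pluginIsInitialized()
{
    return g_ns ? 1 : 0;
}
```

With this layout the cleanup no longer depends on when, or whether, the loader actually unmaps the library.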

1 Answer


What's happening is that there is a STB_GNU_UNIQUE symbol in libempty.so:

readelf -Ws libempty.so | grep _ZGVZN3Foo4initEvE2ns
 91: 0000000000203e80     8 OBJECT  UNIQUE DEFAULT   25 _ZGVZN3Foo4initEvE2ns
 77: 0000000000203e80     8 OBJECT  UNIQUE DEFAULT   25 _ZGVZN3Foo4initEvE2ns

The problem is that STB_GNU_UNIQUE symbols behave quite unintuitively and persist across dlopen/dlclose calls.

The use of that symbol forces glibc to mark your library as non-unloadable.

There are other surprises with GNU_UNIQUE symbols as well. If you use a sufficiently recent gold linker, you can disable GNU_UNIQUE with the --no-gnu-unique flag.
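
A source-level workaround is also possible. Since these symbols come from static locals in inline functions (a member function defined inside its class body is implicitly inline), defining Foo::init() out of line should keep `ns` and its guard variable as ordinary local symbols. The following is a minimal sketch of the plugin under that assumption; the virtual destructor on Base is an addition for safe deletion and is not in the original code.

```cpp
#include <vector>

class Base
{
public:
    virtual ~Base() {} // added here so the host can delete through Base*
    virtual void init() = 0;
};

class Foo : public Base
{
    virtual void init(); // declared only; the definition below is not inline
};

// Out-of-line definition: the function-local static now belongs to a
// non-inline function, so it does not need vague (unique) linkage.
void Foo::init()
{
    static const std::vector<float> ns = { 0.f, 0.75f, 0.67f, 0.87f };
    (void)ns; // silence unused-variable warnings
}

extern "C" __attribute__ ((visibility ("default"))) Base* newBase() { return new Foo; }
```

Re-running the readelf command above on the rebuilt library is a way to check that no UNIQUE entries remain. GCC also provides -fno-gnu-unique, which is worth trying if the source cannot be restructured.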

Employed Russian
  • What is the purpose of such a symbol and how does it occur? Do you have to put something special in your source to produce it? – Joseph Garvin Jul 21 '20 at 21:09
  • @JosephGarvin The thread referenced under "other surprises" above has this pointer in it: https://sourceware.org/legacy-ml/libc-alpha/2011-10/msg00072.html. It's about template static data members and inline function local statics. – Employed Russian Jul 21 '20 at 22:33