
I'm working on a cross-platform project that consists of several libraries which dynamically load and unload one another depending on run-time conditions. I currently observe a crash that seems to be caused by static objects in one of the shared libraries being destroyed before that library is unloaded with dlclose(). This seems pretty strange, and looks like a bug to me.

To investigate the problem I've created a simple project consisting of three source files: main.cpp, lib1.cpp and lib2.cpp (for the executable and the two libraries respectively). The main executable dynamically loads lib1, and lib1 in turn dynamically loads lib2.

main.cpp:

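#include <exception>
#include <iostream>
// Logger and libutil::AutoLib come from the project's own headers (not shown).

Logger mainGlobal("mainGlobal");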

int main(int argc, char * argv[])
{
    Logger mainFunction("mainFunction");
    try
    {
        Logger mainTry("mainTry");
        libutil::AutoLib lib("lib1");
        lib.call("loadLib2");
    }
    catch (std::exception & e)
    {
        std::cerr << "Fatal: " << e.what() << std::endl;
    }
    std::cout << "Exiting main" << std::endl;
}

lib1.cpp:

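#include <iostream>
#include <memory>
// Logger, libutil::AutoLib and DLL_EXPORT come from the project's own headers (not shown).

Logger lib1Global("lib1Global");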

std::auto_ptr<libutil::AutoLib> lib2;

DLL_EXPORT void loadLib2()
{
    std::cout << "loadLib2" << std::endl;
    lib2.reset(new libutil::AutoLib("lib2"));
}

lib2.cpp:

Logger lib2Global("lib2Global");

Logger is a simple struct that just logs in its constructor and destructor. libutil::AutoLib is a shared-library loader that calls dlopen(path, RTLD_LAZY) in its constructor and dlclose() in its destructor, and lets the caller invoke functions exported from the library. The code for those classes is trivial, but I can post it here as well if needed.

Long story short, when I run the main executable, I see the following log:

mainGlobal ctor
mainFunction ctor
mainTry ctor
Loading library lib1.so
lib1Global ctor
dlopen(lib1.so) returned 0x14cd050
Library lib1.so loaded with handle 0x14cd050
Calling loadLib2 in library 0x14cd050
loadLib2
Loading library lib2.so
lib2Global ctor
dlopen(lib2.so) returned 0x14cd710
Library lib2.so loaded with handle 0x14cd710
Unloading library 0x14cd050
Calling dlclose(0x14cd050)
Library unloaded 0x14cd050
mainTry dtor
Exiting main
mainFunction dtor
lib2Global dtor
Unloading library 0x14cd710
Calling dlclose(0x14cd710)
Library unloaded 0x14cd710
lib1Global dtor
mainGlobal dtor

Please note that the lib2Global dtor line comes before the Calling dlclose(0x14cd710) line.

So the question is, is it a bug or correct behavior?

There are questions here on SO about static objects not being destroyed after dlclose(), but I did not find any questions about the opposite situation.

I'm using GCC 5.4.0-6ubuntu1~16.04.10.

Alex Che
  • Have you used `-fuse-cxa-atexit` compiler flag? Does this answer your question: https://stackoverflow.com/questions/42912038/what-is-the-difference-between-cxa-atexit-and-atexit – Igor G Jun 28 '19 at 17:51
  • @IgorG thanks for the suggestion. I've tried to add `-fuse-cxa-atexit` with no visible difference. But after I changed it to the opposite `-fno-use-cxa-atexit` then the order of destruction changed at the end: `mainFunction dtor, mainGlobal dtor, Unloading library 0x95a710, Calling dlclose(0x95a710), Library unloaded 0x95a710, lib1Global dtor, lib2Global dtor`. So, seems like now statics of shared objects are destructed at the very end. – Alex Che Jun 28 '19 at 20:33
  • Since it’s legitimate to call `exit` from the middle of execution, it doesn’t assume that you will `dlclose` everything yourself and does so for you. But that means you can’t call it yourself from something that might run during `exit`. What I don’t understand is why `lib1.so` remains loaded after you do `dlclose` it; there may be an implicit back reference from having called `dlopen` in it. – Davis Herring Jun 29 '19 at 06:25

1 Answer

Thanks to Davis Herring for the hint; I found the reason. Something was holding lib1.so in memory and preventing it from being unloaded. As it turned out, lib1.so called an inline function that contained a static const variable, and this made GCC create an STB_GNU_UNIQUE binding for that variable. This in turn effectively made lib1.so impossible to unload, even though it was loaded with RTLD_LOCAL. So, to fix the issue, I could either remove the static qualifier from the variable definition, remove the inline qualifier from the function definition, or use the -fno-gnu-unique g++ flag. After I did this, the issue was gone:

mainGlobal ctor
mainFunction ctor
mainTry ctor
Loading library lib1.so
lib1Global ctor
dlopen(lib1.so) returned 0x1cfe050
Library lib1.so loaded with handle 0x1cfe050
Calling loadLib2 in library 0x1cfe050
loadLib2
Loading library lib2.so
lib2Global ctor
dlopen(lib2.so) returned 0x1cfe710
Library lib2.so loaded with handle 0x1cfe710
Unloading library 0x1cfe050
Calling dlclose(0x1cfe050)
Unloading library 0x1cfe710
Calling dlclose(0x1cfe710)
Library unloaded 0x1cfe710
lib1Global dtor
lib2Global dtor
Library unloaded 0x1cfe050
mainTry dtor
Exiting main
mainFunction dtor
mainGlobal dtor
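As a side note, one way to check whether dlclose() actually unloaded a library, rather than inferring it from destructor order, is to probe with RTLD_NOLOAD. This is a rough sketch, assuming glibc; the helper name isLibraryLoaded is made up for illustration, and the path should be the same one that was originally passed to dlopen().

```cpp
#include <dlfcn.h>

#include <string>

// Returns true if the shared object at `path` is currently loaded in this process.
// With RTLD_NOLOAD, dlopen() never loads anything: it returns a handle only if the
// library is already resident, and NULL otherwise.
inline bool isLibraryLoaded(const std::string& path) {
    void* handle = dlopen(path.c_str(), RTLD_LAZY | RTLD_NOLOAD);
    if (!handle) {
        return false;
    }
    dlclose(handle);  // drop the reference this successful probe just took
    return true;
}
```

Calling such a helper for lib1.so right after its dlclose() would be expected to return true in the first log (the library stayed resident) and false in the log above.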

Here is the relevant excerpt from the GCC documentation:

-fno-gnu-unique
    On systems with recent GNU assembler and C library, the C++ compiler
    uses the STB_GNU_UNIQUE binding to make sure that definitions of template
    static data members and static local variables in inline functions are unique
    even in the presence of RTLD_LOCAL; this is necessary to avoid problems
    with a library used by two different RTLD_LOCAL plugins depending on a definition
    in one of them and therefore disagreeing with the other one about the binding of
    the symbol. But this causes dlclose to be ignored for affected DSOs; if your
    program relies on reinitialization of a DSO via dlclose and dlopen, you can use
    -fno-gnu-unique.
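The kind of code that triggers this can be sketched as follows. The function names and the std::string type are hypothetical; the real function in lib1.so is not shown here.

```cpp
#include <string>

// Problematic pattern: a static local variable inside an inline function.
// With GCC on a GNU toolchain (and without -fno-gnu-unique), the variable gets
// STB_GNU_UNIQUE binding, and per the GCC documentation dlclose() is then
// ignored for a shared library that contains it.
inline const std::string& defaultName() {
    static const std::string name = "default";
    return name;
}

// One of the fixes mentioned above: drop the static qualifier, so the variable
// is an ordinary local and no unique symbol is emitted for it.
inline std::string defaultNameNoStatic() {
    const std::string name = "default";
    return name;
}
```

To see whether a built library is affected, list its symbols with nm (unique global symbols are shown with type u) or with readelf -sW (they appear with UNIQUE in the Bind column).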

Here is the question with a related issue.

Alex Che