In my Linux application I am using a plugin architecture via dlopen. Shared objects are being opened with

dlopen(path, RTLD_GLOBAL | RTLD_LAZY)

The option RTLD_GLOBAL is necessary since plugins need to access common RTTI information. Occasionally, some plugins export the same symbols. This should not normally happen, but when it does, it results in random segfaults that are difficult to debug. So I would like to detect duplicate symbols at dlopen and warn about them.

Is there a way to do this?

Here is a simple example to illustrate this. The code of the main executable is

#include <string>
#include <dlfcn.h>
#include <iostream>
#include <cassert>

typedef void (*Function)();

void open(const std::string& soname)
{
    void* so = dlopen(soname.c_str(), RTLD_LAZY | RTLD_GLOBAL);
    if (!so) {
        std::cout << dlerror() << std::endl;
    } else {
        Function function = reinterpret_cast<Function>(dlsym(so, "f"));
        assert(function);
        function();
    }
}

int main()
{
    open("./a.so");
    open("./b.so");
    return 0;
}

It is built with the command g++ main.cpp -o main -ldl

a.so and b.so are being built from

#include <iostream>

void g()
{
     std::cout << "a.cpp" << std::endl;
}

extern "C" {
    void f()
    {
        g();
    }
}

and

#include <iostream>

void g()
{
     std::cout << "b.cpp" << std::endl;
}

extern "C" {
    void f()
    {
        g();
    }
}

by the commands g++ -fPIC a.cpp -shared -o a.so and g++ -fPIC b.cpp -shared -o b.so respectively. Now if I execute ./main I get

a.cpp
a.cpp

With RTLD_LOCAL I get

a.cpp
b.cpp

but as I have explained, I don't want RTLD_LOCAL.

  • Have you checked if maybe `RTLD_DEEPBIND` is an alternative solution to your problem? – PlasmaHH Jan 24 '12 at 10:17
  • @PlasmaHH That will use a function from another library (or some other function). I don't think there is a solution to this, other than not to do it – BЈовић Jan 24 '12 at 10:37
  • @VJovic: It seems to do the contrary, according to my manpage: " Place the lookup scope of the symbols in this library ahead of the global scope. This means that a self-contained library will use its own symbols in preference to global symbols with the same name contained in libraries that have already been loaded." – PlasmaHH Jan 24 '12 at 11:13
  • @PlasmaHH Sorry, somehow I misread that description. You are right - might help – BЈовић Jan 24 '12 at 11:24
  • With RTLD_DEEPBIND my above program segfaults at the first call of `function()`. I still need to figure out why. – Levon Haykazyan Jan 24 '12 at 12:14
  • may be related to this bug http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42679 – Levon Haykazyan Jan 24 '12 at 12:23
  • Can you be more explicit why you cannot use RTLD_LOCAL? You *really* want to use it if possible -- for more reasons than just solving this problem. – mcmcc Jan 25 '12 at 19:26
  • RTLD_GLOBAL is needed in order for `dynamic_cast` and other RTTI stuff to function correctly. For example, suppose you have `class A` and `class B : public A`, and an object is created in one shared object, say `A* a = new B()`. Then `dynamic_cast<B*>(a)` will fail in another shared object. This question http://stackoverflow.com/questions/2351786/dynamic-cast-fails-when-used-with-dlopen-dlsym gives more details. And here is the answer http://gcc.gnu.org/faq.html#dso – Levon Haykazyan Jan 25 '12 at 22:40

1 Answer

I would like to detect duplicate symbols at dlopen and warn about them.

I don't think dlopen can do that.

Even if it could, detecting that problem at runtime is probably too late. You should be detecting that problem at build time, and it is trivial to do so as a post-build step:

nm -D your_plugin_dir/*.so | egrep ' [TD] ' | cut -d ' ' -f3 |
  sort | uniq -c | grep -v ' 1 '

If you get any output, you have duplicate symbols (some duplicate symbols may actually be ok; you'll have to filter out "known good" duplicates).
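
The "known good" filtering could also be done in a small C++ tool rather than in the shell pipeline. The following is only a sketch under stated assumptions: it assumes the exported symbol names of each plugin have already been collected (for example from the nm -D output above), and the function name `find_duplicate_symbols` and the `known_good` allowlist parameter are hypothetical names introduced here for illustration.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of any library): given the exported symbols
// of each shared object, return every symbol that is defined by more than
// one shared object and is not on the allowlist of accepted duplicates.
std::map<std::string, std::set<std::string>> find_duplicate_symbols(
    const std::map<std::string, std::vector<std::string>>& exports_by_library,
    const std::set<std::string>& known_good = std::set<std::string>())
{
    // Invert the input: symbol name -> set of libraries that define it.
    std::map<std::string, std::set<std::string>> owners;
    for (const auto& entry : exports_by_library) {
        for (const auto& symbol : entry.second) {
            owners[symbol].insert(entry.first);
        }
    }

    // Keep only symbols with several owners that are not allowlisted.
    std::map<std::string, std::set<std::string>> duplicates;
    for (const auto& entry : owners) {
        if (entry.second.size() > 1 && known_good.count(entry.first) == 0) {
            duplicates.insert(entry);
        }
    }
    return duplicates;
}
```

For the example in the question, a.so and b.so both export `f` and `_Z1gv` (the mangled name of `void g()`). Passing `{"f"}` as the allowlist, since every plugin is expected to export the entry point `f`, would leave `_Z1gv` as the only reported duplicate, which is the clash that causes a.cpp to be printed twice.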

Employed Russian
  • Actually, plugins may load other shared objects, and duplicate symbols may occur in those. The structure is pretty messed up and I don't want to check all shared objects for duplicate symbols, as some are "allowed" to have duplicate symbols. Thanks for the tip anyway. – Levon Haykazyan Jan 30 '12 at 11:20