I have trouble understanding what exactly happens when a dynamic library is loaded at runtime, and how the dynamic linker recognizes and treats "same symbols".
I've read other questions related to symbol linking and followed all the typical recommendations (using extern "C", using -fPIC when building the library, etc.). To my knowledge, my specific problem has not been discussed so far. The paper "How to write shared libraries" (https://www.akkadia.org/drepper/dsohowto.pdf) does discuss the process of resolving library symbol dependencies, which may explain what's happening in my example below, but alas, it does not offer a workaround.
I found a post whose last (unfortunately unanswered) comment describes very much the same problem as mine:
Is there symbol conflict when loading two shared libraries with a same symbol
The only difference is that in my case the symbol is an auto-generated constructor.
Here's the setup (Linux):
- program "master" uses a library class "Dummy" declared with 4 member variables, dynamically loads a shared library via dlopen(), and resolves two simple functions with dlsym()
- the shared library "slave" also uses the library with the class "Dummy", but in a newer version with 5 member variables (an extra string)
- when the shared library's function is called from master, accessing the newly added string member in class Dummy segfaults - apparently the string wasn't initialized correctly
My assumption is: the constructor of class Dummy exists already in memory since master uses this function itself, and when loading the shared library it does not load its own version of the constructor, but simply re-uses the existing version from master. By doing that the extra string variable is not initialized correctly in the constructor, and accessing it segfaults.
Stepping through the assembly while the Dummy variable d is initialized in the slave confirms this: the Dummy constructor located in the master's memory space is the one being called.
Questions:
How does the dynamic linker (dlopen()?) recognize that the class Dummy used to compile the master should be the same as the Dummy compiled into Slave, despite it being provided in the library itself? Why does the symbol lookup take the master's variant of the constructor, even though the symbol table must also contain the constructor symbol imported from the library?
Is there a way, for example by passing some suitable options to dlopen() or dlsym(), to enforce usage of the Slave's own Dummy constructor instead of the one from Master (i.e. tweak the symbol lookup/relocation behavior)?
Code: the full minimal source code example can be found here:
https://bauklimatik-dresden.de/privat/nicolai/tmp/master-slave-test.tar.bz2
Relevant shared lib loading code in Master:
#include <iostream>
#include <dlfcn.h> // shared library loading on Unix systems
#include "Dummy.h"

// signatures of the two functions exported by libSlave.so
int create(void * &data);
typedef int F_create(void * &data);
int destroy(void * data);
typedef int F_destroy(void * data);

int main() {
    // use dummy class at least once in program to create constructor
    Dummy d;
    d.m_c = "Test";

    // now load dynamic library
    void *soHandle = dlopen( "libSlave.so", RTLD_LAZY );
    std::cout << "Library handle 'libSlave.so': " << soHandle << std::endl;
    if (soHandle == nullptr)
        return 1;

    // now load constructor and destructor functions
    F_create * createFn = reinterpret_cast<F_create*>(dlsym( soHandle, "create" ) );
    F_destroy * destroyFn = reinterpret_cast<F_destroy*>(dlsym( soHandle, "destroy" ) );
    if (createFn == nullptr || destroyFn == nullptr) {
        // bail out instead of calling through a null function pointer
        std::cerr << "Could not resolve 'create' and/or 'destroy'" << std::endl;
        dlclose(soHandle);
        return 1;
    }

    void * data = nullptr;
    createFn(data);
    destroyFn(data);

    dlclose(soHandle);
    return 0;
}
Class Dummy: the variant without EXTRA_STRING is used in Master, the variant with the extra string is used in Slave
#ifndef DUMMY_H
#define DUMMY_H

#include <string>

#define EXTRA_STRING

class Dummy {
public:
    double      m_a;
    int         m_b;
    std::string m_c;
#ifdef EXTRA_STRING
    std::string m_c2;
#endif // EXTRA_STRING
    double      m_d;
};

#endif // DUMMY_H
Note: if I use exactly the same class Dummy in both Master and Slave, the code works (as expected).