6

I have a problem with incorrect symbol resolution. My main program loads a shared library with dlopen and looks up symbols in it with dlsym. Both the program and the library are written in C. The library code is:

int a(int b)
{
  return b+1;
}

int c(int d)
{
  return a(d)+1;
}

To make it work on a 64-bit machine, the library is compiled with gcc -fPIC.

The program is:

#include <dlfcn.h>
#include <stdio.h>

int (*a)(int b);
int (*c)(int d);

int main()
{
  void* lib=dlopen("./libtest.so",RTLD_LAZY);
  a=dlsym(lib,"a");
  c=dlsym(lib,"c");
  int d = c(6);
  int b = a(5);
  printf("b is %d d is %d\n",b,d);
  return 0;
}

Everything runs fine if the program is NOT compiled with -fPIC, but it crashes with a segmentation fault when the program is compiled with -fPIC. Investigation showed that the crash is caused by symbol a being resolved incorrectly. It occurs whenever a is called, whether from the library or from the main program (the latter can be seen by commenting out the call to c() in the main program).

No problems occur when calling c() itself, probably because c() is not called internally by the library, while a() is both used internally by the library and part of its API.

A simple workaround is not to use -fPIC when compiling the program. But this is not always possible, for example when the code of the main program has to be in a shared library itself. Another workaround is to rename the function pointer a to something else. But I cannot find a real solution.

Replacing RTLD_LAZY with RTLD_NOW does not help.

user377486
  • 1
    Please show us the compile lines you used, as well as your compiler version. – robert May 26 '12 at 10:09
  • 3
    I suggest not giving a global function pointer the same name as the `dlsym`-ed function it points to. Or just make your function pointers local or static variables, or data fields. – Basile Starynkevitch May 26 '12 at 10:17
  • 1
    Thinking more about it, it seems that, since it was not otherwise specified, the main program also exports symbols a and c externally. So symbol a is doubly defined (by the main program and by the shared object) and the dynamic linker finds the wrong one. Using the gcc-specific __attribute__ ((visibility ("hidden"))) in the main program is maybe the right thing to do... any advice? – user377486 May 26 '12 at 10:18
  • gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5). Object files use the default Makefile rule: `$(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c $<`. To link the library: `gcc -shared -o $@ $^`. To link the executable: `gcc -o $@ $^ -g -ldl`. Either `CFLAGS=-g` or `CFLAGS='-g -fPIC'` is added to the command line. – user377486 May 26 '12 at 10:20
  • 1
    @user377486: The best advice is not to use the same names to begin with. It's undefined behavior. If you want the names to seem the same, you could do `int (*a_ptr)(int b);` and `#define a a_ptr` (but this seems really ugly for a name like `a`...), or you could just make it so the function pointer doesn't have external linkage. – R.. GitHub STOP HELPING ICE May 26 '12 at 12:29

3 Answers

4

I suspect that there is a clash between two global symbols: the function pointer a defined in the main program and the function a defined in the library. One solution is to declare a in the main program as static. Alternatively, the Linux man page for dlopen mentions the RTLD_DEEPBIND flag, a Linux-only extension, which you can pass to dlopen and which causes the library to prefer its own symbols over global symbols.
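Both suggestions can be combined in a minimal sketch. The helper name load_lib is hypothetical (it is not in the question), and RTLD_DEEPBIND is only added when the C library defines it:

```c
#define _GNU_SOURCE /* RTLD_DEEPBIND is a glibc extension */
#include <dlfcn.h>
#include <stdio.h>

/* static gives the pointers internal linkage, so the executable no longer
   defines global symbols named a and c that could clash with the library's
   functions of the same name. */
static int (*a)(int b);
static int (*c)(int d);

/* Hypothetical helper: opens the library at path and resolves a and c.
   Returns 0 on success, -1 on failure. */
int load_lib(const char *path)
{
  int flags = RTLD_LAZY;
#ifdef RTLD_DEEPBIND
  /* Ask the library to prefer its own symbols over global ones. */
  flags |= RTLD_DEEPBIND;
#endif

  void *lib = dlopen(path, flags);
  if (lib == NULL) {
    fprintf(stderr, "dlopen failed: %s\n", dlerror());
    return -1;
  }

  /* Store through a void ** so the object pointer returned by dlsym is
     never converted directly to a function pointer. */
  *(void **)(&a) = dlsym(lib, "a");
  *(void **)(&c) = dlsym(lib, "c");
  if (a == NULL || c == NULL) {
    fprintf(stderr, "could not resolve a or c in %s\n", path);
    dlclose(lib);
    return -1;
  }
  return 0;
}
```

After load_lib("./libtest.so") returns 0, main can call c(6) and a(5) as in the question. Either measure alone may be enough; the sketch shows both only for illustration.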

zvrba
  • 2
    Both methods work. The keyword static is not so general, because it limits the visibility to the same compilation unit, but __attribute__ ((visibility ("hidden"))) can be used if it is necessary that the function pointers are not static. – user377486 May 26 '12 at 11:13
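The visibility attribute mentioned in the comment above could be applied roughly as follows. This is a sketch that assumes GCC or a compatible compiler; the helper resolve_from is a hypothetical name used only for illustration:

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hidden visibility (a GCC extension): the pointers keep external linkage,
   so other compilation units of the same program can still reach them with
   an extern declaration, but they are not exported as dynamic symbols. */
__attribute__((visibility("hidden"))) int (*a)(int b);
__attribute__((visibility("hidden"))) int (*c)(int d);

/* Hypothetical helper: fills both pointers from an already opened library
   handle. Returns 0 on success, -1 if the handle or a symbol is missing. */
int resolve_from(void *lib)
{
  if (lib == NULL)
    return -1;

  /* Store through a void ** so the object pointer returned by dlsym is
     never converted directly to a function pointer. */
  *(void **)(&a) = dlsym(lib, "a");
  *(void **)(&c) = dlsym(lib, "c");
  return (a != NULL && c != NULL) ? 0 : -1;
}
```

Compared with static, this keeps the pointers usable across several source files of the main program, which is the case the comment describes.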
0

It seems this issue can also occur in one more case, as it did for me. I have a program and a couple of dynamically linked libraries. When I added one more library, I used a function in it from a static library (also mine) and forgot to add that static library to the link list. The linker did not warn me about this, but the program crashed with a segmentation fault.

Maybe this will help someone.

Serge Roussak
0

FWIW, I ran into a similar problem when compiling as C++ and forgetting about name mangling. A solution there is to use extern "C".
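One common way to apply extern "C" while keeping the sources valid C is a header guard like the one sketched here. The file names libtest.h and libtest.c are assumptions for illustration; the functions are the ones from the question:

```c
/* libtest.h (hypothetical header for the library in the question) */
#ifdef __cplusplus
extern "C" { /* seen only by a C++ compiler: keeps a and c unmangled */
#endif

int a(int b);
int c(int d);

#ifdef __cplusplus
}
#endif

/* libtest.c (definitions as in the question) */
int a(int b)
{
  return b + 1;
}

int c(int d)
{
  return a(d) + 1;
}
```

With the guard in place, dlsym(lib, "a") looks up the plain name a whether the library was built as C or as C++.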

Błażej Czapp