
What if I want these externals to be resolved at runtime with dlopen?

I'm trying to understand why including an .h file that declares a shared library's external variables and functions in a C executable program results in undefined/unresolved symbol errors when linking.

Why do I have to add the "-lsomelib" flag to the gcc link command if I only want these symbols to be resolved at runtime?

What does the link-time linker need these definitions for? Why can't it wait for the resolution at runtime when using dlopen?

Can anyone help me understand this?

Yair Karmy
  • Read [Drepper's paper: how to write shared libraries](http://www.akkadia.org/drepper/dsohowto.pdf) – Basile Starynkevitch Apr 01 '14 at 20:30
  • My problem is less about shared libs and more about resolving externals from an .h file. Why do I always have to provide symbol definitions to gcc (using "-lthelib") if sometimes I only want to use dynamic linking, and the symbols are resolved only at runtime anyway? – Yair Karmy Apr 01 '14 at 20:48
  • I began reading this 47-page paper about shared libs. It is interesting and probably answers many of my questions about the process, but meanwhile... :) – Yair Karmy Apr 01 '14 at 20:49

2 Answers


Here is something that may help: there are three types of linking:

  • static linking (.a): the linker copies the library code it needs into your executable at link time, so you can move the executable to other computers with the same architecture and run it.
  • dynamic linking (.so): the linker resolves the symbols at link time (when the executable is built) but does not include the library's code in your executable. When the program starts, the library is loaded; if the library is not found, the program stops. You need the library on the computer that runs the program.
  • dynamic loading: you are in charge of loading the library and looking up its functions at runtime, using dlopen and related calls. It is used especially for plugins.

See also: http://www.ibm.com/developerworks/library/l-dynamic-libraries/ and "Difference between shared objects (.so), static libraries (.a), and DLL's (.so)?"
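
To make the third case concrete, here is a minimal sketch of dynamic loading in C. It assumes a glibc-based Linux system where the math library's soname is libm.so.6; the helper name cos_via_dlopen and the typedef unary_math_fn are made up for this illustration. The point is that the program is never linked with -lm: the only thing the host needs at build time is the function's type, because it never refers to the symbol cos directly.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Signature of cos(3). dlsym only returns a void*, so the host must
   already know the type of the function it is looking up. */
typedef double unary_math_fn(double);

/* Loads the math library at run time and calls its cos through a pointer.
   Returns 0 and stores the result in *out on success, -1 on failure.
   Hypothetical helper, not part of any library. */
int cos_via_dlopen(double x, double *out)
{
    /* Assumption: glibc names its math library "libm.so.6". */
    void *handle = dlopen("libm.so.6", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    unary_math_fn *fn = (unary_math_fn *)dlsym(handle, "cos");
    const char *err = dlerror();
    if (err != NULL || fn == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol is NULL");
        dlclose(handle);
        return -1;
    }

    *out = fn(x);   /* call through the pointer: no link-time reference to cos */
    dlclose(handle);
    return 0;
}
```

Calling cos_via_dlopen(0.0, &r) from main should leave 1.0 in r. On older glibc versions the program has to be linked with -ldl (the library that provides dlopen itself); on recent ones that flag should be unnecessary but harmless. Either way, -lm does not appear on the command line.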

innoSPG
  • Yes, but what I'm asking is: why does dynamic linking/loading need any resolving at all during compilation/linking, rather than only at runtime on the running machine? What does the linker do that needs any definitions/resolutions of symbols, if this is done anyway at runtime with the shared libs on the running machine? – Yair Karmy Apr 01 '14 at 20:12
  • Dynamic linking needs to make sure that the functions you are calling exist somewhere, because the system will be doing everything for you. That is why they need to be resolved. I have never used dynamic loading, so I cannot really help there, but from my understanding, I think you don't need to resolve at link time. – innoSPG Apr 01 '14 at 20:25
  • Also, from one of the links I provided (http://www.ibm.com/developerworks/library/l-dynamic-libraries/): for dynamic loading on Linux systems, you link only against the system library that is used to open the lib and load functions at runtime. – innoSPG Apr 01 '14 at 20:33
  • That is OK. But why do I have to resolve the externals in the .h file I include? If I don't add "-lhfileSymbolsResolutionLib", the gcc command results in errors! – Yair Karmy Apr 01 '14 at 20:41

A header file (e.g. an *.h file referenced by some #include directive) is relevant to the C or C++ compiler. The linker does not know about source files (which are input to the compiler), but only about object files produced by the assembler (in Executable and Linkable Format, i.e. ELF).

A library file (given by -lfoo) is relevant only at link time. The compiler does not know about libraries.

The dynamic linker needs to know which libraries should be linked. At runtime it does symbol resolution against a fixed and known set of shared libraries. It won't try every shared library present on your system (there are too many shared objects, and there may be several conflicting versions of a given library); it links only the fixed set of libraries recorded inside the executable. Use objdump(1), readelf(1) and nm(1) to explore ELF object files and executables, and ldd(1) to understand shared library dependencies.

Notice that the g++ program is used both for compilation and for linking. (Actually it is a driver program: it starts cc1plus, the C++ compiler proper, to compile C++ code to an assembly file; as, the assembler, to assemble an assembly file into an object file; and ld, the linker, to link object files and libraries.)

Run g++ as g++ -v to understand what it is doing, i.e. which program(s) it is running.

If you don't link the required libraries, some references remain unresolved at link time (because some object files contain an external reference and relocation), and the linker reports them as errors because no library recorded in the executable would define them at runtime.

(Things are slightly more complex with link-time optimization, which we can ignore here.)

Read also the Program Library HowTo, Levine's book Linkers and Loaders, and Drepper's paper: how to write shared libraries.

If you use dynamic loading at runtime (by using dlopen(3) on some plugin), you need to know the type and signature of the relevant functions (returned by dlsym(3)). A program loading plugins always has its specific plugin conventions. For examples, look at the conventions used for geany plugins and GCC plugins (see also these slides about GCC plugins).

In practice, if you are developing your application accepting some plugins, you will define a set of names, their expected type, signature, and role. e.g.

 typedef void plugin_start_function_t (const char*);
 typedef int plugin_more_function_t (int, double);

then declare e.g. some variables (or fields in a data structure) to point to them with a naming convention

 plugin_start_function_t* plustart; // app_plugin_start in plugins
 #define NAME_plustart "app_plugin_start"
 plugin_more_function_t* plumore;   // app_plugin_more in plugins
 #define NAME_plumore "app_plugin_more"

Then load the plugin and set these pointers, e.g.

 void* plugdlh = dlopen(plugin_path, RTLD_NOW);
 if (!plugdlh) {
    fprintf(stderr, "failed to load %s: %s\n", plugin_path, dlerror());
    exit(EXIT_FAILURE);
 }

then retrieve the symbols:

 // the casts are required in C++ and harmless in C
 plustart = (plugin_start_function_t*) dlsym(plugdlh, NAME_plustart);
 if (!plustart) {
    fprintf(stderr, "failed to find %s in %s: %s\n",
            NAME_plustart, plugin_path, dlerror());
    exit(EXIT_FAILURE);
 }
 plumore = (plugin_more_function_t*) dlsym(plugdlh, NAME_plumore);
 if (!plumore) {
    fprintf(stderr, "failed to find %s in %s: %s\n",
            NAME_plumore, plugin_path, dlerror());
    exit(EXIT_FAILURE);
 }

Then use appropriately the plustart and plumore function pointers.
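
As a rough sketch of what "use appropriately" can look like, the C fragment below calls through the two pointers after checking that they were set. The names run_plugin, install_fake_plugin, fake_start and fake_more are invented for this illustration; the two fake functions merely stand in for what dlsym would return from a real plugin, so that the fragment does not depend on any .so file.

```c
#include <stdio.h>

typedef void plugin_start_function_t(const char *);
typedef int plugin_more_function_t(int, double);

/* Filled in by dlsym in the real program; NULL until a plugin is loaded. */
static plugin_start_function_t *plustart;
static plugin_more_function_t *plumore;

/* Hypothetical stand-ins for the functions a plugin would define. */
static void fake_start(const char *name)
{
    printf("plugin started for %s\n", name);
}

static int fake_more(int n, double scale)
{
    return (int)(n * scale);
}

/* Pretend that dlopen/dlsym succeeded (illustration only). */
void install_fake_plugin(void)
{
    plustart = fake_start;
    plumore = fake_more;
}

/* Calls the plugin through the pointers. Returns -1 if no plugin is
   loaded, otherwise 0 with the plugin's answer stored in *result. */
int run_plugin(const char *app_name, int n, double scale, int *result)
{
    if (!plustart || !plumore)
        return -1;
    plustart(app_name);
    *result = plumore(n, scale);
    return 0;
}
```

Note that the host only refers to the typedefs and to its own pointer variables, never to app_plugin_start or app_plugin_more as symbols, which is why the linker has nothing to resolve for them.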

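In your plugin, you need to declare (the extern "C" is needed because the plugin is compiled as C++ here; it disables name mangling so that dlsym can find the plain names)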

extern "C" void app_plugin_start(const char*);
extern "C" int app_plugin_more (int, double);

and give a definition to both of them. The plugin should be compiled as position independent code, e.g. with

 g++ -Wall -fPIC -O -g -c pluginsrc1.c -o pluginsrc1.pic.o
 g++ -Wall -fPIC -O -g -c pluginsrc2.c -o pluginsrc2.pic.o

and linked with

 g++ -shared pluginsrc1.pic.o pluginsrc2.pic.o -o yourplugin.so

You may want to link extra shared libraries to your plugin.

You generally should link your main program (the one loading plugins) with the -rdynamic link flag (because you want some symbols of your main program to be visible to your plugins).
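
One way to see what -rdynamic changes is to ask the dynamic linker whether a given name is visible through the main program's own handle. The sketch below assumes a Linux/ELF system; host_symbol_visible and app_host_log are hypothetical names used only for this illustration.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical host function that a plugin might want to call back. */
void app_host_log(const char *msg)
{
    (void)msg; /* a real host would print or record the message */
}

/* Returns 1 if `name` can be found by dlsym through the main program's
   handle (the program itself plus the libraries it was linked with),
   0 otherwise. */
int host_symbol_visible(const char *name)
{
    void *self = dlopen(NULL, RTLD_NOW); /* NULL asks for the main program */
    if (!self)
        return 0;
    void *addr = dlsym(self, name);
    dlclose(self);
    return addr != NULL;
}
```

A name exported by a linked library, such as puts from the C library, should be reported as visible in any case. For app_host_log, the answer would typically be 1 only when the executable was linked with -rdynamic, since otherwise the host's own functions are usually not placed in its dynamic symbol table, and a plugin referring to them would fail to load.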

Read also the C++ dlopen mini howto

Basile Starynkevitch
  • Thanks, but a .h file introduces external global variables and functions, which later (when linking) cause "unresolved/undefined" errors. – Yair Karmy Apr 01 '14 at 20:55
  • I am probably missing something here. You say "If you don't link the required libraries, at link time, some references remain unresolved" and I ask: so what if they stay unresolved? Isn't that what dynamic linking is all about, staying unresolved until runtime? – Yair Karmy Apr 01 '14 at 21:02
  • No, the runtime linking is against a *known* set of libraries, and you need to give them to `g++` – Basile Starynkevitch Apr 01 '14 at 21:12
  • dlopen can open any library I give it, known or unknown. So it is still unclear why gcc/g++ needs to know about libs that are used later with dlopen. If I don't include an .h file with the funcs and global vars, then I do not need any -l option added, because nothing has to be resolved until dlopen. – Yair Karmy Apr 01 '14 at 21:25
  • No, `dlopen` is opening one *fixed* and *known* library, as given in its first argument. Follow the links I gave you. – Basile Starynkevitch Apr 01 '14 at 21:26
  • The problem with not including the .h file is this: after running dlopen and dlsym, how will I use the functions if I don't have the types, and what about the globals that the functions returned by dlsym use? – Yair Karmy Apr 01 '14 at 21:27
  • I explained a lot more. – Basile Starynkevitch Apr 01 '14 at 21:55
  • Thank you very, very much. Your thorough answers are great. It is still unclear to me whether the .h file with the plugin's types and vars should be included in the main program or not, and if not, how the funcs retrieved from dlsym are used if they use global vars that are not resolved (nobody called dlsym on them). – Yair Karmy Apr 01 '14 at 22:03
  • Your plugin should usually avoid using global variables from the main program (at least for performance reasons). In practice, only pass *function names* (not variable names) to `dlsym` – Basile Starynkevitch Apr 01 '14 at 22:04