I am trying to list all real file dependencies of an ELF executable in order to improve granularity of incremental building/testing.

When I link an executable against a set of libraries, the symbols from the static ones appear at the top of the linker map, which is good. I would also like to know when the linker includes a symbol from a shared library, along with the path to the file that defines it.

For example, if I have an executable that looks like this:

#include "ext_src.h"
#include "ext_lib.h"
#include "int_src.h"

int main(){
    ext_src();
    ext_lib();
    int_src();
}

where each of the three functions comes from a different library, and the compiler command is:

/usr/bin/cc -O3 -DNDEBUG -Wl,-Map=exec2.map exec2.c.o -o exec2  -Wl,-rpath,backend_libs: backend_libs/libsub.so backend_libs/libEXT_dependence1.so extern/lib/an_extern_lib/libext_lib_normal.a

I can only get the information I am looking for (namely, that ext_lib() comes from ext_lib.c.o) for the static library, at the top of the linker map:

Archive member included to satisfy reference by file (symbol)

extern/lib/an_extern_lib/libext_lib_normal.a(ext_lib.c.o)
                              CMakeFiles/exec2.dir/entry/exec2.c.o (ext_lib)
/usr/lib/x86_64-linux-gnu/libc_nonshared.a(elf-init.oS)
                              /usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/Scrt1.o (__libc_csu_init)

The equivalent information for the shared libraries does not seem to be anywhere in the linker map. Indeed, I cannot find in it the module where I know ext_src() is defined.

Does anyone have an idea how to get the file in which ext_src is defined? The method needs to list only the symbols that my executable actually uses.

Edit: I also forgot to mention that I control the compilation of the libraries I link against. I am therefore open to a solution that involves compiling these libraries with unusual flags, debug sections, and so on.

ninjaconcombre
  • You may need to write a [perl/python] script for this. You can use (e.g.) `nm` [and/or `readelf`] on your `.o` and the `.so` and/or `.a` files to see what symbols they reference and/or define. You can then cross reference them. See my answer: https://stackoverflow.com/questions/34164594/gcc-ld-method-to-determine-link-order-of-static-libraries/34168951#34168951 – Craig Estey Dec 22 '20 at 18:22
  • Hi @CraigEstey. Thanks for your comment. Well, I am really not keen on doing this, haha. I would much rather avoid re-simulating the linker's job like that. Is there no way to make the linker tell me "hey, I took this symbol from this file" at the moment it does so? Redoing everything by parsing other processes' output for each module looks like a slow solution, and it does not sound easy either. – ninjaconcombre Dec 22 '20 at 18:59
  • [Before I commented], I looked at the manpage for `gcc` and `ld`. I thought gcc's `-v` option might do it. I just did a google search on [all the words]: `linker symbol cross reference`. After a few pages, I found one that talked about the "map file". Then, I looked at the `ld` manpage again. The `--cref` option might help. But, you'll still probably have to write that script [albeit a simpler one]. – Craig Estey Dec 22 '20 at 19:18
  • I just tried the cref option, which looked promising. But again, there is no trace of the ext_src/int_src symbols in the resulting map. – ninjaconcombre Dec 22 '20 at 23:29
  • I just played with `--cref`. It does _not_ show _references_ from one file to another, only which files define which symbols. Therefore, it _does_ list the various files that define symbols, so you can get that. But, I'm afraid, you'll need to parse `readelf -s ` and build up the cross-reference. This may _seem_ like a lot of work, but, in my experience having such a tool is useful and _reusable_. I have scripts that I've written that I keep in my "toolbox", so "write once, use many". It's not _that_ much code, particularly if you use `perl/python/javascript`. – Craig Estey Dec 23 '20 at 02:39
  • But, this brings up an issue. If you have a list of the library files (obtained from `--cref`, for example), you can create dependencies for `make` from that: `myexec: myexec.o mysub.o lib1file lib2file ...` That determines whether to rebuild or not. I don't think your build will go any faster [or be more incremental] than that. The reason is that _anything_ that changes a `.o` [or `.a`], whether symbol, code, or data, _has_ to force a relink. _Which_ symbol doesn't matter. Only the modification times of the files matter [which `make` already handles]. – Craig Estey Dec 23 '20 at 02:47
  • I know there are some IDEs for _other_ [interpreted] languages that can do an incremental rebuild of a given source and _dynamically_ reload a portion into a running instance of a program, but `c` isn't one of them. Again, even changing (e.g.) `int x = 1;` into `int x = 2;` needs a [full] relink. That did _not_ change a symbol, just some code/data, so the granularity is "per file" and _not_ "per symbol". Even if you did your own dynamic thing (with `dlopen` et al.), it's still a per-file issue. – Craig Estey Dec 23 '20 at 02:53
  • Did you try cref when linking against a shared library? In my build, which links against the 2 shared libraries with this gcc command, ext_src and int_src do not appear in the linker map. Maybe there is some kind of bug where the linker map output is overwritten for the first library, then again for the next, so that I just end up with the last one? If you do see the shared symbols in your link map, I would like to see your gcc command and try to copy it. – ninjaconcombre Dec 23 '20 at 09:51
  • Then, on the granularity aspect: the use case is developing a project with a big library, with some tests that link against it. The build system is going to put a dependency on the library file itself. Instead, I want to automatically get the real dependencies, which I define as the .o files that define symbols used in the executable, or the included headers. This matters to me because I made my tests incremental as well (only rebuilt tests are re-run). However, today that is useless, because any small change in the library makes every unit test rebuild and therefore re-run. – ninjaconcombre Dec 23 '20 at 09:58
  • I stopped linking the static library, and the map still did not output anything about int_src/ext_src, which shows that the overwriting bug I was talking about is not the issue. The linker map just seems not to include any information about symbols from shared libraries. – ninjaconcombre Dec 23 '20 at 11:30
  • I think the answer is indeed that the linker map ignores shared symbols. Shared libraries do not work like static libraries; with the latter it is, by design, easier to get the source file from a symbol. To achieve what I want with shared ones without too much computation, I should probably parse all dynamic symbols with nm -D. Then I need to create a process that retrieves the source file of a symbol from a shared library built with debug info. I will explore this direction for now. – ninjaconcombre Dec 23 '20 at 12:52
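
The direction described in the last comment could be sketched roughly as follows. This is only a minimal sketch, assuming GNU `nm` is on the PATH and that the shared libraries were built with `-g`, so that `nm -l` can append a tab-separated `file:line` location to each defined symbol. The function names (`parse_nm`, `resolve`, `run_nm`, `shared_symbol_sources`) are hypothetical and not part of any existing tool. Note that debug info points at the source file, not at the `.o` file, so mapping back to objects would still be up to the build system.

```python
import subprocess


def parse_nm(nm_output):
    """Map each symbol name in `nm` text output to its source location.

    The location is the tab-separated "file:line" suffix that `nm -l`
    appends when debug info is available, or "" when there is none.
    """
    symbols = {}
    for line in nm_output.splitlines():
        entry, _, location = line.partition("\t")
        fields = entry.split()
        if len(fields) < 2:
            continue  # blank line or "file:" header
        # Drop a version suffix such as "@GLIBC_2.2.5" or "@@GLIBC_2.2.5".
        name = fields[-1].split("@", 1)[0]
        symbols[name] = location.strip()
    return symbols


def resolve(undefined, defined_by_lib):
    """Match undefined symbols against each library's defined symbols.

    Returns {symbol: (library, location)}. The first library listed wins,
    which is an assumption about lookup order, not a linker guarantee.
    """
    resolved = {}
    for lib, defined in defined_by_lib.items():
        for name in undefined:
            if name in defined and name not in resolved:
                resolved[name] = (lib, defined[name])
    return resolved


def run_nm(path, *flags):
    """Run `nm -D` on a file with extra flags and return its stdout."""
    result = subprocess.run(
        ["nm", "-D", *flags, path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def shared_symbol_sources(executable, libraries):
    """List only the shared-library symbols the executable actually imports."""
    undefined = parse_nm(run_nm(executable, "--undefined-only"))
    defined_by_lib = {
        lib: parse_nm(run_nm(lib, "--defined-only", "-l"))
        for lib in libraries
    }
    return resolve(undefined, defined_by_lib)
```

With the files from the question, a call might look like `shared_symbol_sources("exec2", ["backend_libs/libsub.so", "backend_libs/libEXT_dependence1.so"])`. Symbols that no listed library defines (those coming from libc, for instance) are simply left out of the result.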
