11

How can I get a list of all the dynamic libraries that are required by an ELF binary on Linux using C++?

Once I've managed to extract the information (filename?) from the binary I can find the actual file by searching through the PATH, but I haven't been able to find any information regarding extracting unmangled information from the ELF binary.

Thoughts?

John Smith
  • If you're on a RHEL-based distro (where rpm is the basis of package management), you can try [this script](http://vitalyisaev2.blogspot.com/2014/02/how-to-find-out-which-of-installed-rpm.html) to resolve the dependencies of your binary not only to *.so files but also to the packages that provide them. – Vitaly Isaev Mar 24 '14 at 15:25
  • Why do you ask? Do you care about indirect dependencies (i.e. executable `foo` dynamically linking `libbar.so`, which itself dynamically links `libgee.so`, so that `ldd foo` reports both `libbar.so` and `libgee.so`)? – Basile Starynkevitch Mar 24 '14 at 18:01
  • I ask because I'm using a static analysis tool and I need to extract the CFG from the targeted binary as well as any dynamic libraries that it depends on. – John Smith Mar 25 '14 at 09:43
  • Direct with `readelf -d` http://stackoverflow.com/questions/6242761/how-do-i-find-the-direct-shared-object-dependencies-of-a-linux-elf-binary , indirect with `ldd`: http://unix.stackexchange.com/questions/120015/how-to-find-out-the-dynamic-libraries-executables-loads-when-run – Ciro Santilli OurBigBook.com Aug 04 '15 at 10:02
  • http://stackoverflow.com/questions/1172649/how-to-know-which-dynamic-libraries-are-needed-by-an-elf?rq=1 – Ciro Santilli OurBigBook.com Aug 04 '15 at 10:39
  • On linux at least, you can use dl_iterate_phdr to iterate through the dynamically loaded program headers of the calling program, which includes the binary itself and the loaded dynamic libraries. – Johannes Schaub - litb Dec 28 '15 at 18:52
  • Possible duplicate of [Show all libraries used by executables on linux](http://stackoverflow.com/questions/50159/show-all-libraries-used-by-executables-on-linux) – Ciro Santilli OurBigBook.com Mar 26 '17 at 08:57
  • @JohnSmith I know it's an old question but take a look at the code I left in my answer. – Anastasios Andronidis Jun 28 '20 at 01:57

3 Answers

13

You can run the `readelf -d` program and parse its output:

readelf -d /usr/bin/readelf | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libz.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
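
If you want to do that parsing from C++, one possible sketch is shown below. It assumes `readelf` is on the PATH and that its output keeps the `(NEEDED) ... [name]` shape shown above; the function names `parse_needed_line` and `needed_via_readelf` are made up for this illustration.

```cpp
#include <stdio.h>  // popen/pclose are POSIX, declared in <stdio.h>

#include <string>
#include <vector>

// Extracts "libz.so.1" from a readelf line such as
//   0x0000000000000001 (NEEDED)   Shared library: [libz.so.1]
// Returns an empty string for lines that are not NEEDED entries.
std::string parse_needed_line(const std::string &line) {
  if (line.find("(NEEDED)") == std::string::npos) return "";
  const std::size_t open = line.find('[');
  if (open == std::string::npos) return "";
  const std::size_t close = line.find(']', open);
  if (close == std::string::npos) return "";
  return line.substr(open + 1, close - open - 1);
}

// Runs `readelf -d <path>` and collects the NEEDED library names.
// NOTE: the path is quoted naively, so do not pass untrusted input.
std::vector<std::string> needed_via_readelf(const std::string &path) {
  std::vector<std::string> libs;
  const std::string cmd = "readelf -d '" + path + "' 2>/dev/null";
  FILE *pipe = popen(cmd.c_str(), "r");
  if (pipe == nullptr) return libs;

  char buf[4096];
  while (fgets(buf, sizeof buf, pipe) != nullptr) {
    const std::string name = parse_needed_line(buf);
    if (!name.empty()) libs.push_back(name);
  }
  pclose(pipe);
  return libs;
}
```

Calling `needed_via_readelf("/usr/bin/readelf")` should then return the same names that the `grep NEEDED` pipeline prints. Only direct dependencies are listed this way.
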
Vadzim
12

The list of required shared objects is stored in the so-called dynamic section of the executable. The rough algorithm of getting the necessary info would be something like this:

  1. Parse the ELF header and check that the file is an executable or a shared object (ET_EXEC or ET_DYN).
  2. Get the offset, count and entry size of the program headers (e_phoff/e_phnum/e_phentsize), and check that they're non-zero and valid.
  3. Parse the program headers, looking for the PT_DYNAMIC one. Also remember the virtual address -> file offset mappings of the PT_LOAD segments.
  4. Once found, parse the dynamic section. Look for the DT_NEEDED and DT_STRTAB entries.

The d_val field of each DT_NEEDED entry is an offset into the string table that DT_STRTAB points to; the string at that offset is the name (normally the SONAME) of a required library. Note that because the DT_STRTAB value is a run-time address and not a file offset, you'll need to map it back to a file offset using the PT_LOAD mappings recorded in step 3.
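
The steps above could be sketched in C++ roughly as follows. This is a minimal illustration, not a hardened parser: it assumes a 64-bit ELF file in the host's byte order, uses the structures from glibc's `<elf.h>`, and ignores corner cases such as an extended program header count. The helper names (`vaddr_to_offset`, `read_at`, `needed_libraries`) are invented for this example.

```cpp
#include <elf.h>

#include <cstdint>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Maps a run-time virtual address to a file offset via the PT_LOAD segments.
// Returns false if no segment's file-backed part covers the address.
bool vaddr_to_offset(const std::vector<Elf64_Phdr> &loads, std::uint64_t vaddr,
                     std::uint64_t &offset) {
  for (const Elf64_Phdr &ph : loads) {
    if (vaddr >= ph.p_vaddr && vaddr < ph.p_vaddr + ph.p_filesz) {
      offset = vaddr - ph.p_vaddr + ph.p_offset;
      return true;
    }
  }
  return false;
}

// Reads one fixed-size record located at the given file offset.
template <typename T>
bool read_at(std::ifstream &in, std::uint64_t offset, T &out) {
  in.clear();
  in.seekg(static_cast<std::streamoff>(offset));
  in.read(reinterpret_cast<char *>(&out), sizeof out);
  return static_cast<bool>(in);
}

// Returns the DT_NEEDED names, or an empty list on error / static binaries.
std::vector<std::string> needed_libraries(const std::string &path) {
  std::vector<std::string> result;
  std::ifstream in(path, std::ios::binary);

  // Step 1: ELF header.
  Elf64_Ehdr eh{};
  if (!in || !read_at(in, 0, eh)) return result;
  if (std::memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
      eh.e_ident[EI_CLASS] != ELFCLASS64 ||
      (eh.e_type != ET_EXEC && eh.e_type != ET_DYN))
    return result;

  // Step 2: location and shape of the program header table.
  if (eh.e_phoff == 0 || eh.e_phnum == 0 ||
      eh.e_phentsize != sizeof(Elf64_Phdr))
    return result;

  // Step 3: remember the PT_LOAD segments and find PT_DYNAMIC.
  std::vector<Elf64_Phdr> loads;
  Elf64_Phdr dynamic{};
  bool has_dynamic = false;
  for (std::uint64_t i = 0; i < eh.e_phnum; ++i) {
    Elf64_Phdr ph{};
    if (!read_at(in, eh.e_phoff + i * sizeof ph, ph)) return result;
    if (ph.p_type == PT_LOAD) loads.push_back(ph);
    if (ph.p_type == PT_DYNAMIC) { dynamic = ph; has_dynamic = true; }
  }
  if (!has_dynamic) return result;

  // Step 4: walk the dynamic entries, collecting DT_NEEDED and DT_STRTAB.
  std::vector<std::uint64_t> name_offsets;
  std::uint64_t strtab_vaddr = 0;
  bool has_strtab = false;
  for (std::uint64_t off = 0; off + sizeof(Elf64_Dyn) <= dynamic.p_filesz;
       off += sizeof(Elf64_Dyn)) {
    Elf64_Dyn dyn{};
    if (!read_at(in, dynamic.p_offset + off, dyn)) return result;
    if (dyn.d_tag == DT_NULL) break;
    if (dyn.d_tag == DT_NEEDED) name_offsets.push_back(dyn.d_un.d_val);
    if (dyn.d_tag == DT_STRTAB) { strtab_vaddr = dyn.d_un.d_ptr; has_strtab = true; }
  }

  // DT_STRTAB holds a virtual address, so translate it before reading names.
  std::uint64_t strtab_off = 0;
  if (!has_strtab || !vaddr_to_offset(loads, strtab_vaddr, strtab_off))
    return result;
  for (std::uint64_t name_off : name_offsets) {
    in.clear();
    in.seekg(static_cast<std::streamoff>(strtab_off + name_off));
    std::string name;
    if (std::getline(in, name, '\0')) result.push_back(name);
  }
  return result;
}
```

Reading through the program headers rather than the section headers means this approach should still work on binaries whose section header table has been stripped. A 32-bit or foreign-endian file would need the `Elf32_*` structures and byte swapping, which this sketch leaves out.
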

Igor Skochinsky
  • Thank you, I thought this would be a fairly common problem, accessing information from a binary that is. Is there any library, open source project or similar that can provide this feature without implementing, and maintaining it, on my own? – John Smith Mar 24 '14 at 15:31
  • AFAIK most people don't need it in their *programs* so they rely on `readelf`, `objdump` or `ldd` for scripting. For programmatic access there's `libelf`, but it does not offer a ready-to-use API for this specific task - you'd still have to parse the dynamic section manually. – Igor Skochinsky Mar 24 '14 at 15:45
0

You can use libelf to do this. Note that libelf has a C API.

From their tutorial here, look at the example in section 4.2 (or here) on how to get the Program Header Table. Find the PT_DYNAMIC segment and read the dependencies from the string table, as in the example in section 5.4 (or here).

-- EDIT --

I actually had the chance to write the code. Here is what I've done:

#include <fcntl.h>
#include <gelf.h>
#include <stdio.h>
#include <unistd.h>

// Prints the DT_NEEDED entries of the ELF file at elf_path.
// Returns 0 on success and -1 on error.
int print_dt_needed(const char *elf_path) {
  // libelf requires this version handshake before any other call.
  if (elf_version(EV_CURRENT) == EV_NONE) return -1;

  // Read-only access is enough; O_RDWR fails on binaries we cannot write to.
  int fd = open(elf_path, O_RDONLY);
  if (fd < 0) return -1;

  Elf *elf = elf_begin(fd, ELF_C_READ, NULL);
  if (elf == NULL || elf_kind(elf) != ELF_K_ELF) {
    if (elf != NULL) elf_end(elf);
    close(fd);
    return -1;
  }

  Elf_Scn *scn = NULL;
  while ((scn = elf_nextscn(elf, scn)) != NULL) {
    GElf_Shdr shdr = {};
    if (gelf_getshdr(scn, &shdr) != &shdr) continue;
    if (shdr.sh_type != SHT_DYNAMIC) continue;

    Elf_Data *data = elf_getdata(scn, NULL);
    size_t entsize = gelf_fsize(elf, ELF_T_DYN, 1, EV_CURRENT);
    if (data == NULL || entsize == 0) continue;

    for (size_t i = 0; i < shdr.sh_size / entsize; i++) {
      GElf_Dyn dyn = {};
      if (gelf_getdyn(data, (int)i, &dyn) != &dyn) break;
      if (dyn.d_tag != DT_NEEDED) continue;

      // sh_link is the index of the string table used by this section.
      const char *name = elf_strptr(elf, shdr.sh_link, dyn.d_un.d_val);
      if (name != NULL) printf("DT_NEEDED detected: %s\n", name);
    }
  }

  elf_end(elf);
  close(fd);
  return 0;
}

int main(int argc, char const *argv[]) {
  if (argc != 2) {
    fprintf(stderr, "usage: %s <elf-file>\n", argv[0]);
    return 1;
  }
  return print_dt_needed(argv[1]) == 0 ? 0 : 1;
}
Anastasios Andronidis