The void * pointer returned by dlopen(0, RTLD_LAZY) gives you a struct link_map * that corresponds to the main executable.
Calling dl_iterate_phdr also returns the entry for the main executable on the very first execution of callback.
You are likely confused by the fact that .l_addr == 0 in the link map, and that dlpi_addr == 0 when using dl_iterate_phdr.
This happens because l_addr (and dlpi_addr) don't actually record the load address of an ELF image. Rather, they record the relocation that has been applied to that image. The address at which a segment ends up is that relocation plus the segment's link-time p_vaddr.
Usually the main executable is built to load at 0x400000 (for x86_64 Linux) or at 0x08048000 (for ix86 Linux), and is loaded at that same address (i.e. it is not relocated).
But if you link your executable with the -pie flag, then it will be linked at 0x0, and it will be relocated to some other address.
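To look at these fields directly, here is a minimal sketch that fetches the main executable's link map and prints l_addr and l_ld. It assumes glibc on Linux and a dynamically linked executable. The helper names main_link_map and print_main_link_map are made up for this illustration, and the sketch goes through dlinfo(RTLD_DI_LINKMAP) instead of casting the handle, since that is the documented route.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
#include <stdio.h>

/* Hypothetical helper: link map of the main executable, or NULL on failure. */
static struct link_map *main_link_map(void)
{
    struct link_map *map = NULL;
    void *handle = dlopen(NULL, RTLD_LAZY);  /* NULL means "the main program" */

    if (handle == NULL)
        return NULL;
    /* dlinfo() is the documented way to turn a handle into a link map. */
    if (dlinfo(handle, RTLD_DI_LINKMAP, &map) != 0)
        return NULL;
    return map;
}

/* Hypothetical helper: print the relocation and the dynamic section address. */
static void print_main_link_map(void)
{
    struct link_map *map = main_link_map();

    if (map == NULL) {
        fprintf(stderr, "could not obtain the link map\n");
        return;
    }
    printf("l_addr (relocation):      0x%lx\n", (unsigned long) map->l_addr);
    printf("l_ld   (dynamic section): %p\n", (void *) map->l_ld);
}
```

Called from main, print_main_link_map() should report an l_addr of 0 for a non-PIE build and a non-zero value for a PIE build. Older glibc versions need -ldl at link time.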
So how do you get to the ELF header?
2023 Update:
Isn't a simpler method (if relying on undocumented details), just to call dladdr on the l_ld address in the struct link_map, and then use dli_fbase out of that? – Simon Kissane
Indeed it is. Here is a much simpler solution:
#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
#include <stdio.h>

int main()
{
    void *dyn = _DYNAMIC;
    Dl_info info;

    if (dladdr(dyn, &info) != 0) {
        printf("a.out loaded at %p\n", info.dli_fbase);
    }
    return 0;
}
gcc -g -Wall -Wextra x.c -ldl && ./a.out
a.out loaded at 0x556433ea0000 # high address here because my GCC defaults to PIE.
gcc -g -Wall -Wextra x.c -ldl -no-pie && ./a.out
a.out loaded at 0x400000
gcc -g -Wall -Wextra x.c -ldl -no-pie -m32 && ./a.out
a.out loaded at 0x8048000
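Since the question was how to get to the ELF header, here is a sketch that goes one step further and treats dli_fbase as a pointer to that header. It assumes the first PT_LOAD segment maps the file starting at offset 0, which is what GNU ld normally produces but is not guaranteed, so the code checks the ELF magic before trusting the pointer. main_elf_header is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: pointer to the in-memory ELF header of the main
 * executable, or NULL if the base address cannot be found or does not
 * start with the ELF magic bytes.
 */
static const ElfW(Ehdr) *main_elf_header(void)
{
    Dl_info info;
    const ElfW(Ehdr) *ehdr;

    /* _DYNAMIC (declared in <link.h>) lives inside the main executable. */
    if (dladdr(_DYNAMIC, &info) == 0 || info.dli_fbase == NULL)
        return NULL;

    /* Assumption: the ELF header is mapped at the very start of the image. */
    ehdr = (const ElfW(Ehdr) *) info.dli_fbase;

    /* Sanity check: an ELF image begins with "\177ELF". */
    if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0)
        return NULL;
    return ehdr;
}
```

With the header in hand, e_type tells the two cases apart: it is ET_DYN for a PIE build and ET_EXEC for a non-PIE one.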
Original 2012 solution:
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <link.h>
#include <stdio.h>
#include <stdlib.h>

static int
callback(struct dl_phdr_info *info, size_t size, void *data)
{
    int j;
    static int once = 0;

    if (once) return 0;
    once = 1;

    printf("relocation: 0x%lx\n", (long)info->dlpi_addr);

    for (j = 0; j < info->dlpi_phnum; j++) {
        if (info->dlpi_phdr[j].p_type == PT_LOAD) {
            printf("a.out loaded at %p\n",
                   (void *) (info->dlpi_addr + info->dlpi_phdr[j].p_vaddr));
            break;
        }
    }
    return 0;
}

int
main(int argc, char *argv[])
{
    dl_iterate_phdr(callback, NULL);
    exit(EXIT_SUCCESS);
}
$ gcc -m32 t.c && ./a.out
relocation: 0x0
a.out loaded at 0x8048000
$ gcc -m64 t.c && ./a.out
relocation: 0x0
a.out loaded at 0x400000
$ gcc -m32 -pie -fPIC t.c && ./a.out
relocation: 0xf7789000
a.out loaded at 0xf7789000
$ gcc -m64 -pie -fPIC t.c && ./a.out
relocation: 0x7f3824964000
a.out loaded at 0x7f3824964000
Update:
Why does the man page say "base address" and not relocation?
It's a bug ;-)
I am guessing that the man page was written long before prelink, pie, and ASLR existed. Without prelink, shared libraries are always linked to load at address 0x0, and then relocation and base address become one and the same.
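To check this on a running process, the sketch below walks every loaded object with dl_iterate_phdr and compares dlpi_addr with the address of the object's first PT_LOAD segment. The struct load_stats type and the function names are invented for the example.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical tallies filled in by the callback below. */
struct load_stats {
    int objects;    /* loaded objects with at least one PT_LOAD segment */
    int differing;  /* objects whose first PT_LOAD address != dlpi_addr */
};

static int compare_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct load_stats *stats = data;
    (void) size;

    for (size_t j = 0; j < info->dlpi_phnum; j++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[j];

        if (ph->p_type != PT_LOAD)
            continue;

        /* Link-time address plus relocation gives the run-time address. */
        ElfW(Addr) load = info->dlpi_addr + ph->p_vaddr;
        const char *name = (info->dlpi_name && info->dlpi_name[0])
                               ? info->dlpi_name : "(empty name)";

        printf("%-40s relocation 0x%lx, loaded at 0x%lx\n", name,
               (unsigned long) info->dlpi_addr, (unsigned long) load);

        stats->objects++;
        if (ph->p_vaddr != 0)
            stats->differing++;
        break;  /* only the first PT_LOAD matters here */
    }
    return 0;   /* a non-zero return would stop the iteration */
}

static struct load_stats compare_load_addresses(void)
{
    struct load_stats stats = {0, 0};

    dl_iterate_phdr(compare_cb, &stats);
    return stats;
}
```

On a system without prelink, the differing count would typically include at most the non-PIE main executable, although the sketch itself does not verify that.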
how come dlpi_name points to an empty string when info refers to the main executable?
It's an accident of implementation.
The way this works is that the kernel open(2)s the executable and passes the open file descriptor to the loader (in the auxv[] vector, as AT_EXECFD). Everything the loader knows about the executable it gets by reading that file descriptor.
There is no easy way on UNIX to map a file descriptor back to the name it was opened as. For one thing, UNIX supports hard-links, and there could be multiple filenames that refer to the same file.
Newer Linux kernels also pass in the name that was used to execve(2) the executable (also in auxv[], as AT_EXECFN). But that is optional, and even when it is passed in, glibc doesn't put it into .l_name / dlpi_name in order to not break existing programs which became dependent on the name being empty.
Instead, glibc saves that name in __progname and __progname_full.
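For a quick look at these values, here is a small sketch that prints AT_EXECFN next to the program names glibc keeps. It assumes glibc 2.16 or later for getauxval(), and it uses program_invocation_name and program_invocation_short_name, the documented GNU names for __progname_full and __progname. The helper names execfn_from_auxv and print_program_names are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>      /* program_invocation_name (GNU extension) */
#include <stddef.h>
#include <stdio.h>
#include <sys/auxv.h>   /* getauxval(), AT_EXECFN */

/* Hypothetical helper: the AT_EXECFN string, or NULL if it was not supplied. */
static const char *execfn_from_auxv(void)
{
    unsigned long value = getauxval(AT_EXECFN);

    return value != 0 ? (const char *) value : NULL;
}

/* Hypothetical helper: print the names side by side for comparison. */
static void print_program_names(void)
{
    const char *execfn = execfn_from_auxv();

    printf("AT_EXECFN:                     %s\n",
           execfn ? execfn : "(not provided)");
    printf("program_invocation_name:       %s\n", program_invocation_name);
    printf("program_invocation_short_name: %s\n",
           program_invocation_short_name);
}
```

Running it once as ./a.out and once through a symlink or an absolute path makes it easy to compare what each source reports.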
The loader could readlink(2) the name from /proc/self/exe on systems that didn't use AT_EXECFN, but the /proc file system is not guaranteed to be mounted either, so that would still leave it with an empty name sometimes.