40

Say at runtime, I want to find out where a function "printf" is defined. How would I do this? My first attempt was to print the address of "printf" and compare it against the virtual address mappings of the process:

my program:

#include <stdio.h>
#include <unistd.h>

void main()
{
    printf("address of printf is 0x%X\n", printf);
    printf("pid is  %d\n", getpid());
    while (1);
}

output:

-bash-4.1$ ./a &
[1] 28837
-bash-4.1$ address of printf is 0x4003F8
pid is  28837

However, this says the function is defined in my own program!

-bash-4.1$ head /proc/28837/maps 
00400000-00401000 r-xp 00000000 08:06 6946857                            /data2/temp/del/a      <<<<<<< Address 0x4003F8 is in my own program?
00600000-00601000 rw-p 00000000 08:06 6946857                            /data2/temp/del/a
397ec00000-397ec20000 r-xp 00000000 08:11 55837039                       /lib64/ld-2.12.so
397ee1f000-397ee20000 r--p 0001f000 08:11 55837039                       /lib64/ld-2.12.so
397ee20000-397ee21000 rw-p 00020000 08:11 55837039                       /lib64/ld-2.12.so
397ee21000-397ee22000 rw-p 00000000 00:00 0 
397f000000-397f18a000 r-xp 00000000 08:11 55837204                       /lib64/libc-2.12.so
397f18a000-397f38a000 ---p 0018a000 08:11 55837204                       /lib64/libc-2.12.so
397f38a000-397f38e000 r--p 0018a000 08:11 55837204                       /lib64/libc-2.12.so
397f38e000-397f38f000 rw-p 0018e000 08:11 55837204                       /lib64/libc-2.12.so

Shouldn't it be a call into libc? How do I find out where this "printf", or any other function, came from?

Toby Speight
Sush
  • You can find which library is required by looking at the man page for the function. – Weather Vane Oct 01 '18 at 21:38
  • 4
    haha. At runtime, how would I find it? Please note that 'printf' is only a simplistic example. – Sush Oct 01 '18 at 21:42
  • 1
    Such as [printf](http://man7.org/linux/man-pages/man3/printf.3.html). Why is there any difference at run time? if you don't know at compile time, it won't compile/link. You do know, so why establish that at run time? – Weather Vane Oct 01 '18 at 21:42
  • 4
    pseudocode `system("man %s | grep \.h")` (just kidding) – ti7 Oct 01 '18 at 21:43
  • 3
    What you might have found by taking the address is a stub that the linker uses to connect the call in your program with the implementation in the library. Such stubs may be useful for things like relocations and weak symbols; I don't know all the different cases. But the stub itself is generally little more than a simple branch instruction that redirects program flow towards its actual destination. – cmaster - reinstate monica Oct 01 '18 at 21:54
  • 5
    @ti7 (and others). Let's try not to confuse libraries with headers. – rici Oct 01 '18 at 21:57
  • That said, probably the best way to find a function's library would be to open each library's object file, list its symbols, and check whether the library provides the symbol you are looking for. Unfortunately, that means you would basically need to write the first stages of a dynamic loader to get all the rules about precedence right. So, sorry, no easy, go-to approach (at least I don't know one). – cmaster - reinstate monica Oct 01 '18 at 21:57
  • 5
    @weather: where does that manpage say that printf is in libc.so? – rici Oct 01 '18 at 21:58
  • @rici is that a dynamically linked library? A statically linked function is no longer in a library, it is part of the code. – Weather Vane Oct 01 '18 at 22:01
  • I think the problem may be the format conversion `%X`, which assumes the argument to printf is `int`. Since you are testing on x86_64, it should be `%lX`. I get a better answer, do you? – Michael Miller Oct 01 '18 at 22:03
  • @MikeMiller That is just as wrong. Use %p for pointers. – 0___________ Oct 01 '18 at 22:06
  • @weathervane: Yes, that's the ELF equivalent of a DLL – rici Oct 01 '18 at 22:31
  • 1
    @MikeMiller Interesting thought. %x, %p and %lX give me the same values – Sush Oct 01 '18 at 22:45
  • Simply enter `man printf`. The resulting man page will tell you which header file contains the function prototype, along with the details of its usage. If you want to know all the functions in a specific library (or which library contains the function), try: [ref](https://en.wikipedia.org/wiki/C_standard_library) – user3629249 Oct 02 '18 at 20:49

6 Answers

32

The address you observe is located in the Procedure Linkage Table (PLT). This mechanism is used when the location of an external (dynamically linked) symbol is not known at the time your binary is compiled and linked.

The purpose is that the external linkage happens in only one place, the PLT, rather than at every place in your code where the symbol is called. So when printf() is called, the path is:

main -> printf@PLT -> printf@libc

At runtime, you cannot easily find out which external library contains the function you call. You would have to parse the opcodes at the destination (the PLT stub), which usually loads the target address from the Global Offset Table (GOT) and jumps there, then see where the symbol is really located, and finally parse /proc/pid/maps to find the external library.

gsamaras
Ctx
26

At runtime, you can use gdb for this:

(terminal 1)$ ./a
pid is  16614
address of printf is 0x400450

(terminal 2)$ gdb -p 16614
(...)
Attaching to process 16614
(...)
0x00000000004005a4 in main ()
(gdb)

(gdb) info sym printf
printf in section .text of /lib/x86_64-linux-gnu/libc.so.6

If you don't want to interrupt your program or are reluctant to use gdb, you may also ask ld.so to output some debugging info:

(terminal 1)$ LD_DEBUG=bindings LD_DEBUG_OUTPUT=syms ./a
pid is  17180
address of printf is 0x400450

(terminal 2)$ fgrep printf syms.17180
    17180:  binding file ./a [0] to /lib/x86_64-linux-gnu/libc.so.6 [0]: normal symbol `printf' [GLIBC_2.2.5]
xhienne
10
  1. Pointers are printed with %p, not %X. %p already adds the 0x prefix, and the argument should be cast to void *:

    printf("address of printf is %p\n", (void *)printf);
    
  2. If you link against a static libc, printf will be linked into your binary.

  3. When compiled with

    gcc -fPIC a.c # (older gccs)
    ...
    gcc -fno-plt a.c # (gcc 6 and above)
    

    the program outputs:

    address of printf is 0x7f40acb522a0
    

    which is inside of

    7f40acaff000-7f40accc2000 r-xp 00000000 fd:00 100687388                  /usr/lib64/libc-2.17.so
    

Read "What does @plt mean here?" to find out more about this.

Konrad Rudolph
fukanchik
  • Unfortunately, I cant use fPIC, this is a huge project at work where I cant change the build procedure. – Sush Oct 05 '18 at 16:41
9

Say at runtime, I want to find out where a function "printf" is defined.

In general and absolute terms, you probably cannot (at least not easily). A given function might be defined in several libraries (for printf that is unlikely, since it is part of the C standard library).

If you build your Linux system from scratch, you could dream of something processing every library at build time (for instance, when building every shared library, you could get all its public names with nm(1) and put them in some database). This is not really done yet today, but some research projects are going in that direction (notably softwareheritage, and other ones in 2019).

BTW, you could have several libraries defining printf, for example if you install both GNU glibc and musl-libc on your computer (or, more likely, if you have several variants of glibc). A particular program is unlikely to use both (but could still, in theory, dlopen both of them).

Maybe you want the Linux-specific dladdr(3) function. Given an address, it tells you which shared object contains it.

the function is defined in my own program

Yes. Read much more about dynamic linking; in particular, read Drepper's paper "How to Write Shared Libraries". Understand the purpose of the procedure linkage table.

Basile Starynkevitch
  • Regarding research projects scanning an entire Linux system, I have participated in proposing the H2020 DECODER project (for ICT-16 call). We got the funding, and that project will start in 2019. So stay tuned! (but we won't scan an entire Linux distro, only several libraries, and probably not `libc`) – Basile Starynkevitch Oct 02 '18 at 05:13
  • 1
    I ran into this bug described here http://man7.org/linux/man-pages/man3/dladdr.3.html "Sometimes, the function pointers you pass to dladdr() may surprise you. On some architectures (notably i386 and x86-64), dli_fname and dli_fbase may end up pointing back at the object from which you called dladdr(), even if the function used as an argument should come from a dynamically linked library." – Sush Oct 05 '18 at 16:36
-1

Parse the ELF file for the dynamically linked libraries it needs (its DT_NEEDED entries). Then parse those libraries, searching for the required symbol.

0___________
-1

You can deduce this statically. No need to execute:

$ readelf -Ws a.out | grep printf
      1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@GLIBC_2.2.5 (2)
     51: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@@GLIBC_2.2.5
KevinZ
  • 1
    Well, the original question is about **any** function, not `printf` (it was an example) or any other glibc function specifically. Your command wouldn't work in the general case. What it shows is a mere version label, which happens to be `GLIBC_2.2.5` but could as well have been `V_2.2.5`. Since OP says "at runtime", you cannot deduce anything statically and `readelf` is not the proper tool for the job. – xhienne Oct 03 '18 at 14:16