You're not building with symbol interposition enabled (a side effect of -fPIC), so the call destination address can potentially be resolved at link time to an address in another object file that is being statically linked into the same executable (e.g. gcc foo.o bar.o).
However, if the symbol is only found in a library that you're dynamically linking to (gcc foo.o -lbar), the call has to be indirected through the PLT to support dynamic linking.
Now this is the tricky part: without -fPIC or -fPIE, gcc still emits asm that calls the function directly:
int puts(const char*); // puts exists in libc, so we can link this example
void call_puts(void) { puts("foo"); }
# gcc 5.3 -O3 (without -fPIC)
movl $.LC0, %edi # absolute 32bit addressing: slightly smaller code, because static data is known to be in the low 2GB, in the default "small" code model
jmp puts # tail-call optimization. Same as call puts/ret, except for stack alignment
But if you look at the linked binary (on this Godbolt compiler explorer link, click the "binary" button to toggle between gcc -S asm output and objdump -dr disassembly):
# disassembled linker output
mov $0x400654,%edi
jmpq 400490 <puts@plt>
During linking, the call to puts was "magically" replaced with indirection through puts@plt, and a puts@plt definition is present in the linked executable.
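Conceptually, a lazily bound PLT entry behaves like a call through a function pointer that gets patched the first time it's used. Here's a rough sketch of that idea in plain C. It is only an analogy, not what the linker actually emits, and the names got_slot, resolve_and_call, strlen_via_plt and resolver_runs are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts how often the (simulated) resolver runs. Hypothetical name. */
int resolver_runs = 0;

static size_t resolve_and_call(const char *s);

/* Plays the role of a GOT slot: initially points at the resolver,
 * later at the real function. */
static size_t (*got_slot)(const char *) = resolve_and_call;

/* Plays the role of the dynamic linker's lazy resolver: look up the
 * real target once, patch the slot, then forward the call. */
static size_t resolve_and_call(const char *s)
{
    resolver_runs++;
    got_slot = strlen;          /* "symbol lookup" result, hard-coded here */
    return got_slot(s);
}

/* Plays the role of strlen@plt: a stub that jumps through the slot. */
size_t strlen_via_plt(const char *s)
{
    assert(s != NULL);
    return got_slot(s);
}
```

The first call to strlen_via_plt goes through the resolver; every later call goes straight to strlen through the patched slot, which is roughly the cost model of calling through the PLT.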
I don't know the details of how this works, but it's done at link time when linking to a shared library. Crucially, it doesn't require anything in the header files to mark the function prototype as being in a shared library. You get the same results from including <stdio.h> as you do from declaring puts yourself. (Declaring library functions yourself is strongly discouraged; it's probably legal for a C implementation to only work properly with the declarations in headers. It happens to work on Linux, though.)
When compiling a position-independent executable (with -fPIE), the linked binary jumps to puts through the PLT, identically to without -fPIC. However, the compiler asm output is different (try it yourself on the godbolt link above):
call_puts: # compiled with -fPIE
leaq .LC0(%rip), %rdi # RIP-relative addressing for static data
jmp puts@PLT
The compiler forces indirection through the PLT for any calls to functions it can't see the definition for. I don't understand why. In PIE mode, we're compiling code for an executable, not a shared library. The linker should be able to link multiple object files into a position-independent executable with direct calls between functions defined in the executable. I'm testing on Linux (my desktop and godbolt), not OS X, where I assume gcc -fPIE is the default. It might be configured differently, IDK.
With -fPIC instead of -fPIE, things are even worse: even calls to global functions defined within the same compilation unit have to go through the PLT, to support symbol interposition (e.g. LD_PRELOAD=intercept_some_functions.so ./a.out).
The differences between -fPIC and -fPIE are mainly that PIE can assume no symbol interposition for functions in the same compilation unit, but PIC can't. OS X requires position-independent executables, as well as shared libraries, but there is a difference in what the compiler can do when making code for a library vs. making code for an executable.
This Godbolt example has some more functions that demonstrate stuff about PIC and PIE mode, e.g. that call_puts() can't inline into another function in PIC mode, only PIE.
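If you do need -fPIC but don't want the interposition overhead for helpers that are private to your library, the usual ways out are internal linkage (static) or hidden ELF visibility, both of which tell the compiler the definition it sees is the one that will be called. A small sketch, assuming GCC or clang on an ELF target; the function names scale, offset and transform are hypothetical:

```c
#include <assert.h>
#include <limits.h>

/* Hidden visibility: still callable from other object files in the same
 * library, but not exported, so it can't be interposed. The compiler is
 * free to call it directly or inline it, even with -fPIC. */
__attribute__((visibility("hidden")))
int scale(int x)
{
    /* Toy precondition so that 3 * x cannot overflow. */
    assert(x >= INT_MIN / 3 && x <= INT_MAX / 3);
    return 3 * x;
}

/* Internal linkage: same effect, limited to this compilation unit. */
static int offset(int x)
{
    return x + 1;
}

/* Default visibility: exported, so with -fPIC calls to transform()
 * from elsewhere in the library are treated as interposable. */
int transform(int x)
{
    return offset(scale(x));
}
```

Compiling this with gcc -O2 -fPIC -S should show no @PLT call inside transform, whereas removing the attribute from scale brings the interposable-call behaviour back.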
See also: Shared object in Linux without symbol interposition, -fno-semantic-interposition error.
puzzled by mov $0, %edi
You're looking at disassembly output from the .o, where addresses are just placeholder 0s that will be replaced by the linker at link time, based on the relocation information in the ELF object file. That's why @Leandros suggested objdump -r.
Similarly, the relative displacement in the call machine code is all-zeros, because the linker hasn't filled it in yet.
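To make "filled in by the linker" concrete: for a PC-relative 32-bit relocation on x86-64, the value stored is S + A - P (symbol address, plus addend, minus the address of the field being patched), and the addend for a call or jmp rel32 is normally -4. Below is a toy version of that one step, not a real linker. The helper names apply_pc32 and read_rel32 are made up, and the instruction address used in the usage note is an assumption; only the 0x400490 target comes from the disassembly earlier in this answer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Patch the 4-byte field at code[field_off] with S + A - P.
 * code_base is the (pretend) run-time address of code[0]. */
void apply_pc32(uint8_t *code, uint64_t code_base, uint64_t field_off,
                uint64_t sym_addr, int64_t addend)
{
    int64_t place = (int64_t)(code_base + field_off);       /* P */
    int64_t delta = (int64_t)sym_addr + addend - place;     /* S + A - P */

    /* A real linker reports "relocation truncated to fit" here. */
    assert(delta >= INT32_MIN && delta <= INT32_MAX);

    uint32_t value = (uint32_t)delta;
    /* Host byte order; matches the x86 encoding on a little-endian host. */
    memcpy(code + field_off, &value, sizeof value);
}

/* Read the signed 32-bit displacement back out of the code bytes. */
int32_t read_rel32(const uint8_t *code, uint64_t off)
{
    int32_t v;
    memcpy(&v, code + off, sizeof v);
    return v;
}
```

For example, take the bytes e9 00 00 00 00 (jmp rel32 with a zero placeholder) and pretend the instruction lives at 0x4005a5. Calling apply_pc32(code, 0x4005a5, 1, 0x400490, -4) stores -0x11a, which is exactly the distance from the end of the jmp (0x4005aa) back to 0x400490.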