I've built an ELF64 executable with my own compiler/linker. It is very basic: its only dependency is puts from libc, so it has one entry in the symbol table and one relocation entry.
If I run it through the dynamic linker explicitly, it works just fine and prints the letter A:
/lib64/ld-linux-x86-64.so.2 ./o
If I run it by itself:
./o
I get a segfault in _dl_relocate_object() at dl-reloc.c:232, which in my version of Ubuntu 16.04 is:
/* Do the actual relocation of the object's GOT and other data. */
/* String table object symbols. */
const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
Here is the output of readelf:
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x406000
Start of program headers: 12672 (bytes into file)
Start of section headers: 0 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 7
Size of section headers: 0 (bytes)
Number of section headers: 0
Section header string table index: 0
There are no sections in this file.
There are no sections to group in this file.
Program Headers:
Type    Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
PHDR    0x003180 0x0000000000408180 0x0000000000408180 0x000188 0x000188 R   0x8
INTERP  0x003118 0x0000000000408118 0x0000000000408118 0x00001c 0x00001c R   0x1
    [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD    0x001000 0x0000000000405000 0x0000000000405000 0x000038 0x000038 RW  0x1000
LOAD    0x002000 0x0000000000406000 0x0000000000406000 0x000038 0x000038 R E 0x1000
LOAD    0x003000 0x0000000000407000 0x0000000000407000 0x000074 0x000080 RW  0x1000
LOAD    0x003078 0x0000000000408078 0x0000000000408078 0x000290 0x000290 RW  0x1000
DYNAMIC 0x003078 0x0000000000408078 0x0000000000408078 0x000290 0x000290 RW  0x1000
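As a side check on the program headers, here is a small sketch that computes the difference between virtual address and file offset for each entry. The numbers are copied from the readelf output; the names SEGMENTS, deltas and is_consistent are made up for this illustration. Whether a mixed delta is what upsets the loader is only an assumption at this point, since readelf itself does not complain about it.

```python
# Program header values copied from the readelf output in this post:
# (type, file offset, virtual address)
SEGMENTS = [
    ("PHDR",    0x003180, 0x408180),
    ("INTERP",  0x003118, 0x408118),
    ("LOAD",    0x001000, 0x405000),
    ("LOAD",    0x002000, 0x406000),
    ("LOAD",    0x003000, 0x407000),
    ("LOAD",    0x003078, 0x408078),
    ("DYNAMIC", 0x003078, 0x408078),
]


def deltas(segments):
    """Return (type, vaddr - offset) for every program header."""
    return [(name, vaddr - offset) for name, offset, vaddr in segments]


def is_consistent(segments):
    """True if every header uses the same offset-to-address delta."""
    return len({delta for _, delta in deltas(segments)}) == 1


if __name__ == "__main__":
    for name, delta in deltas(SEGMENTS):
        print(f"{name:8} delta = {delta:#x}")
    print("single delta for the whole file:", is_consistent(SEGMENTS))
```

The first three LOAD entries come out at 0x404000, while the last LOAD, DYNAMIC, PHDR and INTERP come out at 0x405000, so file page 0x3000 is requested at two different addresses.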
Dynamic section at offset 0x3078 contains 9 entries:
Tag                Type      Name/Value
0x0000000000000001 (NEEDED)  Shared library: [libc.so.6]
0x0000000000000005 (STRTAB)  0x408108
0x0000000000000006 (SYMTAB)  0x408138
0x000000000000000b (SYMENT)  24 (bytes)
0x0000000000000007 (RELA)    0x408168
0x0000000000000008 (RELASZ)  24 (bytes)
0x0000000000000009 (RELAENT) 24 (bytes)
0x000000000000000a (STRSZ)   16 (bytes)
0x0000000000000000 (NULL)    0x0
There are no relocations in this file.
The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.
Dynamic symbol information is not available for displaying symbols.
No version information found in this file.
So, what's wrong with my file, and how can I get it to run without prepending the linker name on the command line?
EDIT: My ELF file requests PT_DYNAMIC to be loaded at virtual address 0x408078.
When run as
/lib64/ld-linux-x86-64.so.2 ./o
the directive is followed, and PT_DYNAMIC is loaded at 0x408078. However, when run as
./o
the PT_DYNAMIC gets loaded at 0x407078. It happens that the preceding segment (the variables segment) starts at 0x407000 and spans 0x80 bytes, so it ends at 0x40707F, and its zero-filled tail writes 0x0 into 0x407078 through 0x40707F. These zero bytes overwrite the DT_NEEDED tag that the PT_DYNAMIC segment starts with. Consequently, the dynamic loader thinks that PT_DYNAMIC is empty and cannot find any tags. In particular, it cannot find the DT_STRTAB tag, which caused the segfault that I described in my question.
Interestingly, it turns out that an intact, uncorrupted copy of PT_DYNAMIC is in fact present at 0x408078, as the program header table directs.
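To illustrate why a zeroed first tag makes the whole dynamic section look empty, here is a minimal sketch. It assumes the loader scans Elf64_Dyn records (an 8-byte d_tag followed by an 8-byte value) until it meets DT_NULL. The tags and values follow the readelf dump, except the DT_NEEDED value, which readelf does not print and which is a placeholder here. The helper names are invented for this example.

```python
import struct

DT_NULL, DT_NEEDED, DT_STRTAB = 0, 1, 5

# (d_tag, d_val) pairs following the readelf dump in this post.
# The DT_NEEDED value (a string-table offset) is a placeholder.
ENTRIES = [(1, 1), (5, 0x408108), (6, 0x408138), (11, 24),
           (7, 0x408168), (8, 24), (9, 24), (10, 16), (0, 0)]


def pack_dynamic(entries):
    """Serialize entries as little-endian Elf64_Dyn records (16 bytes each)."""
    return b"".join(struct.pack("<qQ", tag, val) for tag, val in entries)


def walk_dynamic(blob):
    """Collect tags the way a loader scan would: stop at the first DT_NULL."""
    found = {}
    for off in range(0, len(blob), 16):
        tag, val = struct.unpack_from("<qQ", blob, off)
        if tag == DT_NULL:
            break
        found[tag] = val
    return found


def zero_range(blob, start, end):
    """Return a copy of blob with bytes [start, end) set to zero."""
    buf = bytearray(blob)
    buf[start:end] = bytes(end - start)
    return bytes(buf)


if __name__ == "__main__":
    base = 0x407078                  # where the shifted copy starts
    bss_end = 0x407000 + 0x80        # end of the variables segment (MemSiz)
    intact = pack_dynamic(ENTRIES)
    clobbered = zero_range(intact, 0, bss_end - base)
    print(DT_STRTAB in walk_dynamic(intact))      # True
    print(DT_STRTAB in walk_dynamic(clobbered))   # False
```

Because the DT_NEEDED tag value is 1 and little-endian, zeroing even its first byte already turns it into DT_NULL, so the scan stops before it reaches DT_STRTAB.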
Here's how I figured this all out:
gdb ./o
break _dl_relocate_object
r
(When run with the linker name prepended to the command line, it first stops at the breakpoint while loading the linker itself, so I had to cont; the second time it stops is for my file.)
After it stops at the breakpoint:
info registers rdi
; this is the function's first argument, struct link_map *l
x/8ag <rdi value>
; see what the values of the link_map members are
link_map is defined as:
struct link_map
{
/* These first few members are part of the protocol with the debugger.
This is the same format used in SVR4. */
ElfW(Addr) l_addr; /* Difference between the address in the ELF
file and the addresses in memory. */
char *l_name; /* Absolute file name object was found in. */
ElfW(Dyn) *l_ld; /* Dynamic section of the shared object. */
struct link_map *l_next, *l_prev; /* Chain of loaded objects. */
};
Thus, the third printed quadword (octabyte) is the member l_ld. When started without the dynamic loader's name on the command line, l_ld = 0x407078. When started with the dynamic loader's name prepended to the command line ("gdb --args /lib64/ld-linux-x86-64.so.2 ./o"), it shows l_ld = 0x408078.
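For readers who want to map the x/8ag output onto the struct, here is a small sketch that decodes the five leading members from raw bytes, assuming x86-64 (little-endian, 8-byte pointers). Only the l_addr and l_ld values come from this post; the other three pointers are made-up placeholders, and decode_link_map is a hypothetical helper.

```python
import struct

FIELDS = ("l_addr", "l_name", "l_ld", "l_next", "l_prev")


def decode_link_map(raw):
    """Decode the five leading pointer-sized members of struct link_map."""
    # l_addr is read as signed so that a negative offset is easy to spot;
    # gdb's x/8ag would display the same bits as a large unsigned address.
    values = struct.unpack_from("<q4Q", raw)
    return dict(zip(FIELDS, values))


if __name__ == "__main__":
    # l_addr and l_ld are the values reported in this post; the other
    # three pointers are placeholders.
    raw = struct.pack("<q4Q", -0x1000, 0x7FFFF7FFE6F0, 0x407078,
                      0x7FFFF7FFE700, 0)
    for name, value in decode_link_map(raw).items():
        print(f"{name:7} = {value:#x}")
```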
Why the difference?
It is easy to see the corrupt PT_DYNAMIC values:
x/18xg 0x407078
Also, even when the loader's name is prepended to the command line, and l_ld is correct, and PT_DYNAMIC is not corrupt at 0x408078 - there still is a copy of it at 0x407078, and it is not corrupt either.
So how do I get it to work and load my segments properly?
It's interesting that link_map's first member, l_addr, is -4096 (-0x1000), which is exactly the difference between 0x407078 and 0x408078. So it looks like the loader (dynamic or static?) makes a deliberate decision at some point to load the segment at an address different from what the ELF file requests. Why does it do that?
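One possible explanation for the -0x1000, offered as an assumption rather than a verified answer: the kernel may report the program header address (AT_PHDR) as the first PT_LOAD's address-minus-offset plus e_phoff, and ld.so may then set l_addr to AT_PHDR minus the PT_PHDR virtual address. The sketch below just runs that arithmetic on the values from the readelf output; both formulas and the function names are assumptions for illustration.

```python
# Values from the readelf output in this post.
E_PHOFF = 12672                      # "Start of program headers" (0x3180)
FIRST_LOAD = (0x001000, 0x405000)    # (p_offset, p_vaddr) of the first PT_LOAD
PT_PHDR_VADDR = 0x408180
PT_DYNAMIC_VADDR = 0x408078


def kernel_at_phdr(e_phoff, first_load):
    """ASSUMPTION: AT_PHDR = (first PT_LOAD vaddr - offset) + e_phoff."""
    offset, vaddr = first_load
    return (vaddr - offset) + e_phoff


def loader_l_addr(at_phdr, pt_phdr_vaddr):
    """ASSUMPTION: l_addr = AT_PHDR - PT_PHDR.p_vaddr for the main program."""
    return at_phdr - pt_phdr_vaddr


if __name__ == "__main__":
    at_phdr = kernel_at_phdr(E_PHOFF, FIRST_LOAD)
    l_addr = loader_l_addr(at_phdr, PT_PHDR_VADDR)
    print(f"AT_PHDR = {at_phdr:#x}")                      # 0x407180
    print(f"l_addr  = {l_addr:#x}")                       # -0x1000
    print(f"l_ld    = {PT_DYNAMIC_VADDR + l_addr:#x}")    # 0x407078
```

Under these assumptions the numbers match what gdb shows, which would point at the program headers sitting in a segment whose address-minus-offset differs from that of the first PT_LOAD.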