Suppose the binary is PIC. How can I load it into memory and execute the entry point?
I'm doing this to get familiar with ELF, so `execve` is not allowed.
Viewed 5,273 times
10
-
This is a very important task that requires knowledge about much more than just ELF. You might want to make an ELF file parser rather than a full-blown loader to get familiar with the format. You'll need a parser anyways to get the address of the `main` function. – zneak Jul 02 '11 at 02:57
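As a starting point for the parser suggested here, a minimal sketch of validating the ELF header before doing anything else. It assumes a 64-bit little-endian host with the Linux `<elf.h>` definitions; `parse_pie_header` is a hypothetical helper name, not part of any library.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns 0 and copies the header into *out if `buf` starts with a 64-bit
   little-endian ELF header of type ET_DYN (the type a PIE has);
   returns -1 otherwise. */
int parse_pie_header(const unsigned char *buf, size_t len, Elf64_Ehdr *out)
{
    if (len < sizeof *out)
        return -1;
    memcpy(out, buf, sizeof *out);   /* copy: no alignment assumptions on buf */
    if (memcmp(out->e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                   /* missing "\177ELF" magic */
    if (out->e_ident[EI_CLASS] != ELFCLASS64)
        return -1;
    if (out->e_ident[EI_DATA] != ELFDATA2LSB)
        return -1;
    if (out->e_type != ET_DYN)
        return -1;                   /* fixed-address ET_EXEC is rejected here */
    if (out->e_phentsize != sizeof(Elf64_Phdr))
        return -1;
    return 0;
}
```

After a successful call, `out->e_phoff` and `out->e_phnum` locate the program headers, and `out->e_entry` holds the entry point as a link-time address, to which the load base still has to be added.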
-
Interesting question but I hesitate to give +1 because it's very terse and doesn't provide much information on your background, what you've already read or tried, etc... – R.. GitHub STOP HELPING ICE Jul 02 '11 at 04:44
-
Read the kernel source, it's pretty clear: http://stackoverflow.com/questions/8352535/how-does-kernel-get-an-executable-binary-file-running-under-linux – Ciro Santilli OurBigBook.com Jul 13 '15 at 22:12
1 Answer
9
These are the basic steps:
- Read the program headers to find the LOAD directives and determine the total length of mappings you'll need, in pages.
- Map the lowest-address LOAD directive with the total length (which may be greater than the file length), letting `mmap` assign you an address. This will reserve contiguous virtual address space.
- Map the remaining LOAD directives over top of parts of this mapping using `MAP_FIXED`.
- Use the program headers to find the `DYNAMIC` vector, which will in turn give you the address of the relocation vector(s).
- Apply the relocations. Assuming your binary was a static-linked PIE binary, they should consist entirely of `RELATIVE` relocations (just adding the base load address), meaning you don't have to perform any symbol lookups or anything fancy.
- Construct an ELF program entry stack consisting of the following sequence of system-word-sized values in an array on the stack:
  ARGC ARGV[0] ARGV[1] ... ARGV[ARGC-1] 0 ENVIRON[0] ENVIRON[1] ... ENVIRON[N] 0 0
- (This step requires ASM!) Point the stack pointer at the beginning of this array and jump to the loaded program's entry point address (the `e_entry` field of the ELF header, plus the base load address).
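The mapping steps above can be sketched roughly as follows. This is a sketch under assumptions, not a definitive loader: it assumes 64-bit ELF, Linux `<elf.h>`, and program headers already read into memory. The helper names (`load_span`, `prot_of`, `map_segments`) are made up for illustration. Note one deliberate variation: instead of mapping the lowest LOAD segment over the whole length, it first reserves the range with an anonymous `PROT_NONE` mapping and then places every segment with `MAP_FIXED`.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Round down/up to a page boundary; `page` must be a power of two. */
uintptr_t page_floor(uintptr_t x, uintptr_t page) { return x & ~(page - 1); }
uintptr_t page_ceil(uintptr_t x, uintptr_t page)  { return (x + page - 1) & ~(page - 1); }

/* Step 1: page-aligned extent covered by all PT_LOAD entries. Returns its
   length in bytes (0 if there is no PT_LOAD); lowest page address in *lo_out. */
size_t load_span(const Elf64_Phdr *ph, size_t n, uintptr_t page, uintptr_t *lo_out)
{
    uintptr_t lo = UINTPTR_MAX, hi = 0;
    for (size_t i = 0; i < n; i++) {
        if (ph[i].p_type != PT_LOAD) continue;
        uintptr_t start = page_floor(ph[i].p_vaddr, page);
        uintptr_t end = page_ceil(ph[i].p_vaddr + ph[i].p_memsz, page);
        if (start < lo) lo = start;
        if (end > hi) hi = end;
    }
    if (hi <= lo) return 0;
    *lo_out = lo;
    return hi - lo;
}

/* Translate p_flags (PF_R/PF_W/PF_X) into mmap protection bits. */
int prot_of(uint32_t flags)
{
    return ((flags & PF_R) ? PROT_READ : 0) |
           ((flags & PF_W) ? PROT_WRITE : 0) |
           ((flags & PF_X) ? PROT_EXEC : 0);
}

/* Steps 2-3: reserve the span, then map each PT_LOAD over it with MAP_FIXED.
   On success returns 0 and stores the load bias (runtime address minus
   p_vaddr) in *bias_out; returns -1 on failure. */
int map_segments(int fd, const Elf64_Phdr *ph, size_t n, uintptr_t page,
                 uintptr_t *bias_out)
{
    uintptr_t lo;
    size_t span = load_span(ph, n, page, &lo);
    if (span == 0) return -1;

    void *base = mmap(NULL, span, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return -1;
    uintptr_t bias = (uintptr_t)base - lo;

    for (size_t i = 0; i < n; i++) {
        if (ph[i].p_type != PT_LOAD) continue;
        uintptr_t start = bias + page_floor(ph[i].p_vaddr, page);
        uintptr_t file_end = bias + ph[i].p_vaddr + ph[i].p_filesz;
        uintptr_t mem_end = page_ceil(bias + ph[i].p_vaddr + ph[i].p_memsz, page);
        uintptr_t anon_from = start;   /* where zero-filled (bss) pages begin */
        int prot = prot_of(ph[i].p_flags);

        if (ph[i].p_filesz > 0) {
            if (mmap((void *)start, file_end - start, prot, MAP_PRIVATE | MAP_FIXED,
                     fd, (off_t)page_floor(ph[i].p_offset, page)) == MAP_FAILED)
                goto fail;
            anon_from = page_ceil(file_end, page);
            /* Zero the bss bytes that share the last file-backed page. */
            if (ph[i].p_memsz > ph[i].p_filesz && (prot & PROT_WRITE))
                memset((void *)file_end, 0, anon_from - file_end);
        }
        if (mem_end > anon_from &&
            mmap((void *)anon_from, mem_end - anon_from, prot,
                 MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0) == MAP_FAILED)
            goto fail;
    }
    *bias_out = bias;
    return 0;
fail:
    munmap(base, span);
    return -1;
}
```

The page size would typically come from `sysconf(_SC_PAGESIZE)`. The returned bias is the "base load address" that the relocation step and the entry-point computation both add to link-time addresses.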

R.. GitHub STOP HELPING ICE
-
Er, I thought the point of it being position independent code is it doesn't have relocations - it has global offset table or whatever. – Random832 Jul 02 '11 at 04:27
-
@Random832, will I still be able to implement a loader and execute the binary if it's not PIC in the first place? – Je Rog Jul 02 '11 at 04:29
-
@Random832: You **always** have the potential for data relocations. How else could `static int x, *y=&x;` work? And of course the GOT is full of relocations. The only way to avoid having any relocations is to avoid having any global data. – R.. GitHub STOP HELPING ICE Jul 02 '11 at 04:41
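To make the `*y=&x` case concrete: the slot holding `y` gets a `RELATIVE` relocation whose addend is the link-time address of `x`, and the loader stores base plus addend into it. A minimal sketch, assuming x86-64 with `RELA`-style relocation tables and Linux `<elf.h>`; `find_rela` and `apply_relative` are hypothetical helper names.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Walk the DYNAMIC array (terminated by DT_NULL) and return the runtime
   address of the RELA table, storing its entry count in *count.
   Returns NULL if the image has no RELA table. */
const Elf64_Rela *find_rela(const Elf64_Dyn *dyn, uintptr_t bias, size_t *count)
{
    uintptr_t table = 0;
    size_t bytes = 0;
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_RELA)   table = bias + dyn->d_un.d_ptr;
        if (dyn->d_tag == DT_RELASZ) bytes = dyn->d_un.d_val;
    }
    if (table == 0 || bytes == 0)
        return NULL;
    *count = bytes / sizeof(Elf64_Rela);
    return (const Elf64_Rela *)table;
}

/* Apply R_X86_64_RELATIVE entries: *(bias + r_offset) = bias + r_addend.
   Returns -1 on any other relocation type, which this sketch does not handle. */
int apply_relative(uintptr_t bias, const Elf64_Rela *rela, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (ELF64_R_TYPE(rela[i].r_info) != R_X86_64_RELATIVE)
            return -1;
        Elf64_Addr *slot = (Elf64_Addr *)(bias + rela[i].r_offset);
        *slot = bias + (Elf64_Addr)rela[i].r_addend;
    }
    return 0;
}
```

The target slots must be writable while this runs, so it has to happen before any read-only protection is applied to relocated data.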
-
@R..: I don't think a statically-linked non-PIE executable needs any relocation. – ninjalj Jul 02 '11 at 08:46
-
I was talking about position-independent code. Obviously a fixed-address, static-linked program does not need relocations. – R.. GitHub STOP HELPING ICE Jul 02 '11 at 12:15
-
And regarding `auxv`: The final 0 in my stack example was the zero-length `auxv`. Passing a nontrivial `auxv` could be a good idea, but you can't just copy the one the kernel gave you; you have to modify it to match the load address, program header address, etc. for the new program. – R.. GitHub STOP HELPING ICE Jul 03 '11 at 15:13
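A rough sketch of building the entry stack with a small `auxv` that describes the new program rather than the loader. It assumes 64-bit Linux and `<elf.h>`; `struct loaded_image`, `build_entry_stack` and `jump_to_entry` are hypothetical names, and the choice of `AT_*` entries is illustrative only (a real C runtime may expect further entries, `AT_RANDOM` for example). The terminator is written as a full (`AT_NULL`, 0) pair. The jump is x86-64 only and uses GCC/Clang inline assembly.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bundle of values the loader learned while mapping. */
struct loaded_image {
    uintptr_t phdr;     /* runtime address of the program header table */
    uintptr_t phnum;    /* e_phnum */
    uintptr_t entry;    /* load bias + e_entry */
    uintptr_t pagesz;
};

/* Fill `out` with: argc, argv..., 0, envp..., 0, auxv pairs, AT_NULL pair.
   Returns the number of words written, or 0 if `cap` is too small. */
size_t build_entry_stack(uintptr_t *out, size_t cap, int argc, char **argv,
                         char **envp, const struct loaded_image *img)
{
    size_t nenv = 0;
    while (envp[nenv]) nenv++;
    size_t need = 1 + (size_t)argc + 1 + nenv + 1 + 2 * 6;
    if (need > cap) return 0;

    size_t k = 0;
    out[k++] = (uintptr_t)argc;
    for (int i = 0; i < argc; i++) out[k++] = (uintptr_t)argv[i];
    out[k++] = 0;
    for (size_t i = 0; i < nenv; i++) out[k++] = (uintptr_t)envp[i];
    out[k++] = 0;

    const uintptr_t aux[][2] = {
        { AT_PHDR,   img->phdr },
        { AT_PHENT,  sizeof(Elf64_Phdr) },
        { AT_PHNUM,  img->phnum },
        { AT_PAGESZ, img->pagesz },
        { AT_ENTRY,  img->entry },
        { AT_NULL,   0 },
    };
    for (size_t i = 0; i < sizeof aux / sizeof aux[0]; i++) {
        out[k++] = aux[i][0];
        out[k++] = aux[i][1];
    }
    return k;
}

#if defined(__x86_64__)
/* Switch to the prepared stack and jump. `sp` should be 16-byte aligned and
   sit at the top of a region with room below it, since the new program's
   stack grows downward from there. %rdx is zeroed because the x86-64 System V
   ABI lets entry code treat a nonzero %rdx as an exit handler to register. */
__attribute__((noreturn))
void jump_to_entry(const uintptr_t *sp, uintptr_t entry)
{
    __asm__ volatile("mov %0, %%rsp\n\t"
                     "xor %%edx, %%edx\n\t"
                     "jmp *%1"
                     : : "r"(sp), "r"(entry) : "rdx", "memory");
    __builtin_unreachable();
}
#endif
```

The strings that `argv` and `envp` point to can stay wherever they already live; only the pointer array has to be laid out in this order.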
-
@R.., one part of the spec I don't understand: why is it that `loadable process segments must have congruent values for p_vaddr and p_offset, modulo the page size`? – Je Rog Jul 06 '11 at 23:54
-
Virtual memory mapping is on page granularity (that's the definition of page, actually). Thus if `p_vaddr` and `p_offset` were not congruent modulo the page size, there would be no way to satisfy the `LOAD` with `mmap`. The segment would have to be completely copied to new anonymous pages. – R.. GitHub STOP HELPING ICE Jul 07 '11 at 00:09
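A small illustration of that congruence rule, assuming 64-bit ELF and Linux `<elf.h>`; both function names are hypothetical. Because `mmap` needs a page-aligned address and a page-aligned file offset, rounding `p_vaddr` and `p_offset` down must remove the same number of bytes from each.

```c
#include <elf.h>
#include <stdint.h>

/* A PT_LOAD can be satisfied by one file-backed mmap only if p_vaddr and
   p_offset have the same offset within a page. `page` is a power of two. */
int segment_is_mappable(const Elf64_Phdr *ph, uint64_t page)
{
    return (ph->p_vaddr & (page - 1)) == (ph->p_offset & (page - 1));
}

/* File offset to pass to mmap when the mapping starts at the page that
   contains p_vaddr. */
uint64_t mmap_file_offset(const Elf64_Phdr *ph, uint64_t page)
{
    return ph->p_offset & ~(page - 1);
}
```

For example, with 4096-byte pages, `p_vaddr = 0x3e10` and `p_offset = 0x2e10` both sit `0xe10` bytes into their page, so mapping file offset `0x2000` at page `0x3000` puts every byte of the segment at its intended address.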