
I'm trying to write a basic userspace ELF loader that can load statically linked (not dynamically linked), non-relocatable binaries (i.e. not built with -pie, -fPIE and so on). It only needs to work on x86 CPUs for now.

I've followed the code from "loading ELF file in C in user space" and it works well when the executable is relocatable, but, as expected, it fails completely when it isn't: the program is loaded into the wrong virtual memory range and crashes instantly.

I then tried modifying it to load the program at the virtual address it expects (using phdr.p_vaddr), but I ran into a complication: my loader is already using that virtual memory range! I can't mmap it, much less write anything into it. How do I proceed so that I can load my non-relocatable binary into my loader's address space without overwriting the loader's own code before it has finished? Do I need to get my loader to run from a completely different virtual memory range, perhaps by having the linker place it well above the usual range for a non-relocatable binary (which happens to start at 0x400000 in my case), or is there some trick to it?

I've read the ELF documentation (I am working with ELF64 here by the way, but I think ELF32 and ELF64 are very similar) and a lot of documents on the web and I still don't get it.

Can someone explain how an ELF loader deals with this particular complication? Thanks!

Thomas
  • But of course, you've already nailed it. Only one of two different things can be found at any single address. If the object being loaded can't be relocated, it follows that the ELF loader must relocate _itself_ elsewhere. I imagine the ELF loader can 1. Compute where the ELF image it read will be found in memory, 2. Relocate itself anywhere else, 3. Map the ELF file and 4. Execute it. – Iwillnotexist Idonotexist Apr 17 '15 at 03:23
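Step 1 in the comment above can be sketched in C. This is only a rough illustration, assuming a Linux-style `<elf.h>`; the names `image_span`, `compute_image_span` and `ranges_overlap` are made up for this example and do not come from any library. It computes the page-aligned range covered by the `PT_LOAD` segments, which a loader could then compare against the range it occupies itself.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the page-aligned range [lo, hi) that the
 * PT_LOAD segments of a fixed-address image will occupy. */
struct image_span {
    uint64_t lo;
    uint64_t hi;
};

/* page_size must be a power of two (4096 on typical x86-64 Linux).
 * Returns 0 on success, -1 if the table contains no PT_LOAD segment. */
int compute_image_span(const Elf64_Phdr *phdr, size_t phnum,
                       uint64_t page_size, struct image_span *out)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uint64_t lo = UINT64_MAX, hi = 0;
    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type != PT_LOAD)
            continue;
        /* Round the start down and the end up to page boundaries. */
        uint64_t start = phdr[i].p_vaddr & ~(page_size - 1);
        uint64_t end = (phdr[i].p_vaddr + phdr[i].p_memsz + page_size - 1)
                       & ~(page_size - 1);
        if (start < lo)
            lo = start;
        if (end > hi)
            hi = end;
    }
    if (hi == 0)
        return -1;
    out->lo = lo;
    out->hi = hi;
    return 0;
}

/* Two half-open ranges overlap iff each one starts before the other ends. */
int ranges_overlap(uint64_t a_lo, uint64_t a_hi, uint64_t b_lo, uint64_t b_hi)
{
    return a_lo < b_hi && b_lo < a_hi;
}
```

If the resulting span overlaps the loader's own range, the loader has to live somewhere else before it can map the image.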

1 Answer


Archimedes cried "heureka" when he realized that only one object can occupy a given location. If your ELF binary must sit at a particular address because you can't rebuild it for another one, you have to relocate the loader itself.

The non-relocatable ELF doesn't include enough information to move it to a different address. You could probably write a decompiler that detects all address references in the code, but it's not worth the effort. You would still run into problems analyzing data references, such as pointers stored in pre-initialized variables.
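A loader can tell which kind of file it has been given from the ELF header. The following is a minimal sketch, assuming a Linux-style `<elf.h>`; the function name `elf64_needs_fixed_address` is hypothetical. Fixed-address executables normally have `e_type == ET_EXEC`, while position-independent ones (built with -pie) have `ET_DYN`.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the header describes a fixed-address executable (ET_EXEC),
 * 0 if it describes a position-independent object (ET_DYN), and
 * -1 if it is not an ELF64 header or has some other type. */
int elf64_needs_fixed_address(const Elf64_Ehdr *eh)
{
    assert(eh != NULL);

    /* Check the "\177ELF" magic and the 64-bit class first. */
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    if (eh->e_ident[EI_CLASS] != ELFCLASS64)
        return -1;

    switch (eh->e_type) {
    case ET_EXEC:
        return 1;   /* must be mapped exactly at its p_vaddr values */
    case ET_DYN:
        return 0;   /* can be mapped at a base chosen by the loader */
    default:
        return -1;
    }
}
```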

Rewrite the loader if you can't get the source code of your ELF binary or a relocatable version of it.

BTW: Archimedes' heureka was deadly for the goldsmith who cheated. I hope it's not that expensive in your case.

harper
  • Can a program unmap its own .text? – n. m. could be an AI Apr 17 '15 at 05:19
  • Thanks, what I did was modify the loader's linker script to relocate itself out of the way at compile-time with `-Ttext-segment=0x8000000` (for instance), then I was able to successfully load my binary where it belongs inside the loader's address space. Cheers! And no, making the original binary relocatable is not an option in my case. – Thomas Apr 17 '15 at 14:03
  • @n.m: Yes, it can unmap its own segment. However, it might segfault trying to execute code in the unmapped segment. You might want to first load some code somewhere else and execute it from there, so that mapped code still exists when you return from the system call. – ysdx Sep 01 '15 at 22:34
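For anyone taking the `-Ttext-segment` route from the comments, one way to confirm that the loader really has moved out of the way is to check the target range against the process's current mappings before calling mmap. The sketch below is Linux-specific and assumes `/proc/self/maps` is available; `range_is_mapped` is a hypothetical helper name, not an existing API.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if [lo, hi) overlaps any current mapping of this process,
 * 0 if the range is free, and -1 if /proc/self/maps cannot be opened. */
int range_is_mapped(uint64_t lo, uint64_t hi)
{
    assert(lo < hi);

    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    /* Sized so that a line with a long path still fits in one read. */
    char line[4096 + 128];
    int found = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        uint64_t start, end;
        /* Each line begins with "start-end" in hexadecimal. */
        if (sscanf(line, "%" SCNx64 "-%" SCNx64, &start, &end) != 2)
            continue;
        if (start < hi && lo < end) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

A loader could call this with the range it is about to map (0x400000 and up, in the question's case) and refuse to continue if it returns 1.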