
Problem

I wish to inject an object file into an existing binary. As a concrete example, consider a source Hello.c:

#include <stdlib.h>

int main(void)
{
    return EXIT_SUCCESS;
}

It can be compiled to an executable named Hello through gcc -std=gnu99 -Wall Hello.c -o Hello. Furthermore, now consider Embed.c:

void func1(void)
{
}

An object file Embed.o can be created from this through gcc -c Embed.c. My question is how to generically insert Embed.o into Hello in such a way that the necessary relocations are performed, and the appropriate ELF internal tables (e.g. symbol table, PLT, etc.) are patched properly?


Assumptions

It can be assumed that the object file to be embedded has its dependencies statically linked already. Any dynamic dependencies, such as the C runtime, can be assumed to be present in the target executable as well.


Current Attempts/Ideas

  • Use libbfd to copy sections from the object file into the binary. So far I can create a new object containing the sections from the original binary and the sections from the object file. The problem is that, since the object file is relocatable, its sections cannot be copied properly to the output without performing the relocations first.
  • Convert the binary back to an object file and relink with ld. So far I have tried using objcopy to perform the conversion: objcopy --input elf64-x86-64 --output elf64-x86-64 Hello Hello.o. Evidently this does not work as I intend, since ld -o Hello2 Embed.o Hello.o then fails with ld: error: Hello.o: unsupported ELF file type 2. I guess this should be expected, though, since Hello is not an object file.
  • Find an existing tool which performs this sort of insertion?
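
For reference, the "file type 2" in the ld error above is the e_type field of the ELF header: 1 is ET_REL (a relocatable object, which ld accepts as input) and 2 is ET_EXEC (an already linked executable). objcopy does not change that field here, so Hello.o is still an executable under a different name. The following is only a minimal sketch of how that field can be checked; the function names elf_file_type and elf_type_name are made up for illustration, and the header is read by raw offsets (e_ident is 16 bytes, e_type is the 16-bit field right after it) rather than through <elf.h>:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return e_type from an ELF header held in memory,
 * or -1 if the buffer does not look like an ELF header. */
int elf_file_type(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    /* Need the 16-byte e_ident plus the 2-byte e_type, and the magic. */
    if (len < 18 || memcmp(buf, "\177ELF", 4) != 0)
        return -1;

    /* e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian. */
    if (buf[5] == 1)
        return buf[16] | (buf[17] << 8);
    if (buf[5] == 2)
        return (buf[16] << 8) | buf[17];
    return -1;
}

/* Hypothetical helper: map the standard e_type values to their names. */
const char *elf_type_name(int e_type)
{
    switch (e_type) {
    case 1:  return "ET_REL";   /* relocatable object (.o) */
    case 2:  return "ET_EXEC";  /* linked executable */
    case 3:  return "ET_DYN";   /* shared object or PIE */
    case 4:  return "ET_CORE";  /* core dump */
    default: return "unknown";
    }
}
```

Reading the first 18 bytes of Hello.o into a buffer and passing them to elf_file_type should report 2 for the objcopy output described above, whereas Embed.o should report 1.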

Rationale (Optional Read)

I am making a static executable editor, where the vision is to allow the instrumentation of arbitrary user-defined routines into an existing binary. This will work in two steps:

  1. The injection of an object file (containing the user-defined routines) into the binary. This is a mandatory step and cannot be worked around by alternatives such as injection of a shared object instead.
  2. Performing static analysis on the new binary and using this to statically detour routines from the original code to the newly added code.

I have, for the most part, already completed the work necessary for step 2, but I am having trouble with the injection of the object file. The problem is definitely solvable given that other tools use the same method of object injection (e.g. EEL).

Mike Kwan
  • A quick read of the question leaves the feeling that the difference between a runtime linker and a normal linker is not understood. The runtime linker/program loader operates only on formats that are easy and quick to fix up. .o is not one of those :-) If it has minimal dependencies, like a codec, linking it with minimal code to make it a .so sounds like the logical route – Marco van de Voort Feb 27 '12 at 20:26
  • @MarcovandeVoort: Thanks for your comment :) I used the 'link' term loosely, as one might use 'inject', which is why I placed it in quotes. One of the reasons I am not able to make it a `.so`, is that injection tricks such as `LD_PRELOAD` can be subverted by the application. Not only this, it requires the distribution of an additional library which forms the new environment. Static detouring has various other advantages (particularly for the purposes of this project), but as I've already said both in the question and comments to answers, this is not a design decision I can change :) – Mike Kwan Feb 27 '12 at 21:30
  • Are you trying to do something like the ability of ld on AIX (and nowhere else that I know of) to relink an executable where only one object file has changed? – evil otto Mar 15 '12 at 16:55
  • @evilotto: I want to add a new object file which was never present before. – Mike Kwan Mar 15 '12 at 17:12
  • Would you mind sharing a brief sketch of how #2 under Rationale is possible? If you now know the answer to the OP I'd be very curious about that as well. – Praxeolitic Oct 15 '14 at 04:06
  • @Praxeolitic: the completed paper can be found here - http://www.doc.ic.ac.uk/teaching/distinguished-projects/2012/m.kwan%20.pdf – Mike Kwan Oct 16 '14 at 12:12
  • Oh wow, I already had that open in another tab. I hadn't noticed the matching names. Nice work! Where can I find the library itself? – Praxeolitic Oct 16 '14 at 12:30
  • Ah, never mind, found it: https://github.com/petrhosek/libbf – Praxeolitic Oct 16 '14 at 12:46

6 Answers

7

If it were me, I'd look to compile Embed.c into a shared object, libembed.so, like so:

gcc -Wall -shared -fPIC -o libembed.so Embed.c

That should create a relocatable shared object from Embed.c. With that, you can force your target binary to load this shared object by setting the environment variable LD_PRELOAD when running it (see more information here):

LD_PRELOAD=/path/to/libembed.so Hello

The "trick" here will be to figure out how to do your instrumentation, especially considering it's a static executable. There, I can't help you, but this is one way to have code present in a process' memory space. You'll probably want to do some sort of initialization in a constructor, which you can do with an attribute (if you're using gcc, at least):

void __attribute__ ((constructor)) my_init()
{
    // put code here!
}
Dan Fego
  • Yes, this is an alternative for implementing detouring. In regards to the issue of how to implement the patching, it can be done with the __attribute__((constructor)) GCC attribute which allows a method to be invoked upon the library being loaded. The executable can also be tricked into thinking the shared object is a dependency. This is the approach used by an existing tool called LEEL. – Mike Kwan Feb 26 '12 at 02:41
  • Unfortunately, runtime/dynamic detouring is not going to be an acceptable solution. This was an explicitly declared requirement at the project inception. – Mike Kwan Feb 26 '12 at 02:42
4

Assuming the source code for the first executable is available and it is compiled with a linker script that allocates space for later object file(s), there is a relatively simple solution. Since I am currently working on an ARM project, the examples below are compiled with the GNU ARM cross-compiler.

Primary source code file, hello.c

#include <stdio.h>

int main ()
{

   return 0;
}

is built with a simple linker script allocating space for an object to be embedded later:

SECTIONS
{
    .text :
    {
        KEEP (*(embed)) ;

        *(.text .text*) ;
    }
}

Like:

arm-none-eabi-gcc -nostartfiles -Ttest.ld -o hello hello.c
readelf -s hello

Num:    Value  Size Type    Bind   Vis      Ndx Name
 0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
 1: 00000000     0 SECTION LOCAL  DEFAULT    1 
 2: 00000000     0 SECTION LOCAL  DEFAULT    2 
 3: 00000000     0 SECTION LOCAL  DEFAULT    3 
 4: 00000000     0 FILE    LOCAL  DEFAULT  ABS hello.c
 5: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $a
 6: 00000000     0 FILE    LOCAL  DEFAULT  ABS 
 7: 00000000    28 FUNC    GLOBAL DEFAULT    1 main

Now let's compile the object to be embedded, whose source is in embed.c:

void func1()
{
   /* Something useful here */
}

Recompile with the same linker script, this time inserting the new symbols:

arm-none-eabi-gcc -c embed.c
arm-none-eabi-gcc -nostartfiles -Ttest.ld -o new_hello hello embed.o

See the results:

readelf -s new_hello
Num:    Value  Size Type    Bind   Vis      Ndx Name
 0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
 1: 00000000     0 SECTION LOCAL  DEFAULT    1 
 2: 00000000     0 SECTION LOCAL  DEFAULT    2 
 3: 00000000     0 SECTION LOCAL  DEFAULT    3 
 4: 00000000     0 FILE    LOCAL  DEFAULT  ABS hello.c
 5: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $a
 6: 00000000     0 FILE    LOCAL  DEFAULT  ABS 
 7: 00000000     0 FILE    LOCAL  DEFAULT  ABS embed.c
 8: 0000001c     0 NOTYPE  LOCAL  DEFAULT    1 $a
 9: 00000000     0 FILE    LOCAL  DEFAULT  ABS 
10: 0000001c    20 FUNC    GLOBAL DEFAULT    1 func1
11: 00000000    28 FUNC    GLOBAL DEFAULT    1 main
fsheikh
  • I am getting "hello: unsupported ELF file type 2" ... (compiled with arm-oe-linux-gnueabi/4.9.2/) – IvanDi Oct 18 '19 at 09:52
  • Did you try with arm-none-eabi-* tools? A tool-chain like https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-rm/downloads – fsheikh Oct 22 '19 at 14:05
  • Sorry for shamelessly asking, but can you please upvote the answer as well, if it was useful to you. :D – fsheikh Oct 24 '19 at 07:29
  • Sure. It is useful and educational, although it doesn't work in my very case (gcc cross compiler for oelinux target). – IvanDi Oct 24 '19 at 12:20
  • I tested this solution and it crashes the program: no execution, no gdb, always a segmentation fault. – husin alhaj ahmade Jan 13 '23 at 05:36
2

The problem is that .o's are not fully linked yet, and most references are still symbolic. Binaries (shared libraries and executables) are one step closer to finally linked code.

Linking to a shared lib doesn't mean you must load it via the dynamic lib loader. The suggestion is rather that writing your own loader for a binary or shared lib might be simpler than writing one for a .o.

Another possibility would be to customize the linking process yourself: call the linker and have it link the code to be loaded at some fixed address. You might also look at how bootloaders, for example, are prepared, which also involves a basic linking step to do exactly this (fix a piece of code to a known load address).

If you don't link to a fixed address and want to relocate at runtime, you will have to write a basic linker that takes the object file and relocates it to the destination address by applying the appropriate fixups.

I assume you already have it, seeing as this is your master's thesis, but this book: http://www.iecc.com/linker/ is the standard introduction to the subject.

Marco van de Voort
  • I had actually also considered customising the linking process, which is what I asked in the question here: http://stackoverflow.com/questions/9508290/how-to-specify-base-addresses-for-sections-when-linking-or-alternatively-how-to. If I was able to link sections at a certain address, I think I would be able to copy them to the executable using `libbfd`. Do you know of a tool or linking option which would allow what you are suggesting (linking sections - not symbols - to fixed addresses)? – Mike Kwan Mar 02 '12 at 15:30
  • As already said in the other question: linker resource files are the way to go. – Marco van de Voort Mar 07 '12 at 11:51
1

You must make room for the relocatable code in the executable by extending the executable's text segment, just like a virus infection. Then, after writing the relocatable code into that space, update the symbol table by adding symbols for everything in the relocatable object, and apply the necessary relocation computations. I've written code that does this pretty well with 32-bit ELF files.
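
To illustrate what "apply the necessary relocation computations" means, here is a rough sketch for the x86-64 case from the question, not the 32-bit code mentioned above. It handles only two relocation types from the x86-64 psABI: R_X86_64_64 (type 1, value S + A written as 64 bits) and R_X86_64_PC32 (type 2, value S + A - P written as 32 bits), where S is the symbol's final address, A the addend and P the address of the patched location. The struct reloc record and the names apply_reloc and put_le are invented for this example; a real tool would read Elf64_Rela entries and resolve the symbol addresses itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Relocation type numbers from the x86-64 psABI. */
#define RELOC_X86_64_64   1  /* S + A,     64-bit field */
#define RELOC_X86_64_PC32 2  /* S + A - P, 32-bit field */

/* Hypothetical, simplified relocation record with the symbol resolved. */
struct reloc {
    uint64_t offset;    /* offset of the field inside the injected section */
    uint32_t type;      /* one of the RELOC_* values above */
    uint64_t sym_addr;  /* S: final address of the referenced symbol */
    int64_t  addend;    /* A */
};

/* Store the low n bytes of v at p in little-endian order. */
static void put_le(unsigned char *p, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

/* Patch one relocation in `sec`, which will live at address `sec_addr`
 * in the rewritten executable. Returns 0 on success, -1 if the type is
 * unsupported, the field is out of bounds, or the value does not fit. */
int apply_reloc(unsigned char *sec, size_t sec_len, uint64_t sec_addr,
                const struct reloc *r)
{
    assert(sec != NULL && r != NULL);

    uint64_t place = sec_addr + r->offset;               /* P */
    uint64_t value = r->sym_addr + (uint64_t)r->addend;  /* S + A */

    switch (r->type) {
    case RELOC_X86_64_64:
        if (r->offset > sec_len || sec_len - r->offset < 8)
            return -1;
        put_le(sec + r->offset, value, 8);
        return 0;
    case RELOC_X86_64_PC32: {
        if (r->offset > sec_len || sec_len - r->offset < 4)
            return -1;
        int64_t rel = (int64_t)(value - place);          /* S + A - P */
        if (rel < INT32_MIN || rel > INT32_MAX)
            return -1;                                   /* target too far */
        put_le(sec + r->offset, (uint64_t)rel, 4);
        return 0;
    }
    default:
        return -1;
    }
}
```

For a call instruction (opcode e8 followed by a 4-byte displacement) the compiler typically emits a PC-relative relocation at offset 1 with addend -4, so that the stored displacement is relative to the end of the instruction. Making room in the text segment and adding the new symbols to the symbol table are separate steps that this sketch does not cover.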

elfmaster
  • Welcome to Stack Overflow. Please demonstrate some of this code you've written to solve this problem - it's all very well telling us you have it, but it's not helping right now. – michaelb958--GoFundMonica Dec 16 '13 at 06:41
0

Have you looked at the DyninstAPI? It appears support was recently added for linking a .o into a static executable.

From the release site:

Binary rewriter support for statically linked binaries on x86 and x86_64 platforms

80x25
  • Thanks for this link. I had seen `Dyninst` before but was not aware it did static binary rewriting too. I shall look at this and update later. – Mike Kwan Apr 06 '12 at 12:43
0

You cannot do this in any practical way. The intended solution is to make that object into a shared lib and then call dlopen on it.

bmargulies
  • Thanks for your answer. Please see my comments to Dan Fego. Specifically, this is a requirement I cannot change. I'm not sure that it cannot be done 'in a practical way', since the existing EEL tool does this. – Mike Kwan Feb 26 '12 at 02:44
  • I don't know what lunatic defined your requirements, but insisting that a .o be pullable in instead of a .so containing it meets my definition of 'lunatic'. My definition of 'practical' is 'with a level of effort remotely appropriate'. If your management wants you to spend a ton of time to achieve this, you have my sympathy. – bmargulies Feb 26 '12 at 18:28
  • You have my sympathy. Your professor seems to have a problem sorting out the interesting research problem from the boring infrastructure. – bmargulies Feb 26 '12 at 19:21