8

Suppose I have a C file with no external dependencies and only a const data section. I would like to compile this file and get a binary blob that I can load in another program, where the function would be called through a function pointer.

Let's take an example. Here is a fictional binary module, f1.c

static const unsigned char mylut[256] = {
    [0 ... 127] = 0,
    [128 ... 255] = 1,
};

void f1(unsigned char * src, unsigned char * dst, int len)
{
    while(len) {
        *dst++ = mylut[*src++];
        len--;
    }
}

I would like to compile it to f1.o, then f1.bin, and use it like this in prog.c

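/* pointer type matching the signature of f1 */
typedef void (*f1_type_ptr)(unsigned char *, unsigned char *, int);

int somefunc(unsigned char *src, unsigned char *dst, int len) {
    unsigned char  * codedata;
    f1_type_ptr  f1_ptr;
    /* open f1.bin, and read it into codedata */

    /* set function pointer to beginning of loaded data */
    f1_ptr = (f1_type_ptr)codedata;

    /* call! */
    f1_ptr(src, dst, len);
    return 0;
}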

I suppose going from f1.c to f1.o involves -fPIC to get position independence. What flags or linker script can I use to go from f1.o to f1.bin?

Clarification :

I know about dynamic linking, but it is not possible in this case. The only "linking" step should be casting a function pointer to the loaded data, if that is possible.

Please assume there is no OS support. If I could, I would, for example, write f1 in assembly with PC-relative addressing.

shodanex
  • Do you know that you can use shared object files? You compile your .c file to a .so, load it into your program with `dlopen()`, and get the function pointer with `dlsym()`. Then you can call it. – Didier Trosset Aug 27 '12 at 08:26
  • Let's forget libc and dynamic linking – shodanex Aug 27 '12 at 08:33
  • You want the `f1.bin` thingie loaded dynamically (i.e. at runtime)? Then you have to build a shared library and use dlopen()+dlsym() or another module loader (like gmodule). Trying to do it some other way is likely to be hard, and may be refused because of potential security threats (executing the data segment and so on). – Michał Górny Aug 27 '12 at 08:36

3 Answers

14

First of all, as others said, you should consider using a DLL or SO.

That said, if you really want to do this, you need to replace the linker script. Something like this (not very well tested, but I think it works):

ENTRY(_dummy_start)
SECTIONS
{
    _dummy_start = 0;
    _GLOBAL_OFFSET_TABLE_ = 0;
    .all : { 
        _all = .;
        LONG(f1 - _all);
        *( .text .text.* .data .data.* .rodata .rodata.* ) 
    }
}

Then compile with:

$ gcc -c -fPIC test.c

Link with:

$ ld -T script.ld test.o -o test.elf

And extract the binary blob with:

$ objcopy -j .all -O binary test.elf test.bin

Some explanation of the script is probably welcome:

  • ENTRY(_dummy_start): That just avoids the warning about the program not having an entry point.
  • _dummy_start = 0;: That defines the symbol used in the previous line. The value is not used.
  • _GLOBAL_OFFSET_TABLE_ = 0;: That prevents another linker error. I don't think you really need this symbol, so it can be defined as 0.
  • .all: That's the name of the section that will collect all the bytes of your blob. In this sample it will be all the .text, .data and .rodata sections together. You may need some more if you have complicated functions; in that case objdump -x test.o is your friend.
  • LONG(f1 - _all): Not really needed, but you want to know the offset of your function into the blob, don't you? You cannot assume that it will be at offset 0. With this line the very first 4 bytes in the blob will be the offset of the symbol f1 (your function). A 4-byte offset is enough for any blob smaller than 4 GiB, even on a 64-bit target; if you replace LONG with QUAD to get an 8-byte offset, the loader must read 8 bytes instead of 4.
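To make the header layout concrete, here is a minimal sketch of how a loader might read that 4-byte offset defensively. The helper name `blob_entry` is hypothetical (it is not part of the answer), and the sketch assumes the blob is parsed on a machine with the same byte order as the target it was linked for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads the 4-byte offset that LONG(f1 - _all)
 * places at the start of the blob and returns a pointer to that offset
 * inside the buffer. Returns NULL if the header is truncated or the
 * offset falls outside the blob.
 * Assumption: host and target share the same byte order. */
const unsigned char *blob_entry(const unsigned char *blob, size_t size)
{
    uint32_t offs;

    assert(blob != NULL);              /* caller must pass a real buffer */

    if (size < sizeof offs)
        return NULL;                   /* no room for the header */

    /* memcpy avoids unaligned access and aliasing problems */
    memcpy(&offs, blob, sizeof offs);

    /* The function cannot start inside the header or past the end */
    if (offs < sizeof offs || offs >= size)
        return NULL;

    return blob + offs;
}
```

The returned pointer still has to be cast to the function pointer type, and the memory it points into must be executable, before it can be called.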

UPDATE: And now a quick'n'dirty test (it works!):

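#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* same signature as f1 in the blob */
typedef void (*f1_t)(unsigned char *src, unsigned char *dst, int len);

int main(void)
{
    /* valloc() returns page-aligned memory, which mprotect() requires */
    unsigned char *blob = valloc(4096);
    FILE *f = fopen("test.bin", "rb");
    if (!blob || !f) {
        perror("test.bin");
        return 1;
    }
    /* assumes the whole blob fits in one 4096-byte page */
    size_t n = fread(blob, 1, 4096, f);
    fclose(f);

    /* the first 4 bytes hold the offset of f1 (LONG in the linker script) */
    unsigned offs;
    if (n < sizeof offs)
        return 1;
    memcpy(&offs, blob, sizeof offs);
    f1_t f1 = (f1_t)(blob + offs);
    mprotect(blob, 4096, PROT_READ | PROT_WRITE | PROT_EXEC);

    unsigned char txt[] = "¡hello world!";
    unsigned char txt2[sizeof(txt)] = "";
    f1(txt, txt2, sizeof(txt) - 1);
    printf("%s\n%s\n", txt, txt2);
    return 0;
}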
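The quick test above hard-codes a single 4096-byte page and leaves the memory writable and executable at the same time. As a hedged variation, the sketch below sizes the mapping from the file and switches it from writable to executable once the bytes are in place. The helper name `load_blob` is hypothetical, and the code assumes a POSIX host with `mmap`, as in the test above; it does not apply to the no-OS case from the question.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical helper: copies a whole file into a fresh anonymous
 * mapping, then changes the mapping from read/write to read/execute.
 * Returns NULL on any failure; on success stores the size in *size_out. */
void *load_blob(const char *path, size_t *size_out)
{
    assert(path != NULL && size_out != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    /* Find the file size by seeking to the end */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long size = ftell(f);
    if (size <= 0) { fclose(f); return NULL; }
    rewind(f);

    /* Page-aligned, writable memory to receive the blob */
    void *mem = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) { fclose(f); return NULL; }

    size_t got = fread(mem, 1, (size_t)size, f);
    fclose(f);

    /* Drop write permission before the code is ever executed */
    if (got != (size_t)size ||
        mprotect(mem, (size_t)size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, (size_t)size);
        return NULL;
    }

    *size_out = (size_t)size;
    return mem;
}
```

The caller would then read the 4-byte offset at the start of the returned memory and cast the resulting address to the function pointer type, as the test program does.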
rodrigo
  • `_GLOBAL_OFFSET_TABLE_` is probably needed if references to other symbols (such as a standard library) are made. – AProgrammer Aug 27 '12 at 09:53
  • @AProgrammer: But the OP specifically says _no external dependency_, so it is probably not needed. If he does access any library, then he will have to statically link all the libraries into the blob or do the dynamic linking himself... and that would be... complicated. – rodrigo Aug 27 '12 at 09:58
  • You are my hero – None Sep 07 '21 at 19:18
2

You should consider building a shared library (.dll on Windows, or .so on Linux).

Build the library like this:

gcc -c -fPIC test.c
gcc -shared test.o -o libtest.so

If you want to load the library dynamically from your code, have a look at the functions dlopen(3) and dlsym(3).

Or, if you want to link the library at build time, build the program like this (-L. tells the linker to look for libtest.so in the current directory):

gcc -c main.c
gcc main.o -o <binary name> -L. -ltest

EDIT:

I'm really not sure about what follows, but it may give you a clue for further research...

If you don't want to use dlopen and dlsym, you can try to read the symbol table from the .o file in order to find the function address, and then mmap the object file into memory with read and execute permissions. You should then be able to execute the loaded code at the address you found. But be careful with any other dependencies this code may have.

See the elf(5) man page.

phsym
  • I precisely don't want to use dynamic linking, edited the question accordingly – shodanex Aug 27 '12 at 08:31
  • **dlopen** and **dlsym** don't imply dynamic linking (since your program won't be linked to the library, your binary won't depend on it, and the library won't be necessary during compilation). The function **dlopen** lets you load a library, and **dlsym** will return the function address, based on the symbol name you provided. – phsym Aug 27 '12 at 08:36
  • dlsym means calling the OS provided dynamic linker to analyse the library file, perform mapping etc.... – shodanex Aug 27 '12 at 08:49
0

Use a cast function pointer.

Here's an example:

#include <stdio.h>

int main()
{
    unsigned char *dst, *src;
    int len;
    void (*f1)(unsigned char *, unsigned char *, int);
    *(void **)(&f1) = 0x..........;
    f1(src,dst,len);
    return 0;
}

To do any more, you'd really need a linker and a dynamic loader.
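As a hedged illustration of the cast itself, the sketch below uses the address of an ordinary function as a stand-in for code that already sits at a known address, since a real address depends on where the blob was loaded. The names `f1_t`, `sample_f1` and `f1_from_address` are hypothetical. Converting an integer to a function pointer is implementation-defined in ISO C, although common compilers on POSIX systems support it.

```c
#include <assert.h>
#include <stdint.h>

typedef void (*f1_t)(unsigned char *src, unsigned char *dst, int len);

/* Stand-in for code that already lives at a known address; it mimics
 * the lookup table from the question (0 below 128, 1 from 128 up). */
void sample_f1(unsigned char *src, unsigned char *dst, int len)
{
    while (len > 0) {
        *dst++ = (unsigned char)(*src++ >= 128);
        len--;
    }
}

/* Hypothetical helper: turns a raw address into a callable pointer.
 * The integer-to-function-pointer conversion is implementation-defined. */
f1_t f1_from_address(uintptr_t addr)
{
    assert(addr != 0);   /* a null address is never callable */
    return (f1_t)addr;
}
```

Calling `f1_from_address((uintptr_t)sample_f1)` yields a pointer that behaves like `sample_f1`; with a loaded blob, the argument would instead be the blob's base address plus the offset of the function.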

S.S. Anne