Can you programmatically detect the size in bytes of a function in C?

Question

I am writing code for an embedded system where it is more efficient to copy code from ROM to the SoC's internal memory and then execute it. Is there any way to programmatically get the size in bytes of a function, so that I can use something like memcpy to copy the function's instructions to internal memory?

Is there a better way to do this?

What tool chain are you using? How big is the ROM? How big is the RAM? — jwdonahue, Jun 30 '18 at 06:40
You can get the address of a function, then maybe if you compute the difference between the addresses of two adjacent functions, you'll get the size of the first one. I'm not sure, just an idea. — Joël Hecht, Jun 30 '18 at 06:54
Yes it’s possible. You can think of using sizeof() function. — danglingpointer, Jun 30 '18 at 06:55
@LethalProgrammer sizeof() returns the size of a data type. So sizeof(function) would return the size of a pointer to said function, not actual function's size in memory. — Dillon Davis, Jun 30 '18 at 07:03
Without knowing what the function does, it's not possible to say how to detect its size. You have to provide the code of the function whose size you want to know, and say which platform your code is running on. — danglingpointer, Jun 30 '18 at 07:06
@JoëlHecht the issue is that when you are writing your C code, you don't know for certain the order in which the linker will place functions in memory. So one function between two others in the C source code might not necessarily appear between them in memory at runtime. This is why I believe it's necessary to read the object file. — Dillon Davis, Jun 30 '18 at 07:08
@LethalProgrammer exactly. I don't believe it's possible to know the size of the function prior to compilation for that reason. I do believe there's probably an easier way to obtain that from the object file than what I've suggested, however I am not aware of any other means to do so. — Dillon Davis, Jun 30 '18 at 07:12
@DillonDavis for functions that are in the same .c file, the compiler will generate one .o (or .obj). The linker won't change anything in the object file; functions will keep the same order, which is _most_ _probably_ the order they appear in the source, but yes, it may depend on the compiler. — Joël Hecht, Jun 30 '18 at 08:11
@JoëlHecht, however the linker decides which virtual address each symbol is assigned to, which can change the order in which they are loaded in memory. Therefore getting the address and taking the difference between surrounding functions may not yield the results you expect. — Dillon Davis, Jun 30 '18 at 08:15
@DillonDavis, applying `sizeof` to a function is actually a "constraint violation", that is a syntax error. — Jens Gustedt, Jun 30 '18 at 09:19
@JensGustedt what compiler are you using? I just tried in GCC, `sizeof(foobar);` for an `int foobar(void);`, with -Wall and -Werror, and it compiled with no warnings or errors. — Dillon Davis, Jun 30 '18 at 09:31
Ah, I see. GCC allows it for diagnostic purposes it would seem, but you are correct. [Link](https://stackoverflow.com/questions/12259101/why-is-the-size-of-a-function-in-c-always-1-byte) for any future readers. — Dillon Davis, Jun 30 '18 at 09:35
@DillonDavis I think this is cleared up by now, but: LethalProgrammer was wrong that `sizeof(f)` gives the size of a function, but you're wrong that it gives the size of a function pointer instead. Just like arrays, `sizeof()` is one place where functions *don't* automatically have their address taken. — Steve Summit, Jun 30 '18 at 10:06
This sounds dubious. It implies that the function code is position-independent or relocatable. Also, what if it calls something? — Martin James, Jun 30 '18 at 10:12
Even if you manage to evaluate the function size, on lots of CPUs and architectures you will have problems running the copied-over code, because that will not be position independent. — tofro, Jun 30 '18 at 13:10
You should reformulate your question. This is an XY problem; you are asking how to determine the size of a function, but it appears that what you really want to do is have the code stored in ROM but executed from RAM. This is a common requirement, and yes, there is a better way. In fact your method is unlikely to work, since the code is not likely to be position independent. Ask a question about what you are trying to achieve, not one about how you are trying to achieve it. — Clifford, Jul 01 '18 at 12:38
I think because of the "embedded" tag this question is not direct duplicate. Referring to the @Clifford answer in mentioned thread (if your toolchain allows for that) you can place a function in separate section (e.g. using linker file). At this moment you know a start address. You have 2 options to get a size of this function: 1st is to create a "dummy" function after that in the same section (specify in the linker file that this function is after the first one). 2nd - If the function and compilation settings won't change often you can just take the size from previous build. — Mikolaj, Jul 04 '18 at 14:03
@Mikolaj I don't see that the embedded tag makes any difference to the question or the answer. Your approach however brings something new to the solution, but only in the sense that it makes the adjacency explicit. That technique is not unique to embedded systems. — Clifford, Jul 04 '18 at 18:32
@Mikolaj, it is academic in any case, because this question is asking about how to do something as a solution to a different problem, even though as a solution it is flawed. The OP would do better to ask how to separate load regions from execution regions - which also involves the linker as it happens. — Clifford, Jul 04 '18 at 18:37
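
The section-based approach from the comments above can be sketched roughly as follows. This is only an illustration under several assumptions: a GCC-compatible compiler, an ELF target, and GNU ld, which automatically defines `__start_NAME` and `__stop_NAME` symbols for a section whose name is a valid C identifier. On a bare-metal target the linker script would normally define such boundary symbols itself. The section name `ramfuncs` and the helpers `add_one`, `ramfuncs_size` and `copy_ramfuncs` are hypothetical names chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Place a function in a dedicated section (the name "ramfuncs" is an assumption). */
#define RAMFUNC __attribute__((section("ramfuncs"), noinline, used))

/* Hypothetical function that we would like to run from internal RAM. */
RAMFUNC int add_one(int x)
{
    return x + 1;
}

/* Section boundary symbols; GNU ld provides these automatically on ELF targets. */
extern const char __start_ramfuncs[];
extern const char __stop_ramfuncs[];

/* Size in bytes of everything placed in the "ramfuncs" section. */
size_t ramfuncs_size(void)
{
    return (size_t)((uintptr_t)__stop_ramfuncs - (uintptr_t)__start_ramfuncs);
}

/* Copy the whole section into dst; returns the number of bytes copied,
 * or 0 if the destination buffer (cap bytes) is too small. */
size_t copy_ramfuncs(void *dst, size_t cap)
{
    size_t n = ramfuncs_size();
    if (n > cap)
        return 0;
    memcpy(dst, __start_ramfuncs, n);
    return n;
}
```

Note that this measures the whole section rather than a single function, so it only gives the size of one function if that function is alone in the section. It also does nothing about position independence: the copied bytes are only safe to execute if the code was built to run from the destination address.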

score 1 · Answer 1 · answered Jun 30 '18 at 07:00

Is there a better way to do this?

There likely is, and I truly hope someone else responds with an answer that provides a simpler approach, but for now I'll shed some details on your suggested method.

If the program is compiled in ELF format (I don't know about other formats), your functions will typically be placed in the .text section of the ELF file. You can use the symbol table to find the function in the text section. To get the size of the function, you might be able to use the st_size member of the Elf64_Sym or Elf32_Sym struct, but I'm not entirely certain that will give the correct size in every case. What you could do instead (a little hacky, admittedly) is iterate through the other symbols, find the one immediately after it, and subtract the two addresses to get the size. Of course you'd have to keep alignment padding in mind, but that's not much of an issue: if you copy a few extra bytes, they won't be executed anyway.
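
As a rough sketch of the st_size lookup described above, the following reads the symbol table of an ELF file and returns the recorded size of a named function. It assumes a Linux host with `<elf.h>`, a 64-bit ELF file whose byte order matches the host, and a binary that has not been stripped of its `.symtab`. The helper names `read_at` and `elf64_function_size` are made up for this example; a 32-bit ELF would need the analogous `Elf32_*` types.

```c
#include <elf.h>     /* Elf64_Ehdr, Elf64_Shdr, Elf64_Sym (Linux/glibc header) */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read `size` bytes at file offset `off` into a newly allocated buffer. */
static void *read_at(FILE *f, long off, size_t size)
{
    void *buf = malloc(size ? size : 1);
    if (buf == NULL)
        return NULL;
    if (fseek(f, off, SEEK_SET) != 0 || fread(buf, 1, size, f) != size) {
        free(buf);
        return NULL;
    }
    return buf;
}

/* Hypothetical helper: returns the st_size of function `name` in the ELF64
 * file at `path`, or -1 if the file is unreadable, is not ELF64, or has no
 * such function symbol. */
long elf64_function_size(const char *path, const char *name)
{
    long result = -1;
    Elf64_Ehdr eh;
    Elf64_Shdr *sh = NULL;
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    /* Validate the ELF header before trusting any offsets in it. */
    if (fread(&eh, 1, sizeof eh, f) != sizeof eh ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shentsize != sizeof(Elf64_Shdr))
        goto out;

    sh = read_at(f, (long)eh.e_shoff, (size_t)eh.e_shnum * sizeof *sh);
    if (sh == NULL)
        goto out;

    /* Look for the static symbol table and its linked string table. */
    for (unsigned i = 0; i < eh.e_shnum && result < 0; i++) {
        if (sh[i].sh_type != SHT_SYMTAB || sh[i].sh_link >= eh.e_shnum)
            continue;
        Elf64_Shdr *strsh = &sh[sh[i].sh_link];
        Elf64_Sym *syms = read_at(f, (long)sh[i].sh_offset, (size_t)sh[i].sh_size);
        char *strs = read_at(f, (long)strsh->sh_offset, (size_t)strsh->sh_size);
        size_t n = syms ? (size_t)(sh[i].sh_size / sizeof *syms) : 0;

        if (strs != NULL && strsh->sh_size > 0)
            strs[strsh->sh_size - 1] = '\0';   /* guard against a bad table */

        for (size_t j = 0; strs != NULL && j < n; j++) {
            if (ELF64_ST_TYPE(syms[j].st_info) == STT_FUNC &&
                syms[j].st_name < strsh->sh_size &&
                strcmp(strs + syms[j].st_name, name) == 0) {
                result = (long)syms[j].st_size;
                break;
            }
        }
        free(syms);
        free(strs);
    }
out:
    free(sh);
    fclose(f);
    return result;
}
```

For a firmware image this would typically run on the build host against the linked ELF file, not on the target. For a quick manual check, `nm -S` and `readelf -s` print the same size field.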

Also keep in mind that some code gets compiled with certain assumptions about its location in memory. You might need to manually patch the GOT and/or PLT if you copy the function directly into memory. You should probably compile the function you want to copy with -fpic or -fPIC for position-independent code, at least in GCC.

If you need more details on how to access the symbol table, or the text section of your ELF, I could add more details.

This all sounds nightmarish. What if it calls something else? — Martin James, Jun 30 '18 at 10:13
OP only asked how to get the function size, so that's the only part I went into detail about. If it calls something else, OP will need to handle resolving those symbols himself. I've written a dynamic loader before, and have experience with symbol relocation and patching the PLT and GOT, so I can explain those explicitly if OP asks. For a high-level overview: OP should compile his source as PIC, memcpy all required function code, and then add any external libraries used to the .dynamic section so the dynamic loader will handle those at runtime. — Dillon Davis, Jun 30 '18 at 18:02

score 0 · Answer 2 · answered Jun 30 '18 at 12:34

With some compilers you can get a function's size by computing the difference between the function's address and the address of another function that immediately follows it.

But it really depends on the compiler. With Visual C++, for example, both functions have to be static functions. With GCC, it no longer works once optimization level -O2 or higher is enabled.

And even if you manage to copy your function elsewhere in memory, you may not be able to use it, especially if it refers to other functions, or to global/static variables, or if the code is not position independent, etc.

So this is a simple solution, it may work in your case, but it can't be considered as a general solution.

Below is an example that works with GCC and Visual C++, tested on Windows 10 and WSL (do not enable optimizations with GCC).

#include <stdio.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif
#ifdef __linux
#include <sys/mman.h>
#endif

// The function to copy
static int fib(int m)
{
    int v1 = 0, v2 = 1, n;

    if (m == 0) return 0;
    for (n = 1; n < m; n++)
    {
        int v = v1 + v2;
        v1 = v2;
        v2 = v;
    }
    return v2;
}

static void endFib(void)
{
    // This function immediately follows the fib function.
    // It exists only so that its address can be used to compute the size of fib.
}

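int main(void)
{
    long sizeFib;
    void *mem = NULL;
    int (*copyFib)(int);

    printf("&fib=%p\n", (void *)fib);
    // Relies on endFib being placed directly after fib, which is not guaranteed.
    sizeFib = (long)((char *)endFib - (char *)fib);
    printf("size of fib : %ld\n", sizeFib);
    if (sizeFib <= 0)
    {
        fprintf(stderr, "endFib was not placed after fib\n");
        return 1;
    }

    printf("fib(8) : %d\n", fib(8));

    // For the example, the copy must be allocated in executable memory.
#ifdef _WIN32
    mem = VirtualAlloc(NULL, sizeFib, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
#endif
#ifdef __linux
    // With MAP_ANONYMOUS the file descriptor argument should be -1.
    mem = mmap(NULL, sizeFib, PROT_EXEC | PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) mem = NULL;
#endif
    if (mem == NULL)
    {
        fprintf(stderr, "could not allocate executable memory\n");
        return 1;
    }
    memcpy(mem, (void *)fib, (size_t)sizeFib);
    copyFib = (int (*)(int))mem;
    printf("&copyFib=%p\n", mem);
    printf("copyFib(8) : %d\n", copyFib(8));

    return 0;
}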
