I'm trying to mock a static function without modifying the source code. This is because we have a large legacy base of code and we would like to add test code without requiring developers to go through and change a bunch of the original code.

Using objcopy, I can play with functions between object files, but I can't affect internal linkages. In other words, in the code below, I can get main.cpp to call a mocked up foo() from bar.c, but I cannot get UsesFoo() to call the mocked up foo() from bar.c.

I understand this is because foo() is already defined within foo.c. Aside from changing the source code, is there any way I can use ld or another tool to rip out foo() so that final linking pulls it in from my bar.c?

foo.c

#include <stdio.h>

static void foo()
{
    printf("static foo\n");
}

void UsesFoo()
{
    printf("UsesFoo(). Calling foo()\n");
    foo();
}

bar.c

#include <stdio.h>

void foo()
{
    printf("I am the foo from bar.c\n");
}

main.cpp

#include <iostream>

extern "C" void UsesFoo();
extern "C" void foo();

using namespace std;

int main()
{
    cout << "Calling UsesFoo()\n\n";
    UsesFoo();
    cout << "Calling foo() directly\n";
    foo();
    return 0;
}

compiling:

gcc -c foo.c
gcc -c bar.c
g++ -c main.cpp
(Below simulates how we consume code in the final output)
ar cr libfoo.a foo.o
ar cr libbar.a bar.o
g++ -o prog main.o -L. -lbar -lfoo
This works because the foo() from libbar.a gets included first, but doesn't affect the internal foo() in foo.o

I have also tried:

gcc -c foo.c
gcc -c bar.c
g++ -c main.cpp
(Below simulates how we consume code in the final output)
ar cr libfoo.a foo.o
ar cr libbar.a bar.o
objcopy --redefine-sym foo=_redefinedFoo libfoo.a libfoo-mine.a
g++ -o prog main.o -L. -lbar -lfoo-mine
This produces the same effect. main will call foo() from bar, but UsesFoo() still calls foo() from within foo.o
Maxthecat
  • `static void foo()` won't be visible outside of `foo.c` even in the same library - it's totally inaccessible from outside the library. The symbol name `foo` probably doesn't even exist. – Andrew Henle Dec 21 '21 at 17:14
  • ["We can solve any problem by introducing an extra level of indirection."](https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering) Without a level of indirection, you cannot force internal callers to use your mocked version. The only way I can think of that doesn't involve touching the internal code is to write a code processor that runs as part of your build process to create the *actual code* that gets compiled. From there you can tweak it to replace calls to `foo`. I don't know if this fits your use case though; it's probably best to change the legacy code somehow. – AndyG Dec 21 '21 at 17:23
  • @AndrewHenle The static symbol is definitely visible. You can see it with a "readelf -s foo.o", but it's defined LOCAL, which is exactly as you would expect. I tried using objcopy to make it global and then redefine its name, but it didn't change the outcome. – Maxthecat Dec 21 '21 at 17:39
  • @AndyG Thanks, I may have to go that route, although I was hoping to avoid it. – Maxthecat Dec 21 '21 at 17:39
  • @Maxthecat *The static symbol is definitely visible.* This time, for this compilation. Try changing the optimization level, or stripping the resulting binary. Static functions are not meant to be visible outside of the single compilation unit, so they don't even have to exist in the final binary as symbols at all. And the fact that someone took the time to make them static means they have names that were never meant to be visible. Given that all functions in C reside in one single namespace, blindly changing functions never meant to be visible so that they are visible is highly risky. – Andrew Henle Dec 21 '21 at 17:43
  • @AndrewHenle Your criticism is not applicable. I am aware of what my build is doing. We are not changing the optimization level or stripping the resulting binary at this point in the operation. I'm aware that static functions aren't "meant" to be visible. I'm trying to see if we can bend them to our will for dependency injection. – Maxthecat Dec 21 '21 at 17:45
  • @Maxthecat *Your criticism is not applicable* You ***hope***. And that's an awfully tenuous thread to depend on when testing on a "large legacy base of code". How do you intend to ensure that the test results from whatever hack you might come up with accurately reflect what your production, unhacked code actually does? Because you won't be testing the unhacked code. – Andrew Henle Dec 21 '21 at 17:50

2 Answers

I think you can try the linker's --wrap option, which gcc passes through as -Wl,--wrap=foo. For an example of using it, see: How to wrap functions with the `--wrap` option correctly?

I tried --wrap with a static function and saw that it still works, except that I can't call the original function through __real_foo(). If you can accept that limitation, you can try this.

main.c

#include <stdio.h>

//extern int __real_foo();
extern int foo();
int __wrap_foo() {
    printf("wrap foo\n");
    //__real_foo();
    return 0;
}

int main () {
    printf("foo:");
    foo();
    printf("wrapfoo:");
    __wrap_foo();

    return 0;
}

foo.c

#include <stdio.h>
static int foo() {
    printf("foo\n");
    return 0;
}

terminal output:

└─[0] <> gcc main.c foo.c -Wl,--wrap=foo -o main && ./main 
foo:wrap foo
wrapfoo:wrap foo
┌─[longkl@VN] - [~/test] - [2021-12-22 10:13:54]
└─[0] <> gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
long.kl

long.kl's answer works if you're willing to change the source code. Unfortunately, because we want to keep source code as pristine as possible, this was not usable for us.

Despite what AndrewHenle thinks in his responses, we can rewrite the object file to allow us to overwrite the static function. This requires understanding and parsing the ELF format the object file is written with.

The chief issue is that functions within your object file will use relative jumps/branches/calls to addresses in the text segment. In other words, let's assume we have the following code:

#include <stdio.h>

static void foo() 
{
    printf("static foo\n");
}

void UsesFoo()
{
    printf("UsesFoo(). Calling foo()\n");
    foo();
}

In this case, with no optimizations ("gcc -c foo.c"), this produces an object file, foo.o, which has the following disassembly:

objdump -d foo.o

foo.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <foo>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # b <foo+0xb>
   b:   e8 00 00 00 00          callq  10 <foo+0x10>
  10:   90                      nop
  11:   5d                      pop    %rbp
  12:   c3                      retq   

0000000000000013 <UsesFoo>:
  13:   55                      push   %rbp
  14:   48 89 e5                mov    %rsp,%rbp
  17:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # 1e <UsesFoo+0xb>
  1e:   e8 00 00 00 00          callq  23 <UsesFoo+0x10>
  23:   b8 00 00 00 00          mov    $0x0,%eax
  28:   e8 d3 ff ff ff          callq  0 <foo>
  2d:   90                      nop
  2e:   5d                      pop    %rbp
  2f:   c3                      retq   

Take a look at the instructions at 0xb and 0x1e. Those are the calls that printf() in the C code was translated to. You'll notice that after the opcode 0xe8, the rest of the bytes are 0x00. These are placeholders: the linker will fill them in during final linking so that the call reaches puts (assuming this is a static linkage).

Now notice that the call instruction at 0x28 carries the bytes d3 ff ff ff after the opcode. If foo() were a non-static function, we'd see the same 0x00 placeholder bytes, but here the assembler has already resolved the call. Read as a little-endian 32-bit two's complement value, 0xffffffd3 is -45 (-0x2d). It is a displacement relative to the address of the next instruction, 0x2d, so the call lands at 0x2d - 0x2d = 0, which is the start of foo(). This means our text section (code) has that relative offset hardcoded, with no relocation entry for the linker to act on.
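The displacement arithmetic can be checked with a small C sketch. The function names here (rel32_of, call_rel32_target, encode_rel32) are made up for illustration; they are not part of binutils or any library. The sketch only handles the 5-byte 0xe8 call form seen in the disassembly above.

```c
#include <assert.h>
#include <stdint.h>

/* Read the signed 32-bit little-endian displacement that follows the
 * opcode byte of a 5-byte `call rel32` instruction. */
int32_t rel32_of(const uint8_t *insn)
{
    uint32_t u = (uint32_t)insn[1]
               | ((uint32_t)insn[2] << 8)
               | ((uint32_t)insn[3] << 16)
               | ((uint32_t)insn[4] << 24);
    return (int32_t)u;
}

/* Target of a `call rel32` located at address addr: the displacement is
 * relative to the address of the next instruction (addr + 5). */
int64_t call_rel32_target(const uint8_t *insn, int64_t addr)
{
    assert(insn[0] == 0xe8);
    return addr + 5 + rel32_of(insn);
}

/* Inverse operation: write opcode + rel32 at insn so that an instruction
 * located at addr transfers control to target. */
void encode_rel32(uint8_t *insn, uint8_t opcode, int64_t addr, int64_t target)
{
    uint32_t u = (uint32_t)(int32_t)(target - (addr + 5));
    insn[0] = opcode;
    insn[1] = (uint8_t)(u & 0xff);
    insn[2] = (uint8_t)((u >> 8) & 0xff);
    insn[3] = (uint8_t)((u >> 16) & 0xff);
    insn[4] = (uint8_t)((u >> 24) & 0xff);
}
```

For the bytes e8 d3 ff ff ff at address 0x28, rel32_of() gives -45 and call_rel32_target() gives 0, the start of foo().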

In order to get around this, we will have to rewrite the ELF to change how the call to foo() is handled. There are a couple of options:

  1. We add another .text.[somename] section to our file that contains code to act as a trampoline, e.g. FakeFoo(). We then rewrite the first instruction of foo() to jump immediately to FakeFoo(). Hacky, but it probably works, at the cost of losing debugging information.

  2. The .rela.text section contains the relocations for the code in .text. These tell the linker which bytes it needs to patch once final locations are known. When the linker processes this section, it overwrites the bytes at each "Offset" with a value calculated from the final address of the named symbol. For our binary, we see:

readelf -r foo.o

Relocation section '.rela.text' at offset 0x280 contains 4 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000007  000500000002 R_X86_64_PC32     0000000000000000 .rodata - 4
00000000000c  000b00000004 R_X86_64_PLT32    0000000000000000 puts - 4
00000000001a  000500000002 R_X86_64_PC32     0000000000000000 .rodata + 7
00000000001f  000b00000004 R_X86_64_PLT32    0000000000000000 puts - 4

The offsets 0xc and 0x1f are the displacement fields of the call instructions at 0xb and 0x1e in foo() and UsesFoo(), one byte past each 0xe8 opcode. This is where those calls look for the puts() function (note: the compiler translated our call to "printf()" to use "puts()").

So, we can add another entry here for the call at instruction 0x28 (its displacement field starts at offset 0x29), and have the linker look for another function called "foo()" somewhere in the code that is not declared static.
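As a rough sketch of what such an entry contains, the C below decodes the "Info" column from the readelf output and builds the extra entry for the call at 0x28. Rela64 is a local mirror of Elf64_Rela from <elf.h>, and the helper names are hypothetical. The symbol index 6 passed in the usage note assumes foo keeps its current slot in .symtab, which may not hold once the symbol table is rewritten.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of Elf64_Rela, so the sketch does not depend on <elf.h>. */
typedef struct {
    uint64_t r_offset;  /* where in the section to patch */
    uint64_t r_info;    /* symbol index (high 32 bits) + type (low 32 bits) */
    int64_t  r_addend;  /* constant added to the symbol value */
} Rela64;

enum { RELOC_PC32 = 2, RELOC_PLT32 = 4 };  /* R_X86_64_PC32, R_X86_64_PLT32 */

uint32_t rela_sym(uint64_t info)  { return (uint32_t)(info >> 32); }
uint32_t rela_type(uint64_t info) { return (uint32_t)(info & 0xffffffffu); }

/* Build a relocation for a 5-byte `call rel32` located at call_addr.
 * The patched field starts one byte after the opcode. The addend is -4
 * because the displacement is measured from the end of the 4-byte field,
 * not from its start. */
Rela64 make_call_reloc(uint64_t call_addr, uint32_t sym_index)
{
    Rela64 r;
    assert(sym_index != 0);  /* index 0 is the reserved null symbol */
    r.r_offset = call_addr + 1;
    r.r_info   = ((uint64_t)sym_index << 32) | RELOC_PLT32;
    r.r_addend = -4;
    return r;
}
```

make_call_reloc(0x28, 6) yields offset 0x29, info 0x000600000004 and addend -4, the same shape as the two existing puts entries. Appending it also means growing .rela.text and fixing up the section header and any file offsets that follow, which this sketch does not attempt.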

This will also require fixing up the .symtab entry of the ELF file, because it will contain a reference to the local function foo():

readelf -s foo.o

Symbol table '.symtab' contains 13 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS foo.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4 
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 
     6: 0000000000000000    19 FUNC    LOCAL  DEFAULT    1 foo
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    7 
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    8 
     9: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
    10: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND _GLOBAL_OFFSET_TABLE_
    11: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
    12: 0000000000000013    29 FUNC    GLOBAL DEFAULT    1 UsesFoo

In order to make the linker look for foo() outside of this object file, we'll have to change the entry for foo to be a "NOTYPE GLOBAL" "UND" type, so the linker doesn't think that it exists in this file.
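A minimal sketch of that symbol change, assuming the entry has already been read into memory. Sym64 mirrors Elf64_Sym from <elf.h>, and the function and constant names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of Elf64_Sym, so the sketch does not depend on <elf.h>. */
typedef struct {
    uint32_t st_name;   /* offset of the name in .strtab */
    uint8_t  st_info;   /* binding (high nibble) + type (low nibble) */
    uint8_t  st_other;  /* visibility */
    uint16_t st_shndx;  /* section the symbol is defined in */
    uint64_t st_value;
    uint64_t st_size;
} Sym64;

enum { BIND_LOCAL = 0, BIND_GLOBAL = 1 };  /* STB_LOCAL, STB_GLOBAL */
enum { TYPE_NOTYPE = 0, TYPE_FUNC = 2 };   /* STT_NOTYPE, STT_FUNC */
enum { SECTION_UNDEF = 0 };                /* SHN_UNDEF */

uint8_t sym_info(unsigned bind, unsigned type)
{
    return (uint8_t)((bind << 4) | (type & 0xf));
}

/* Turn a defined symbol into an undefined global reference, so the linker
 * has to resolve it from another object file. st_name is left alone, so
 * the symbol is still called "foo". */
void make_undefined_global(Sym64 *s)
{
    assert(s != NULL);
    s->st_info  = sym_info(BIND_GLOBAL, TYPE_NOTYPE);
    s->st_shndx = SECTION_UNDEF;
    s->st_value = 0;
    s->st_size  = 0;
}
```

One complication: ELF requires all LOCAL symbols to come before the GLOBAL ones in .symtab, and the section header's sh_info field records the index of the first non-local symbol. A real tool would therefore also have to move the entry, update sh_info, and renumber any relocations that refer to shifted symbol indices.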

There's another section, .rela.eh_frame, that you'll also want to pay attention to. It holds the relocations for .eh_frame, the call frame information used for stack unwinding and by debuggers.

Finally, this approach requires you to go through your binary, search for opcodes that correspond to jumps/calls/branches, and create/fix entries so that the linker will look for "foo()" in other object files.
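The search step might look like the sketch below, which scans a copy of .text for 0xe8 calls whose computed destination is a given address. find_calls_to is a hypothetical helper. A plain byte scan like this can misfire on 0xe8 bytes that sit inside other instructions or data, and it ignores jumps and indirect calls, so a real tool would more likely use a disassembler and skip sites already covered by a relocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Record the offsets of every `call rel32` (opcode 0xe8) in text[0..len)
 * whose destination equals target. Returns the number of matches found;
 * at most max_out offsets are written to out. */
size_t find_calls_to(const uint8_t *text, size_t len, uint64_t target,
                     size_t *out, size_t max_out)
{
    size_t found = 0;
    assert(text != NULL && out != NULL);
    for (size_t i = 0; i + 5 <= len; i++) {
        if (text[i] != 0xe8)
            continue;
        uint32_t u = (uint32_t)text[i + 1]
                   | ((uint32_t)text[i + 2] << 8)
                   | ((uint32_t)text[i + 3] << 16)
                   | ((uint32_t)text[i + 4] << 24);
        /* The displacement is relative to the next instruction, i + 5. */
        int64_t dest = (int64_t)i + 5 + (int32_t)u;
        if (dest == (int64_t)target) {
            if (found < max_out)
                out[found] = i;
            found++;
        }
    }
    return found;
}
```

Run over the 48 bytes of .text shown in the disassembly with target 0, this reports a single site at offset 0x28. The two puts calls are not matched because their placeholder displacement of zero points at the following instruction.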

All of this is just to get the linker to look for foo() in a different file, so that you can replace the original foo() with one that you've written. If you want to call the original foo() after all of this, you'd probably want to rename the original foo() to something else, e.g. __real_foo(), and set up the symbol table (.symtab) so that your fake foo() can do something like:

bar.c:

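#include <stdio.h>

/* The original static foo(), renamed and exported by rewriting foo.o. */
void __real_foo(void);

void foo()
{
  printf("I am the fake foo! Calling the real foo!\n");
  __real_foo();
}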

Ultimately, it would be far better (and way easier) if your developers moved the bulk of their functionality from static functions to global ones. However, if you want to rewrite the object file after it has been created, under the right circumstances, it can be done with a fair amount of effort.

Maxthecat