
I would like to call ARM/ARM64 ASM code from C++. The ASM code contains a syscall and a relocation against an external function. The exact ARM architecture is not important here; I just want to understand how to solve my problem conceptually.

I have the following ASM syscall wrapper (output of objdump -d), which is called inside a shared library:

 198:   d28009e8    mov x8, #0x4f                   // #79
 19c:   d4000001    svc #0x0
 1a0:   b140041f    cmn x0, #0x1, lsl #12
 1a4:   da809400    cneg    x0, x0, hi
 1a8:   54000008    b.hi    0 <__set_errno_internal>
 1ac:   d65f03c0    ret

This piece of code calls the fstatat64 syscall and sets errno through the external __set_errno_internal function. readelf -r shows the following relocation for __set_errno_internal:

00000000000001a8 R_AARCH64_CONDBR19  __set_errno_internal
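To make the relocation concrete: R_AARCH64_CONDBR19 asks the linker to store (target address - address of the branch) / 4 in the signed 19-bit immediate field, bits 5 to 23, of the `b.cond` instruction. A rough sketch of that fix-up, assuming a zero addend; `apply_condbr19` and `kImm19Mask` are hypothetical names for illustration, not part of any toolchain API:

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Mask for the 19-bit immediate of an AArch64 B.cond instruction.
constexpr std::uint32_t kImm19Mask = 0x7FFFFu;

// Hypothetical helper: returns `insn` with its imm19 field set so that the
// branch lands `byte_offset` bytes away from the branch instruction itself.
// `byte_offset` = target address - address of the branch (zero addend assumed).
std::uint32_t apply_condbr19(std::uint32_t insn, std::int64_t byte_offset) {
    if (byte_offset % 4 != 0)
        throw std::invalid_argument("branch target must be 4-byte aligned");
    // A signed 19-bit word offset covers -1 MiB up to (but excluding) +1 MiB.
    if (byte_offset < -(1 << 20) || byte_offset >= (1 << 20))
        throw std::out_of_range("target is outside the range of B.cond");

    // The division is exact because of the alignment check above; the cast
    // wraps negative values, and the mask keeps the low 19 bits.
    const std::uint32_t imm19 =
        static_cast<std::uint32_t>(byte_offset / 4) & kImm19Mask;
    const std::uint32_t patched = (insn & ~(kImm19Mask << 5)) | (imm19 << 5);

    // Opcode and condition bits must be untouched.
    assert((patched & ~(kImm19Mask << 5)) == (insn & ~(kImm19Mask << 5)));
    return patched;
}
```

For example, `apply_condbr19(0x54000008u, 8)` yields `0x54000048`, a `b.hi` that jumps two instructions forward. The range check also shows why the target has to end up within about 1 MiB of the branch.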

I want to call this piece of code from C++, so I converted it to a buffer:

  unsigned char machine_code[] __attribute__((section(".text"))) =
        "\xe8\x09\x80\xd2"   // mov  x8, #0x4f
        "\x01\x00\x00\xd4"   // svc  #0x0
        "\x1f\x04\x40\xb1"   // cmn  x0, #0x1, lsl #12
        "\x00\x94\x80\xda"   // cneg x0, x0, hi
        "\x08\x00\x00\x54"   // Here we have mentioned relocation
        "\xc0\x03\x5f\xd6";  // ret
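For reference, the byte order in the buffer follows from AArch64 instruction words being stored little-endian. A minimal sketch of that conversion, using the six words from the objdump listing; the names `kWrapperInsns` and `to_little_endian_bytes` are made up for illustration:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// The six instruction words exactly as printed by objdump -d above.
const std::array<std::uint32_t, 6> kWrapperInsns = {{
    0xd28009e8u,  // mov  x8, #0x4f
    0xd4000001u,  // svc  #0x0
    0xb140041fu,  // cmn  x0, #0x1, lsl #12
    0xda809400u,  // cneg x0, x0, hi
    0x54000008u,  // b.hi (unresolved, offset field is zero)
    0xd65f03c0u,  // ret
}};

// Serialize 32-bit instruction words into bytes, least significant byte first.
template <std::size_t N>
std::array<unsigned char, 4 * N> to_little_endian_bytes(
        const std::array<std::uint32_t, N>& words) {
    std::array<unsigned char, 4 * N> bytes{};
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t b = 0; b < 4; ++b) {
            bytes[4 * i + b] =
                static_cast<unsigned char>((words[i] >> (8 * b)) & 0xFFu);
        }
    }
    return bytes;
}
```

Calling `to_little_endian_bytes(kWrapperInsns)` gives the same 24 bytes as the string literal (which additionally carries a trailing NUL that is never executed, since it sits after `ret`).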

EDIT: Important detail: I chose to use a buffer (not inline assembly etc.) because I want to run extra processing on it before it is executed as machine code (for example, a decryption function over the string literal as a software protection mechanism, but that's not important here).

Afterwards, the buffer can be cast to a function pointer and called directly to execute the machine code. Obviously there is a problem with the relocation: it's not fixed up automatically and I have to fix it manually. But I can't do that at run time because the .text section is read-only and executable.

Although I have almost full control over the source code, I must not turn off stack protection and other features to make that section writable (don't ask why). So it seems that the relocation fix-up has to be performed at link time somehow. As far as I know, once the linker has resolved relocations, a shared library contains relative offsets for similar external function calls, so the *.so binary should already hold the correct offsets (with no run-time relocation work needed). Fixing up the machine_code buffer during linking should therefore be possible.

I'm using a manually built Clang 7 compiler and I have full control over the LLVM passes, so I thought it might be possible to write some kind of LLVM pass that runs at link time. However, it looks like ld is invoked at the end, so maybe LLVM passes will not help here (I'm not an expert in this area).

Other ideas would be appreciated too. As you can see, the problem is pretty complicated. Do you have any directions or ideas on how to solve this? Thanks!

jozols
  • Why can't you have that `__set_errno_internal` done on the C side after your function returns? Alternatively you could pass in the function address as an argument. – Jester Jul 05 '19 at 14:27
  • OK, I will have to think about this. Currently I have a lot of those syscall wrappers written in ASM, so I thought it would be great if no changes were needed for them. But I have to test this out first and see if it works. – jozols Jul 05 '19 at 14:51
  • Why would you write it as a buffer and not use inline assembler? Instead of taking the opcodes (hex numbers), take the assembler text and convert it to an inline assembler macro. Your normal build process will link this as per normal. See: [GCC pre-process as assembler](https://stackoverflow.com/questions/15465958/using-gccs-pre-processor-as-an-assembler). – artless noise Jul 05 '19 at 15:39
  • @artlessnoise Because I reuse 3rd party code from here: https://android.googlesource.com/platform/bionic/+/refs/heads/marshmallow-release/libc/arch-arm64/syscalls/. You can see a lot of syscalls there. But maybe it's possible to integrate all the syscall code unchanged by using inline assembler as you suggested. I will try that. One of the reasons why I chose a buffer is that I want to run encryption on it (as part of software protection). – jozols Jul 05 '19 at 15:58
  • Previously I used an approach similar to inline assembly: I placed the assembly directly in *.S files and the compiler built each *.S into an object file (it supports building C/C++ plus assembly). But now I wanted to use a buffer so I can run encryption over the assembly :). So I don't think inline assembly is an option here. – jozols Jul 05 '19 at 16:08
  • @Jester Based on your idea I removed the relocation by deleting the `cmn`, `cneg` and `b.hi` instructions. What is nice is that it's a very simple modification to the ASM syscall wrapper. This way the wrapper returns 0 (or a positive value for other syscalls that return handles) on success, or a negative `errno` value on error. After that I can write a C++ wrapper around it to set `errno` manually if a negative value is returned from the assembly code. Thanks! – jozols Jul 05 '19 at 17:52
  • In case somebody has an idea based on relocation fixing I would like to hear that. For now I will use assembly source modification suggested by @Jester. – jozols Jul 05 '19 at 17:54
  • You can run encryption on any code. Just use a linker script or an attribute to put it in its own input section (.text.encrypt). You can define variables for the start/end of this section and run online/offline encrypt/decrypt. Some ideas from [storing CRC in elf](https://stackoverflow.com/questions/24150030/storing-crc-into-an-axf-elf-file) can be used for encryption. I don't see where you are going to get a decrypt key that will stop an attacker though; but it is equivalent to the array without any limitations on relocations. – artless noise Jul 05 '19 at 18:55
  • @artlessnoise Thanks for valuable information, that link was really useful. With buffer approach I can automate conversion from assembly to buffer containing machine code (by running assembler over existing assembly (https://android.googlesource.com/platform/bionic/+/refs/heads/marshmallow-release/libc/arch-arm64/syscalls/) and then `objdump -d`). With inline assembly I'm not sure how to automate it. Maybe it would be enough to run C preprocessor over those *.S files because they contain C macros and then figure out how to call that piece of code. But have to test if that works. – jozols Jul 08 '19 at 14:32
  • @artlessnoise Can you elaborate on how inline assembly can fix relocation problem? Maybe with inline assembly code for my example if possible. I'm trying to write some piece of inline assembly code but relocation still seems to be the problem. – jozols Jul 08 '19 at 16:04
  • Nvm, got it running inline assembly by using preprocessor (-E) on *.S file. In inline assembly `b.hi` instruction looks like `b.hi __set_errno_internal` and all is resolved correctly. – jozols Jul 09 '19 at 13:35
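The approach from the comments above (drop `cmn`, `cneg` and `b.hi` from the wrapper and set `errno` on the C++ side) could look roughly like this. It assumes the Linux convention that a raw syscall result in the range -4095..-1 is a negated errno value, which is the range the `cmn x0, #0x1, lsl #12` test in the original wrapper selects; `finish_syscall` is a hypothetical name:

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical helper: post-process the raw value returned by a trimmed
// syscall wrapper (one that ends right after `svc #0x0` with `ret`).
// On error it stores the error code in errno and returns -1, mirroring
// what the cneg + __set_errno_internal path did in the original code.
long finish_syscall(long raw) {
    if (raw < 0 && raw >= -4095) {
        errno = static_cast<int>(-raw);  // kernel returned -errno
        return -1;
    }
    return raw;  // success: 0, a descriptor, a byte count, ...
}
```

A caller would wrap the cast buffer along the lines of `return finish_syscall(raw_fstatat(dirfd, path, buf, flags));`, where `raw_fstatat` stands for the function pointer obtained from the machine-code buffer. With this split the buffer no longer references any external symbol, so there is nothing left to relocate.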

1 Answer

There's already a working, packaged mechanism to handle relocations. It's called dlsym(). While it doesn't directly give you a function pointer, all major C++ compilers support reinterpret_casting the result of dlsym to any ordinary function pointer. (Member functions are another issue altogether, but that's not relevant here.)

MSalters
  • It's a nice idea, although I use `-fvisibility=hidden` so it will not work by default. I could probably make an exception and make that function visible. – jozols Jul 08 '19 at 14:39
  • Another problem with this is as I mentioned: "Obviously there is a problem with relocation, it's not fixed automatically and I have to fix it manually. But during run-time I can't do it because .text section is read-only & executable.". I assume that you intended to call `dlsym()` during runtime and patch relocation. – jozols Jul 08 '19 at 15:10
  • @jozols: Obviously you call `dlsym` at runtime, but there's no need for manual relocation handling. That's already handled by the ELF loader (the OS). `dlsym` returns a pointer either to the relocated function or to a trampoline. – MSalters Jul 08 '19 at 15:23
  • Yes, but I have to call it as part of my assembly/machine code (`"\x08\x00\x00\x54" // Here we have mentioned relocation`). – jozols Jul 08 '19 at 15:25
  • @jozols: We seem to not be communicating. That's trying to do a manual relocation, which I recommend not doing. The `b.hi` conditional branch is a **relative jump** to `__set_errno_internal`. Since you neither know where `__set_errno_internal` is relocated, nor do you know where the branch instruction itself will be, there's not even a guarantee that the difference between them fits in the 19 bits allowed for this conditional branch (!) – MSalters Jul 08 '19 at 15:34
  • I totally agree with you, but avoiding assembly/machine code is out of the question in my case. Otherwise I wouldn't need to "relocate" anything and could call `__set_errno_internal` directly. – jozols Jul 08 '19 at 15:45
  • @jozols: I still fail to see the problem. You can probably get away with `bx.hi r0`, branching to the absolute address from `r0`. In the ARM calling convention, `r0` is used for the first argument, so you cast `machine_code[]` to a `void(void* target)` and pass the relocated pointer from `dlsym`. – MSalters Jul 08 '19 at 16:17
  • That's a similar idea to the one Jester already showed. It involves modifying my existing assembly code fragments, taken from 3rd party source, to avoid that relocation. Currently it works by removing the instructions that handle `errno` and setting it outside the assembly after the call, but it can also work as you and Jester explained: by passing an additional parameter to every call. – jozols Jul 09 '19 at 07:17
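As a rough illustration of the `dlsym()` route from the answer: the symbol is resolved at run time and the result is `reinterpret_cast` to an ordinary function pointer, which could then be handed to the machine code as an extra argument. This is only a sketch under assumptions. It looks up `atoi` as a stand-in, because `__set_errno_internal` is internal to bionic and may not be visible to `dlsym()`; `resolve_int_parser` and `IntParser` are hypothetical names. On older glibc versions the program may need to be linked with `-ldl`.

```cpp
#include <cassert>
#include <dlfcn.h>

// Signature of the function we expect behind the looked-up symbol.
using IntParser = int (*)(const char*);

// Hypothetical helper: resolve "atoi" through the dynamic loader instead of
// through a link-time relocation. Returns nullptr if the lookup fails.
IntParser resolve_int_parser() {
    // A null filename yields a handle for the running program; lookups on it
    // also search the shared libraries the program was loaded with.
    void* self = dlopen(nullptr, RTLD_NOW);
    if (self == nullptr) {
        return nullptr;
    }
    void* sym = dlsym(self, "atoi");
    // Converting void* to a function pointer is what the answer refers to;
    // POSIX platforms support it for dlsym() results.
    return reinterpret_cast<IntParser>(sym);
}
```

The returned pointer is an absolute address, so passing it into the buffer's code means the code needs an indirect branch through a register instead of the PC-relative `b.hi`.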