
I've inherited a piece of x86 MSVC assembly that calls a C++ class function with a varying number of parameters, anywhere from 0 to 16. The parameters are guaranteed to be int, float, or char *, and the return value is always one of those three types as well.

This is for an Android NDK shared library, targeting Android API 19 or greater. I'm trying to achieve maximum compatibility in that regard.

I currently have this code for x86, which I over-documented:

void * Extension;   // Class to call on (of type Extension *)
void * Function;    // Class function to invoke on (&Extension::XX)
int ParameterCount; // from 0 through 16
int * Parameters;   // Pre-initialised to alloca() array, with parameters already set pre-ASM block
int Result = 0;     // Output here
__asm
{
    pushad                  ; Start new register set (do not interfere with already existing registers)
    mov ecx, ParameterCount ; Store ParameterCount in ecx
    cmp ecx, 0              ; If no parameters, call function immediately
    je CallNow

    mov edx, Parameters     ; Otherwise store Parameters in edx
    mov ebx, ecx            ; Copy ecx, or ParameterCount, to ebx
    shl ebx, 2              ; Multiply parameter count by 2^2 (size of 32-bit variable)
    add edx, ebx            ; add (ParameterCount * 4) to Parameters, making edx point to Parameters[param count]
    sub edx, 4              ; subtract 4 from edx, making it 0-based (ending array index)
    PushLoop:
        push [edx]          ; Push value pointed to by Parameters[edx]
        sub edx, 4          ; Decrement next loop`s Parameter index:    for (><; ><; edx -= 4)
        dec ecx             ; Decrement loop index:                     for (><; ><; ecx--)
        cmp ecx, 0          ; If ecx == 0, end loop:                    for (><; ecx == 0; ><)
        jne PushLoop        ; Optimisation: "cmp ecx, 0 / jne" can be replaced with "jcxz"
    CallNow:
    mov ecx, Extension      ; Move Extension to ecx
    call Function           ; Call the function inside Extension
    mov Result, eax         ; Function`s return is stored in eax; copy it to Result
    popad                   ; End new register set (restore registers that existed before pushad)
}

While I understand the x86, I'm now porting it to Android NDK. That means armeabi, armeabi-v7a, and trying to use Clang's __asm__ instead of Visual Studio's __asm. Frankly put, I have no idea where to start.

__asm__ volatile("pushad            \t\n\
    mov %%ecx, %[ParameterCount]    \t\n\
    cmp %%ecx, $0                   \t\n\
    je CallNow                      \t\n\
    mov %%edx, %[Parameters]        \t\n\
    mov %%ebx, %%ecx                \t\n\
    shl %%ebx, $2                   \t\n\
    add %%edx, %%ebx                \t\n\
    sub %%edx, $4                   \t\n\
    PushLoop:                       \t\n\
        push[%%edx]                 \t\n\
        sub %%edx, $4               \t\n\
        dec %%ecx                   \t\n\
        cmp %%ecx, $0               \t\n\
        jne PushLoop                \t\n\
    CallNow:                        \t\n\
        mov %%ecx, %[Extension]     \t\n\
        call %[Function]            \t\n\
        mov %[Result], %%eax        \t\n\
        popad"
    // outputs, memory?
    : [Result] "=m" (Result)
    // inputs, "r" indicates read, [x] indicates the ASM will reference it by %[x]
    : [Extension] "r" (Extension), [Parameters] "r" (Parameters), [Function] "r" (Function), [ParameterCount] "r" (ParameterCount));

I'm getting unexpected tokens and register problems all over the place. I looked up some articles, but according to this article, function calls differ per device, and they also differ per number of parameters. Both of those are a problem.

The NDK DLL may be called often and all communications ultimately pass through that ASM. So this is a make-or-break thing.

Phi
  • What is the point of this assembly code? Are you trying to call a C++ instance method from C? Also, could you add all the error messages to your question? – Michael Apr 25 '18 at 12:13
  • No, I'm trying to call a C++ function code from C++, but the signature varies, so the way it was solved before was reproducing the assembly code to push all the parameters and call it directly. Since there's an error for nearly all the lines of assembly, I think it'd be apparent where the problem lies with people who know ARM. – Phi Apr 25 '18 at 12:16
  • You know that ARM assembly is totally different from x86, right? It's not something we can explain in a few sentences. Find somebody to write it for you if you can't do it. – Jester Apr 25 '18 at 12:16
  • _"No, I'm trying to call a C++ function code from C++, but the signature varies"_ Could you show an example of some methods that are uncallable the normal way? – Michael Apr 25 '18 at 12:19
  • @Michael: They're callable, but only if you refer to them by name, like Extension::RandomThing(int i, int j). These functions are all stored as void *, so the original function name and argument list isn't present. – Phi Apr 25 '18 at 12:21
  • Also note that depending on calling convention you might need to know which arguments are floats. PS: it would probably be simpler to have a `switch` in C. – Jester Apr 25 '18 at 12:29
  • @Jester: Yes, I'm aware they're different. Since I have to maintain it, I figured it would be best to ask a community of experts for any tips or standardised way of doing it, rather than hodge-podging something together from a field I'm new to and have it crash a device down the road. I wasn't aware the answer fields were restricted to a few sentences. If that's all the time you can spare, feel free to move to an easier question. – Phi Apr 25 '18 at 12:33
  • *These parameters are guaranteed to be int, float, or `char *`*. This is not simple either; ARM32 passes the first 4 integer/pointer args in registers, but not `float`, if you build with `-mfloat-abi=hard` (notice how the 3rd arg goes in a different register when the 2nd arg is FP https://godbolt.org/g/9GrSyP). *You need to know the arg types to get this right* (unless Android uses software floating point?) Or does the callee know it's a variadic function? On ARM that changes the calling convention: https://godbolt.org/g/AYmYDM – Peter Cordes Apr 25 '18 at 12:34
  • @Jester: I was asking for things that people more familiar with ARM would know could trip others up, like number of arguments being above 4 must be handled differently, calling conventions apparently changing. Being told to find someone to write it for me rather than "here's how you can change your question to suit the site better" seems to be unhelpful. – Phi Apr 25 '18 at 12:43
  • Seeing how you just copied the x86 assembly code, it wasn't clear if you knew that even the instructions are totally different on ARM. PS: even the x86 code is broken, it can not handle float returns. – Jester Apr 25 '18 at 12:44
  • Ok, so the function definitions aren't variadic, so the calling conventions will be like my first Godbolt link, not my 2nd. Definitely harder, and you probably need to know which args are `float`. (At least you don't have to promote them to `double` like for variadic functions.) – Peter Cordes Apr 25 '18 at 12:47
  • @PeterCordes If float changes parameters, that'll make things interesting... I could add in another stack array for indicating float or not for the parameters, what do you suggest? – Phi Apr 25 '18 at 12:49
  • @Phi: I suggest not doing this at all, it sounds horrible. Are you sure you can't cast your function pointers to the right type and make a normal call from C++ once you know which args you want to pass? – Peter Cordes Apr 25 '18 at 12:52
  • @PeterCordes It's definitely horrible. The objective is deviating from an app-invoked function call (which passes a function ID), filling a list of args, and invoking the function that corresponds with the ID and passing those args. The arguments are loaded by repeatedly querying the app invoking the NDK to get the next one. If there's C++ template magic I could use to store a vector of the functions regardless of their signature, then I'd happily use that instead of ASM. – Phi Apr 25 '18 at 12:58
  • @Jester: forgive my snippishness, this problem is just one that's completely out of my field. I'm writing it for a community, open-source as well, and the last thing I want is to release something and have random devices fail. I don't know how much damage you can do with bad ASM, but I know it's a lot worse than bad C++. That's exactly why I'm asking for help, so getting someone denying help is sentencing me for days of study for twenty-odd lines of ASM and even then, it could fail in unknown circumstances. I just can't afford it. – Phi Apr 25 '18 at 13:10
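The "template magic" asked about in the comments above can be sketched without any assembly. The following is a minimal sketch, not the approach used in the original code: `Slot`, `Thunk`, `from_slot`, `to_slot` and `make_thunk` are hypothetical names, the `Extension` class is a stand-in with made-up methods, and it assumes C++14 and that string parameters are declared as `const char*`. Each member function is wrapped once in a thunk that knows the real signature, so the compiler generates a normal, ABI-correct call on every target.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical argument slot: one value of any of the three allowed types.
union Slot {
    int i;
    float f;
    const char* s;
};

// Stand-in for the real class; these methods exist only for illustration.
struct Extension {
    int Add(int a, int b) { return a + b; }
    float Scale(float x, int n) { return x * n; }
    const char* Name() { return "ext"; }
};

// Every wrapped method has the same erased signature.
using Thunk = std::function<Slot(Extension&, const Slot*)>;

// Read a slot as the type the wrapped method expects.
template <class T> T from_slot(const Slot& s);
template <> inline int from_slot<int>(const Slot& s) { return s.i; }
template <> inline float from_slot<float>(const Slot& s) { return s.f; }
template <> inline const char* from_slot<const char*>(const Slot& s) { return s.s; }

// Store a return value in a slot.
inline Slot to_slot(int v) { Slot s; s.i = v; return s; }
inline Slot to_slot(float v) { Slot s; s.f = v; return s; }
inline Slot to_slot(const char* v) { Slot s; s.s = v; return s; }

// Expand args[0], args[1], ... into a typed call to the member function.
template <class R, class... A, std::size_t... I>
Thunk make_thunk_impl(R (Extension::*fn)(A...), std::index_sequence<I...>) {
    return [fn](Extension& self, const Slot* args) {
        return to_slot((self.*fn)(from_slot<A>(args[I])...));
    };
}

template <class R, class... A>
Thunk make_thunk(R (Extension::*fn)(A...)) {
    return make_thunk_impl(fn, std::index_sequence_for<A...>{});
}

// Example: look a function up by ID and call it with two int arguments.
inline int example() {
    std::vector<Thunk> table = {make_thunk(&Extension::Add)};
    Extension ext;
    Slot args[2] = {to_slot(2), to_slot(3)};
    return table[0](ext, args).i;
}
```

The caller still has to write each argument into the slot member that matches the parameter type, so this only fits if the invoking app reports argument types along with the values.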

1 Answer


I solved a similar problem a few years ago. My task was only to call C functions, not C++ methods, but it shouldn't be a problem to adapt my code.

Here is my code. Please feel free to change and use it.

const int ARGC = 32;
const int ARGC_BOUNDS = 4;

HRESULT er = S_OK;

int argv[ARGC] = {0};       
int arg_pointer = 0;
int arg_stack_count = 0;

//////
//
// ..... fill argv with arguments
//
//////

// how many arguments will be placed on the stack (the first ARGC_BOUNDS go in r0-r3)
if (arg_pointer > ARGC_BOUNDS) {
    arg_stack_count = arg_pointer - ARGC_BOUNDS;
}

// build the stack, fill the registers and call the function
// volatile is required, otherwise the compiler may optimize out our ASM code
__asm__ volatile (
    "mov r4, %[ARGV]\n\t"   // remember pointers (SP will be changed)
    "ldr r5, %[ACT]\n\t"    // function address  => R5
    "ldr r0, %[CNT]\n\t"    // arg_stack_count  => R0
    "lsl r0, r0, #2\n\t"    // R0 * 4 (bytes)   => R0
    "mov r6, r0\n\t"        // R0               => R6 (used to restore SP after the call)
    "mov r1, r0\n"          // stack byte count => R1 (loop counter)
"loop: \n\t"
    "cmp r1, #0\n\t"
    "beq end\n\t"           // R1 == 0      => jump to end
    "sub r1, r1, #4\n\t"    // R1 -= 4 (one argument)
    "mov r3, r4\n\t"        // argv         => R3
    "add r3, r3, #16\n\t"   // skip the 4 register arguments: &argv[4] => R3
    "ldr r2, [r3, r1]\n\t"  // argv[r1]
    "push {r2}\n\t"         // argv[r1] => push to stack
    "b loop\n"              //          => repeat
"end:\n\t"
    "ldr r0, [r4]\n\t"      // 1st argument
    "ldr r1, [r4, #4]\n\t"  // 2nd argument
    "ldr r2, [r4, #8]\n\t"  // 3rd argument
    "ldr r3, [r4, #12]\n\t" // 4th argument
    "blx r5\n\t"            // call function
    "add sp, sp, r6\n\t"    // fix stack position
    "mov %[ER], r0\n\t"     // store result
: [ER] "=r"(er)
: [ARGV] "r" (argv),
  [ACT] "m"(Action),
  [CNT] "m" (arg_stack_count)
: "r0", "r1", "r2", "r3", "r4", "r5", "r6",
  "r12", "lr", "cc", "memory");  // the called function may also change r12, lr, the flags and memory

return er;
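The register/stack split computed before the asm block can be checked in plain C++. This is a minimal sketch with hypothetical names (`ArgLayout`, `layout_for`, `kRegisterWords`), not part of the code above. It assumes every argument occupies one 32-bit word, that a `this` pointer, if any, is counted as the first word, and that SP is 8-byte aligned when the asm block starts. The `padding` field reflects the AAPCS rule that SP must be a multiple of 8 at a call to a public interface, which the loop above does not account for when the number of stack arguments is odd.

```cpp
#include <cstddef>

// Hypothetical helper describing where each 32-bit argument word goes.
struct ArgLayout {
    std::size_t in_registers;  // words loaded into r0-r3
    std::size_t on_stack;      // words pushed before the call
    std::size_t padding;       // extra words needed to keep SP 8-byte aligned
};

constexpr std::size_t kRegisterWords = 4;  // r0-r3, same role as ARGC_BOUNDS

// total_words counts every slot, including `this` for a member function.
inline ArgLayout layout_for(std::size_t total_words) {
    ArgLayout l{};
    l.in_registers = total_words < kRegisterWords ? total_words : kRegisterWords;
    l.on_stack = total_words - l.in_registers;
    l.padding = l.on_stack % 2;  // an odd number of pushed words leaves SP at 4 mod 8
    return l;
}
```

Any padding word would have to be pushed before the arguments, so that the first stack argument still sits at `[sp]` when the call is made, and it would have to be included in the amount added back to SP afterwards.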

Bear in mind that, beginning in August 2019, Google will require Android apps that ship native code to provide 64-bit versions. It would be wiser to plan for that now, because otherwise you will have to rewrite this code again within a year.

zdenek
  • You don't need to `mov r4, %[ARGV]`, you could just use `%[ARGV]` everywhere you used `r4` because you already asked the compiler to put it in a register for you. The other inputs could also be registers. (Use `register int arg_stack_count asm("r0")` to make sure you get it in `r0` from a `"+r"` constraint if you want that. `+r` so you can modify it.) – Peter Cordes Apr 25 '18 at 12:39
  • Thanks zdenek, that looks good. This ASM is for a NDK shared library, so I don't think the 64-bit thing will apply. (Another dev did warn me about it beforehand.) – Phi Apr 25 '18 at 12:40
  • Semi-related: the equivalent of this for the x86-64 Windows calling convention (where you don't need to know types) [How to set function arguments in assembly during runtime in a 64bit application on Windows?](//stackoverflow.com/a/49375333). – Peter Cordes Apr 25 '18 at 12:49
  • @Phi: this doesn't handle FP args at all. According to this (https://android.googlesource.com/platform/ndk/+/353e653824b79c43b948429870d0abeedebde386/docs/HardFloatAbi.md), the hard-float ABI isn't used on any devices, so maybe you can get away with this. Check what your compiler does when you compile normal code with float args, and see if it passes them in FP registers `s0` or `r0` integer. – Peter Cordes Apr 25 '18 at 12:56
  • @PeterCordes Aside from that, any other issues that could appear here? As mentioned in the question, what if the return type is float, or char *? Could that break things? – Phi Apr 25 '18 at 13:11
  • @Phi: `char*` is always just an integer returned in `r0`. If you're using a soft-float ABI, I think `float` is also returned in `r0`. – Peter Cordes Apr 25 '18 at 13:19
  • @PeterCordes Also, I'm invoking this on an Extension class instance, do I just make the first parameter the Extension *? – Phi Apr 25 '18 at 13:38
  • @Phi: As the `this` pointer? Yeah probably; check on Godbolt to see what code a compiler generates when it can see the proper class method declaration and you use it directly. – Peter Cordes Apr 25 '18 at 13:40
  • @PeterCordes Your code works perfectly. Turns out the `this` does need adding as the first parameter, simple enough to do. Long shot, but do you have anything for ARM64 or x86 clang before I accept the answer? – Phi Jun 02 '18 at 04:52
  • @Phi: I recommend avoiding making function calls from inline asm. It's a recipe for trouble, because e.g. you [can't clobber the red-zone on x86-64 System V](https://stackoverflow.com/questions/39160450/how-do-i-tell-gcc-that-my-inline-assembly-clobbers-part-of-the-stack), which you could work around with `add $-128, %rsp` before pushing args... but also because the compiler doesn't know about all the callers of a function if it can't see the calls. So no, I don't have anything, other than the x86-64 Windows version I linked above. (IIRC there was a link to i386 stack-args from there.) – Peter Cordes Jun 02 '18 at 06:41
  • If you only need integer / pointer args, not FP or structs, then most of the calling conventions are pretty simple to write wrappers like this for. But needing to do this in the first place is probably a sign of an ugly design that should use a layer of indirection to a bundle of args, rather than unpacking whatever you want to pass into the target platform's C calling convention. – Peter Cordes Jun 02 '18 at 06:43
  • @PeterCordes It turns out the line `ldr r3, [r4, #12]` is ARMv5 only, do you know what the ARMv7 equivalent is? (Sorry for the out-of-the-blue question) – Phi May 17 '21 at 00:26
  • @Phi: I'm pretty sure that is valid ARMv7 syntax. What makes you think it isn't? it assembles just fine with `arm-none-eabi-gcc -mcpu=cortex-a57 -c foo.s`, into `e594300c ldr r3, [r4, #12]` in ARM mode. Or to `68e3` in thumb mode. – Peter Cordes May 17 '21 at 00:55
  • @PeterCordes it might be the following line, `blx r5`, and I get "error : instruction requires: armv5t" while compiling with Clang 5.0. I'm using thumb mode ARM, if that applies. – Phi May 17 '21 at 01:11
  • @Phi: That assembles just fine for any ARMv5t or later target, including ARMv7. Make sure you're using options to tell your compiler and/or assembler what CPU to target. – Peter Cordes May 17 '21 at 01:26
  • @PeterCordes Thanks, very strange behaviour for VS targeting Android to not target properly, but I'll poke around. – Phi May 17 '21 at 01:49
  • @PeterCordes I'm trying to build for ARMv7 iOS now, using Apple Clang 13.0, and getting a warning "inline asm clobber list contains r6", any ideas why or how to fix it? – Phi Dec 17 '21 at 20:37
  • IDK, does the ABI for that target use `r6` in some special / fixed way, like a pointer to thread-local data or something? I don't know details of that target, but presumably there's some reason Apple Clang for that target chooses to warn for that reg specifically. Usually best to use `"r"` constraints for inputs to let the compiler pick. (Or for scratch regs, dummy `"=r"` outputs. With Thumb, you may need to use `"l"` / `"=l"` if you need it to pick one of r0..7, not a "high" register. https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html) – Peter Cordes Dec 18 '21 at 00:49
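On the float questions raised in the comments above: if the target really passes and returns `float` in integer registers (the soft-float case the comments suggest checking for), a float has to travel through the `int` argument array as its raw bit pattern, not as a converted integer value. This is a minimal sketch with hypothetical helper names (`float_to_slot`, `slot_to_float`), assuming a 32-bit IEEE 754 `float`. It does not apply to a hard-float ABI, where floats go in FP registers.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::int32_t), "float must be 32 bits");

// Copy the bits of a float into an integer slot (no numeric conversion).
inline std::int32_t float_to_slot(float f) {
    std::int32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Reinterpret an integer slot, such as the value returned in r0, as a float.
inline float slot_to_float(std::int32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

A plain cast such as `(int)1.5f` would pass the value 1 instead, and `memcpy` avoids the aliasing problems of casting pointers between `float*` and `int*`.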