I'm trying to implement some "OSEK-Services" on an ARM7TDMI-S using GCC for ARM. Unfortunately, turning up the optimization level results in "wrong" code generation. The main thing I don't understand is that the compiler seems to ignore the procedure call standard, e.g. passing parameters to a function by moving them into registers r0-r3. I understand that function calls can be inlined, but the parameters still need to be in the registers to perform the system call.

Consider the following code to demonstrate my problem:

unsigned SysCall(unsigned param)
{
    volatile unsigned ret_val;
    __asm __volatile
    (
        "swi 0          \n\t"    /* perform SystemCall */
        "mov %[v], r0   \n\t"    /* move the result into ret_val */
        : [v]"=r"(ret_val) 
        :: "r0" 
    );

    return ret_val;              /* return the result */
}

int main()
{
    unsigned retCode;
    retCode = SysCall(5); // expect retCode to be 6 when returning back to usermode
}

I wrote the Top-Level software interrupt handler in assembly as follows:

.type   SWIHandler, %function
.global SWIHandler
SWIHandler:

    stmfd   sp! , {r0-r2, lr}        @save regs

    ldr     r0  , [lr, #-4]          @load sysCall instruction and extract sysCall number
    bic     r0  , #0xff000000

    ldr     r3  , =DispatchTable     @load dispatchTable 
    ldr     r3  , [r3, r0, LSL #2]   @load sysCall address into r3 

    ldmia   sp, {r0-r2}              @load parameters into r0-r2
    mov     lr, pc
    bx      r3 

    stmia   sp ,{r0-r2}              @store the result back on the stack
    ldr     lr, [sp, #12]            @restore return address
    ldmfd   sp! , {r0-r2, lr}        @load result into register
    movs    pc  , lr                 @back to next instruction after swi 0

The dispatch table looks like this:

DispatchTable:
    .word activateTaskService
    .word getTaskStateService

The SystemCall function looks like this:

unsigned activateTaskService(unsigned tID)
{
    return tID + 1; /* only for demonstration */
}
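
For reference, the decode-and-dispatch step of the handler can be modelled in plain C. This is only a sketch of what the assembly appears to do, not a replacement for it: `dispatch_swi`, `SWI_ERROR`, the bounds check, and the body of `getTaskStateService` are assumptions added for illustration and are not part of the original code.

```c
#include <stdint.h>

/* Service from the question. */
static unsigned activateTaskService(unsigned tID)
{
    return tID + 1; /* only for demonstration */
}

/* Hypothetical body: the question does not show this service. */
static unsigned getTaskStateService(unsigned tID)
{
    (void)tID;
    return 0;
}

typedef unsigned (*service_fn)(unsigned);

/* Same order as DispatchTable in the assembly source. */
static const service_fn dispatch_table[] = {
    activateTaskService,
    getTaskStateService,
};

#define SERVICE_COUNT (sizeof dispatch_table / sizeof dispatch_table[0])
#define SWI_ERROR 0xFFFFFFFFu /* hypothetical error code, not in the original */

/* Model of the handler: keep the low 24 bits of the SWI instruction word
   (the same effect as "bic r0, #0xff000000") and call the matching
   service with the caller's r0. The range check is an addition; the
   assembly handler indexes the table unchecked. */
unsigned dispatch_swi(uint32_t swi_instruction, unsigned r0)
{
    uint32_t number = swi_instruction & 0x00FFFFFFu;
    if (number >= SERVICE_COUNT)
        return SWI_ERROR;
    return dispatch_table[number](r0);
}
```

With the encoding `0xEF000000` for `swi 0` (as seen in the disassembly), `dispatch_swi(0xEF000000u, 5)` selects `activateTaskService` and yields 6, which is the value `main` expects.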

Without optimization everything works fine and the parameters are in the registers as expected. Here is the code generated with -O0:

00000424 <main>:
 424:   e92d4800    push    {fp, lr}
 428:   e28db004    add fp, sp, #4
 42c:   e24dd008    sub sp, sp, #8
 430:   e3a00005    mov r0, #5          @move param into r0
 434:   ebffffe1    bl  3c0 <SysCall>

000003c0 <SysCall>:
 3c0:   e52db004    push    {fp}        ; (str fp, [sp, #-4]!)
 3c4:   e28db000    add fp, sp, #0
 3c8:   e24dd014    sub sp, sp, #20
 3cc:   e50b0010    str r0, [fp, #-16]
 3d0:   ef000000    svc 0x00000000
 3d4:   e1a02000    mov r2, r0
 3d8:   e50b2008    str r2, [fp, #-8]
 3dc:   e51b3008    ldr r3, [fp, #-8]
 3e0:   e1a00003    mov r0, r3
 3e4:   e24bd000    sub sp, fp, #0
 3e8:   e49db004    pop {fp}        ; (ldr fp, [sp], #4)
 3ec:   e12fff1e    bx  lr

Compiling the same code with -O3 results in the following assembly code:

00000778 <main>:
 778:   e24dd008    sub sp, sp, #8
 77c:   ef000000    svc 0x00000000         @Inline SystemCall without passing params into r0
 780:   e1a02000    mov r2, r0
 784:   e3a00000    mov r0, #0
 788:   e58d2004    str r2, [sp, #4]
 78c:   e59d3004    ldr r3, [sp, #4]
 790:   e28dd008    add sp, sp, #8
 794:   e12fff1e    bx  lr

Notice how the system call gets inlined without the value 5 being moved into r0.

My first approach is to move those values manually into the registers by adapting the function SysCall from above as follows:

unsigned SysCall(volatile unsigned p1)
{
    volatile unsigned ret_val;
    __asm __volatile
    (
        "mov r0, %[p1]      \n\t"
        "swi 0              \n\t"
        "mov %[v], r0       \n\t" 
        : [v]"=r"(ret_val) 
        : [p1]"r"(p1)
        : "r0"
    );
    return ret_val;
}

It seems to work in this minimal example, but I'm not sure whether this is the best practice. Why does the compiler think it can omit the parameters when inlining the function? Does anybody have suggestions on whether this approach is okay or what should be done differently?

Thank you in advance

  • You cannot assume that any registers other than those you specify in input and output constraints hold any particular value when an `asm` statement is executed. What you do is clearly misuse of inline assembly. – fuz Mar 29 '20 at 13:29
  • Please [read the manual](https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html) for details. – fuz Mar 29 '20 at 13:31
  • So to fix your problem: provide explicit input and output constraints indicating which registers must hold the values of which variables when the `asm` statement is executed. I think the easiest way to do this is to use [register variables](https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html) in conjunction with `r` constraints. – fuz Mar 29 '20 at 13:33
  • @EricPostpischil The manual on register variables I linked specifically mentions that the only supported use is to specify operands to `asm` statements with a granularity finer than what the machine's constraints allow. As far as I know, there are no ARM constraints to specify a specific register, so register variables have to be used. – fuz Mar 29 '20 at 13:39
  • @EricPostpischil “The only supported use for this feature is to specify registers for input and output operands when calling Extended asm (see Extended Asm). This may be necessary if the constraints for a particular machine don’t provide sufficient control to select the desired register. To force an operand into a register, create a local variable and specify the register name after the variable’s declaration. Then use the local variable for the asm operand and specify any constraint letter that matches the register: ...” – fuz Mar 29 '20 at 13:40
  • For something this trivial, why not just use real assembly? Then the compiler won't get in your way but will help you instead. – old_timer Mar 29 '20 at 16:34
  • That's actually a good question. For some reason I couldn't think of a way to do that, but now I've found one. – UltraGeralt Mar 29 '20 at 17:00

1 Answer

A function call in C source code does not instruct the compiler to call the function according to the ABI. It instructs the compiler to call the function according to the model in the C standard, which means the compiler must pass the arguments to the function in a way of its choosing and execute the function in a way that has the same observable effects as defined in the C standard.

Those observable effects do not include setting any processor registers. When a C compiler inlines a function, it is not required to set any particular processor registers. If it calls a function using an ABI for external calls, then it would have to set registers. Inline calls do not need to obey the ABI.

So merely putting your system request inside a function built of C source code does not guarantee that any registers will be set.

For ARM, what you should do is define register variables assigned to the required register(s) and use those as input and output to the assembly instructions:

unsigned SysCall(unsigned param)
{
    register unsigned Parameter __asm__("r0") = param;
    register unsigned Result    __asm__("r0");
    __asm__ volatile
    (
        "swi 0"
        : "=r" (Result)
        : "r"  (Parameter)
        : // "memory"    // if any inputs are pointers
    );
    return Result;
}

(This is a major kludge by GCC; it is ugly, and the documentation is poor. But see also https://stackoverflow.com/tags/inline-assembly/info for some links. GCC for some ISAs has convenient specific-register constraints you can use instead of r, but not for ARM.) The register variables do not need to be volatile; the compiler knows they will be used as input and output for the assembly instructions.
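
The handler in the question reloads r0-r2 before dispatching, so the same register-variable technique can be extended to several parameters. The following is a sketch under stated assumptions rather than a definitive wrapper: the name `SysCall3` is hypothetical, and the operand and clobber lists are derived from the handler as posted, which uses r3 as scratch, calls a C service that is allowed to overwrite r12 (`ip`), and writes r0-r2 back after the call, so r1 and r2 cannot be treated as preserved. The `#else` branch is only a host stand-in so the sketch compiles off-target; it mimics `activateTaskService`.

```c
#if defined(__arm__)

/* Three-parameter system call through "swi 0" (hypothetical wrapper). */
unsigned SysCall3(unsigned a, unsigned b, unsigned c)
{
    register unsigned arg0 __asm__("r0") = a;   /* first parameter, also the result */
    register unsigned arg1 __asm__("r1") = b;
    register unsigned arg2 __asm__("r2") = c;
    __asm__ volatile
    (
        "swi 0"
        : "+r" (arg0), "+r" (arg1), "+r" (arg2) /* handler may change all three */
        :
        : "r3", "ip", "memory"                  /* scratch registers, plus memory */
    );
    return arg0;
}

#else

/* Host stand-in (assumption): no SWI available, so mimic activateTaskService. */
unsigned SysCall3(unsigned a, unsigned b, unsigned c)
{
    (void)b;
    (void)c;
    return a + 1;
}

#endif
```

Input registers cannot also appear in the clobber list, which is why r1 and r2 are written as `"+r"` operands instead. A handler that saved and restored r1-r3 and r12 itself would let the wrapper drop those entries.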

The asm statement itself should be volatile if it has side effects other than producing a return value. (e.g. getpid() doesn't need to be volatile.)

A non-volatile asm statement with outputs can be optimized away if the output is unused, or hoisted out of loops if it's used with the same input (like a pure function call). This is almost never what you want for a system call.

You also need a "memory" clobber if any of the inputs are pointers to memory that the kernel will read or modify. See "How can I indicate that the memory *pointed* to by an inline ASM argument may be used?" for more details (and a way to use a dummy memory input or output to avoid a "memory" clobber).

A "memory" clobber on mmap/munmap or other system calls that affect what memory means would also be wise; you don't want the compiler to decide to do a store after munmap instead of before.
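
As a sketch of the pointer case, the wrapper below passes a buffer address and length to a hypothetical service. The names `SysCallWrite` and `send_greeting`, the SWI number 2, and the service's behaviour are all assumptions for illustration; none of them appear in the question. The "memory" clobber tells GCC that the asm may read or write memory it cannot see, so the stores to `msg` have to be completed before the `swi`. The `#else` branch is a host stand-in that sums the bytes so the sketch can be compiled off-target.

```c
#include <stdint.h>

#if defined(__arm__)

/* Pass a buffer to a hypothetical service number 2. */
unsigned SysCallWrite(const unsigned char *buf, unsigned len)
{
    register unsigned arg0 __asm__("r0") = (unsigned)(uintptr_t)buf;
    register unsigned arg1 __asm__("r1") = len;
    __asm__ volatile
    (
        "swi 2"
        : "+r" (arg0), "+r" (arg1)
        :
        : "r2", "r3", "ip", "memory"  /* "memory": the kernel reads *buf */
    );
    return arg0;
}

#else

/* Host stand-in (assumption): pretend the service returns a byte sum. */
unsigned SysCallWrite(const unsigned char *buf, unsigned len)
{
    unsigned sum = 0;
    for (unsigned i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

#endif

/* Caller: the stores to msg must not be dropped or moved after the call. */
unsigned send_greeting(void)
{
    unsigned char msg[3];
    msg[0] = 'a';
    msg[1] = 'b';
    msg[2] = 'c';
    return SysCallWrite(msg, sizeof msg);
}
```

Without the clobber, the compiler only sees an integer being placed in r0 and would be free to treat the writes to `msg` as dead stores once the wrapper is inlined.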

Eric Postpischil
  • Thank you very much, it seems to work! Never thought about using the register keyword... – UltraGeralt Mar 29 '20 at 14:43
  • @UltraGeralt: you almost certainly do want `asm volatile`; see my edit to this answer. Unless the system call was `getpid` or something that can be optimized away if the return value is unused, unlike `write()`. – Peter Cordes Mar 29 '20 at 21:12
  • @PeterCordes: `getpid` requires either `asm volatile` or a `"memory"` clobber, or it may get reordered past `fork()`. https://godbolt.org/z/a1rq7GP3e – Timothy Baldwin Jan 12 '23 at 22:40
  • @TimothyBaldwin: Oh right, great point that even `getpid()` doesn't always return the same value, so it's not a pure function within the context of one C function. `volatile` is the right tool to tell the compiler about that fact. As for `fork` itself, probably `volatile` and `"memory"` clobber just to err on the side of caution. – Peter Cordes Jan 13 '23 at 01:20