0

I'm having trouble understanding the (erroneous) output of the following assembly code I've generated through a compiler I'm writing.

This is the pseudo-code of what I'm compiling:

int sidefx ( ) {
     a = a + 1;
     printf("side effect: a is %d\n", a);
     return a;
}

void threeargs ( int one, int two, int three ) {
     printf("three arguments. one: %d, two: %d, three: %d\n", one, two, three);
}

void main ( ) {
     a = 0;
     threeargs(sidefx(), sidefx(), sidefx());
}

Here's the assembly code I've generated:

.section .rodata
.comm global_a, 8, 8
.string0:
.string "a is %d\n"
.string1:
.string "one: %d, two: %d, three: %d\n"
.globl main

sidefx:                /* Function sidefx() */
enter $(8*0),$0        /* Enter a new stack frame */
movq global_a, %r10    /* Store the value in .global_a in %r10 */
movq $1, %r11          /* Store immediate 1 into %r11 */
addq %r10,%r11         /* Add %r10 and %r11 */
movq %r11, global_a    /* Store the result in .global_a */
movq global_a, %rsi    /* Put the value of .global_a into second paramater register */
movq $.string0, %rdi   /* Move .string0 to first parameter register */
movq $0, %rax        
call printf            /* Call printf */
movq global_a, %rax    /* Return the new value of .global_a */
leave                  /* Restore old %rsp, %rbp values */
ret                    /* Pop the return address */

threeargs:             /* Function threeargs() */
enter $(8*0),$0        /* Enter a new stack frame */
movq %rdx, %rcx        /* Move 3rd parameter register value into 4th parameter register */
movq %rsi, %rdx        /* move 2nd parameter register value into 3th parameter register */ 
movq %rdi, %rsi        /* Move 1st parameter register value into 2nd parameter register */
movq $.string1, %rdi   /* Move .string1 to 1st parameter register */
movq $0, %rax
call printf            /* call printf */
leave                  /* Restore old %rsp, %rbp values */
ret                    /* Pop the return address */

main:
enter $(8*0),$0        /* Enter a new stack frame */
movq $0, global_a      /* Set .global_a to 0 */
movq $0, %rax          
call sidefx            /* Call sidefx() */
movq %rax,%rdi         /* Store value in %rdi, our first parameter register */
movq $0, %rax
call sidefx            /* Call sidefx() */
movq %rax,%rsi         /* Store value in %rsi, our second parameter register */
movq $0, %rax
call sidefx            /* Call sidefx() */
movq %rax,%rdx         /* Store value in %rdx, our third parameter register */
movq $0, %rax
call threeargs         /* Call threeargs() */

main_return:
leave
ret

Now here's what I don't understand. The output to the program when compiled (gcc file.s -o code && ./code) is the following :

dmlittle$ gcc file.s -o code && ./code 
a is 1
a is 2
a is 3
one: 1, two: 2147483641, three: 3

The problem with the assembly code is that I'm storing the values of the sidefx() call that will eventually be parameters to threeargs() into the function registers, but the 2 succeeding calls to sidefx() will overwrite the values of %rdi and %rsi in order to call printf. In order to fix this problem I need to store the return values either somewhere in the stack or maybe in callee-saved registers.

Why is the final printf returning one: 1, two: 2147483641, three: 3 ? Shouldn't the first number printed also be mangled like what happened to the second number due to the succeeding sidefx calls?

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
dmlittle
  • 964
  • 6
  • 15
  • You do realize that you have failed to initialize 'a' within the functions or even to define it as an integer for that matter? I'm pretty surprised that gcc will compile that code. – David Hoelzer Apr 05 '16 at 18:58
  • @DavidHoelzer `a` is a global variable and it's declared in the `.comm` section – dmlittle Apr 05 '16 at 18:59
  • I really cannot see that from the code fragment that you've posted. – David Hoelzer Apr 05 '16 at 18:59
  • 1
    `.comm global_a, 8, 8` (second line in the assembly code). I just print 'a' in the string that is passed to printf but it's stored as `.global_a` in the `.comm` section – dmlittle Apr 05 '16 at 19:01
  • 1
    That's fine... but please don't make us read your assembly code to make sense of your C code. – David Hoelzer Apr 05 '16 at 20:24

1 Answers1

4

You didn't specify which x86-64 ABI you're using, but from your use of %rdi / %rsi for arg passing, I'll assume you're targeting the SysV ABI (everything except windows). See the wiki for links to docs and stuff.


... clobbering of return values from first two sidefx() calls... In order to fix this problem I need to store the return values either somewhere in the stack or maybe in callee-saved registers.

That's correct. gcc prefers using call-preserved regs, because then you don't have to fiddle with the stack alignment when pushing or popping between calls.

Why is the final printf returning one: 1, two: 2147483641, three: 3? Shouldn't the first number printed also be mangled like what happened to the second number due to the succeeding sidefx calls?

It's just a coincidence that %rdi=1 when you call threeargs(). If you single-step your code, you'd probably find it happens to have that value when printf returns. It's not from saving/restoring, since the original value is destroyed by movq $.string1, %rdi before the call to printf. It just happens that 1 is a common thing to find in a register.

Best guess: 1 is the file-descriptor arg to the write(2) system call, which is the last thing printf needed to do before returning. (Because stdout is line-buffered).


Your C doesn't match your implementation. In the asm, global_a is 8 bytes, but in C you're treating it as a 4 byte integer (printing with %d, not %ld). Your C doesn't declare it at all. I was going to edit in a declaration into the question, but you should resolve the ambiguity yourself (between long global_a = 0; or int global_a = 0;). The AMD64 SysV ABI specifies that long is 8 bytes. Use int64_t whenever you're writing portable C, though. There's no harm in writing int64_t when interoperating with asm, even when you do happen to know the sizes of short, int and long in the ABI you're using.


Avoid the enter instruction, unless you only care about code size, and not speed. It's horribly slow. leave is ok, maybe slower than mov %rbp, %rsp / pop %rbp, but usually you only need pop %rbp because you either didn't modify %rsp, or you needed to restore rsp anyway with add $something, %rsp before popping some other registers that you saved after %rbp.

Zeroing 64bit registers with xor %eax,%eax (2 bytes) has many advantages beyond code-size over mov $0, %rax (7 bytes: mov $sign-extended-imm32, r64).


Compare your code with compiler output: gcc -fverbose-asm -O3 -fno-inline will actually generate code from your C; all you need is a declaration of a, and to make main return an int, and it compiles just fine as C11. Of course, it mostly uses 32bit operand size because you used int, but the data movement (which thing goes in which register) is the same.

Also, the order of evaluation of argument lists is not specified, so threeargs(sidefx(), sidefx(), sidefx()) is undefined behaviour. You have multiple expressions with side effects with no sequence points separating them. I guess this is why you called it pseudo-code, not C, but it's poor way to express what you mean.

Anyway, here's your code on the Godbolt Compiler Explorer from gcc 5.3 -O3.

threeargs uses a jmp to tail-call printf, instead of call/ret.

The significant differences in main are all about correctly saving the return values from sidefx. Note that a=0 in main is not needed, because it's already initialized to zero by being in the BSS, but with -fwhole-program, gcc can't optimize it away. (A constructor could modify a before main runs, or maybe after linking a different definition of a could be used, that has a different initializer.)

The implementation of sidefx is noticeably tighter than yours:

sidefx:
    subq    $8, %rsp        #     aligns the stack for another function call
    movl    a(%rip), %eax   # a, tmp94      # load `a`
    movl    $.LC0, %edi     #,              # the format string
    leal    1(%rax), %esi   #, D.2311       # esi = a+1
    xorl    %eax, %eax      #               # only needed because printf is a varargs function.  Your `main` is doing this unnecessarily.
    movl    %esi, a(%rip)   # D.2311, a     # store back to the global
    call    printf  #
    movl    a(%rip), %eax   # a,            # reload a
    addq    $8, %rsp        #,
    ret

IDK why gcc didn't load into %esi in the first place, and inc %esi instead of using lea to add one and store in a different dest. Your version moves an immediate 1 into a register, which is silly. Use immediate operands, and lea. The CPU designers already paid the x86 tax (extra design complexity to support the CISC instruction set), make sure you get your money's worth by taking full advantage of lea and immediate operands.

Note that it doesn't store/reload a before the call to printf. Your version doesn't need to do that.

Also note that none of the functions waste instructions making stack frames.

Community
  • 1
  • 1
Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
  • 1
    Thanks. With respect to the code, this isn't C syntax but rather a made-up language spec that we were given with only 64-bit integers. I did forget to include the initialization of the variable. – dmlittle Apr 05 '16 at 21:11
  • @dmlittle: Then it's still C *syntax*, but not actually C. That explains a lot. Add a `typedef` or `#define int long` to the code in the godbolt link and you'll get 64bit. (And fix the format strings.) Note that your ASM *does* need to use actual C-compatible format strings, because it's calling the actual `printf(3)`, not a made-up `printf` where `%d` takes a 64bit integer. This will matter if your code is used with integers that don't fit in 32 bits. – Peter Cordes Apr 05 '16 at 21:25