I have the following C code:

#include <stdio.h>

void function(int a, int b, int c) {
  int buff_1[5];
  int buff_2[10];

  buff_1[0] = 6;
  buff_2[0] = 'A';
  buff_2[1] = 'B';
}

int main(void) {
  int i = 1;
  function(1,2,3);
  return 0;
}

Now I want to analyze the associated assembly code. According to the book I'm reading, the assembly instructions before the function call are:

pushl $3
pushl $2
pushl $1
call function

The underlying object file was created using `gcc-5.3 -O0 -c functions.c`. However, when I disassemble it with objdump, I get the following instructions:

movl $3, %edx
movl $2, %esi
movl $1, %edi

As far as I understand assembly (I'm pretty new to it), the first version makes more sense to me.

Is the book simply wrong? Or is the book's output just outdated because it used gcc 2.9?

hGen
  • Does gcc have a specific intermediate representation or byte code? Your 2nd example looks like target-specific assembly, i.e. not every processor will have those registers; those are x86-specific (I might be wrong, I don't usually deal with low-level stuff). Also, it might be an optimization that is not turned off by "-O0"; constructor elision is one I know of that is not. Possibly related: http://stackoverflow.com/questions/4534791/why-does-it-use-the-movl-instead-of-push – luk32 May 11 '16 at 12:30
  • It'll just be a different calling convention: passing the first N arguments in registers rather than on the stack. However, using those registers is not a convention I recognise, nor is it on [the Wikipedia page](https://en.wikipedia.org/wiki/X86_calling_conventions). Is this a 64-bit PC? Try compiling with `-m32` maybe to get the 32-bit conventions, which I think are stack by default. – Rup May 11 '16 at 12:32
  • It is just using the x86_64 calling convention, using the lower half of registers RDI, RSI and RDX (which are the first 3 registers in the x86_64 calling convention). – David C. Rankin May 11 '16 at 12:40
  • "Is this a repost of: why does it use the movl instead of push?" No, it's not. That question deals with GCC 3.x's way of pushing arguments onto the stack by reserving some space and then writing into that space, rather than using push instructions. David is correct: this is because you're compiling 64-bit code, and this is the standard 64-bit Linux calling convention. Try `-m32` to get 32-bit code that'll look like your book, or at least like the code in that linked question, which is equivalent to what's in your book (this part is gcc 2.95 vs more modern GCCs). – Rup May 11 '16 at 12:51
  • There is no reason to expect any two compilers, nor the same compiler with different settings, to produce the same output. Certainly not two versions a year apart, and definitely not two versions 10+ years apart. Nor is there any reason to expect a book to match a compiler, even if they call out the compiler and version. – old_timer May 11 '16 at 12:56
  • You can certainly run an old Linux in a virtual machine and directly or cross compile 2.9 and see how it compares. – old_timer May 11 '16 at 12:58
  • Rather than compiling (and learning) an old version, what about taking a more up-to-date book and learning the way it's done now? – Tommylee2k May 11 '16 at 13:36
  • The [**GCC 2.95.X Releases**](https://gcc.gnu.org/releases.html) occurred from July 1999 to March 2001 and are quite dated. x86_64 was in its infancy then. See [**x86_64**](https://en.wikipedia.org/wiki/X86-64), so it appears code generation was described using the x86_64 calling convention but either using the lower half of the registers or simply mistakenly using the x86 register names. Nothing wrong with learning history, but if you intend to work with current code, you need a whole lot newer reference. – David C. Rankin May 11 '16 at 14:15
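
Several of the comments above turn on whether the compiler is producing 32-bit or 64-bit x86 code. A minimal sketch of how a C file could report that for its own build is shown here. It assumes a GCC-compatible compiler, which predefines `__x86_64__` and `__i386__`; the function names `target_name` and `print_target` are made up for this illustration.

```c
#include <stdio.h>

/* Report which x86 flavour this translation unit was compiled for.
   __x86_64__ and __i386__ are predefined by GCC-compatible compilers;
   target_name is a hypothetical helper, not part of any library. */
const char *target_name(void) {
#if defined(__x86_64__)
    return "x86-64";   /* 64-bit build: arguments go in registers first */
#elif defined(__i386__)
    return "i386";     /* 32-bit build: arguments go on the stack by default */
#else
    return "other";    /* not an x86 target at all */
#endif
}

/* Print the detected target together with the pointer width. */
void print_target(void) {
    printf("target: %s, pointer size: %zu bytes\n",
           target_name(), sizeof(void *));
}
```

Calling `print_target()` from `main` in a default build and again in a build made with `-m32` (as Rup suggests) should show the difference, provided the 32-bit libraries are installed.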

1 Answer

The book is out of date with respect to 64-bit x86. The x86-64 calling conventions per Wikipedia are:

System V AMD64 ABI

The calling convention of the System V AMD64 ABI is followed on Solaris, Linux, FreeBSD, OS X, and other UNIX-like or POSIX-compliant operating systems. The first six integer or pointer arguments are passed in registers RDI, RSI, RDX, RCX (R10 in the Linux kernel interface), R8, and R9, while XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6 and XMM7 are used for certain floating point arguments. As in the Microsoft x64 calling convention, additional arguments are passed on the stack and the return value is stored in RAX.

Since you're passing 32-bit `int` values, gcc writes them using the 32-bit register names %edi, %esi, and %edx, which are the lower halves of RDI, RSI, and RDX.
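
As a rough illustration of the rule quoted above, the following sketch maps the position of an integer or pointer argument to the place where the System V AMD64 ABI passes it. The helper name `sysv_int_arg_location` is hypothetical, and the sketch deliberately ignores floating-point arguments, structs, and the Linux kernel's R10 variant.

```c
#include <stddef.h>

/* Integer/pointer argument registers of the System V AMD64 ABI,
   in the order given in the quoted text. */
static const char *const sysv_int_regs[] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};

/* Hypothetical helper: where is the index-th (0-based) integer or
   pointer argument passed? The first six go in registers; anything
   after that is passed on the stack. */
const char *sysv_int_arg_location(size_t index) {
    size_t count = sizeof sysv_int_regs / sizeof sysv_int_regs[0];
    return index < count ? sysv_int_regs[index] : "stack";
}
```

For `function(1,2,3)`, indices 0, 1, and 2 map to rdi, rsi, and rdx, which matches the objdump output in the question once the 32-bit names edi, esi, and edx are taken into account.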

John Bode
  • Minor addition: It is actually using the **whole** registers (`%rdi`, `%rsi`, ...). Moving into a 32-bit GP register zero-extends to the whole 64-bit register. Those instructions are the efficient way to do `mov $3, %rdx` and so on. – Margaret Bloom May 11 '16 at 14:47
  • @MargaretBloom: Gotcha. I don't normally work at this level, so I'm not aware of all the nuances. Feel free to edit the answer if you feel it's necessary. – John Bode May 11 '16 at 14:52
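
The zero-extension behaviour Margaret Bloom describes can be modelled in portable C. This is only a sketch of the register semantics, not something the compiler emits, and the function names are invented for the illustration. The 16-bit case is added for contrast: on x86-64, writes to 16-bit registers leave the upper bits untouched.

```c
#include <stdint.h>

/* Model of writing to the 32-bit name of a 64-bit register,
   e.g. movl $3, %edx: the upper 32 bits of %rdx become zero. */
uint64_t write_low32(uint64_t reg, uint32_t value) {
    (void)reg;                 /* the old contents are discarded */
    return (uint64_t)value;    /* zero-extended to 64 bits */
}

/* Model of writing to the 16-bit name, e.g. movw $3, %dx:
   only the low 16 bits change, the upper 48 bits are preserved. */
uint64_t write_low16(uint64_t reg, uint16_t value) {
    return (reg & ~(uint64_t)0xFFFF) | value;
}
```

Starting from a register full of one bits, `write_low32(reg, 3)` yields 3, while `write_low16(reg, 3)` yields 0xFFFFFFFFFFFF0003. That is why `movl $3, %edx` is enough to set all of `%rdx` to 3.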