Assembly language is giving me headaches.

Is there an easy way to identify arguments and variables in assembly language? In the example below, how can I figure out the value(s) that prairieDog() receives as arguments from main()? Also, how can I identify the registers in which they are passed?

I am also having a hard time understanding how to tell how many local variables prairieDog() has, and where those local variables are on the stack.

(gdb) disass prairieDog
Dump of assembler code for function prairieDog:
   0x0000000000400506 <+0>:     push   %rbp
   0x0000000000400507 <+1>:     mov    %rsp,%rbp
   0x000000000040050a <+4>:     sub    $0x18,%rsp
   0x000000000040050e <+8>:     mov    %edi,-0x14(%rbp)
   0x0000000000400511 <+11>:    mov    %esi,-0x18(%rbp)
   0x0000000000400514 <+14>:    movl   $0x0,-0x4(%rbp)
   0x000000000040051b <+21>:    mov    -0x14(%rbp),%eax
   0x000000000040051e <+24>:    mov    %eax,-0x8(%rbp)
   0x0000000000400521 <+27>:    jmp    0x400534 <prairieDog+46>
   0x0000000000400523 <+29>:    mov    -0x8(%rbp),%eax
   0x0000000000400526 <+32>:    mov    %eax,%edi
   0x0000000000400528 <+34>:    callq  0x4004ed <meerkat>
   0x000000000040052d <+39>:    add    %eax,-0x4(%rbp)
   0x0000000000400530 <+42>:    addl   $0x2,-0x8(%rbp)
   0x0000000000400534 <+46>:    mov    -0x8(%rbp),%eax
   0x0000000000400537 <+49>:    cmp    -0x18(%rbp),%eax
   0x000000000040053a <+52>:    jle    0x400523 <prairieDog+29>
   0x000000000040053c <+54>:    mov    -0x4(%rbp),%eax
   0x000000000040053f <+57>:    leaveq
   0x0000000000400540 <+58>:    retq
End of assembler dump.
  • This seems to be a homework question, which requires an attempt. Please read [How do I ask and answer homework questions?](https://meta.stackoverflow.com/q/334822). – Thomas Jager Jul 11 '20 at 16:49
  • From just the assembly of a C function, you can't tell how many arguments it has, just how many it uses. For example, `int f(void) { return 42; }` and `int f(int x, int y, int z) { return 42; }` will both become the exact same assembly. And local variables are completely lost. `int f(void) { int x = 42; return x; }` will also become the exact same assembly as the other two functions. – Joseph Sible-Reinstate Monica Jul 11 '20 at 17:46
  • Your assembly looks like it's almost the SysV ABI, but not quite, since it breaks the rule about stack alignment at function calls. – Joseph Sible-Reinstate Monica Jul 11 '20 at 18:07
  • Hint: This is almost exactly what you get from GCC with no optimizations for a simple C function (the difference being the stack offsets). – Joseph Sible-Reinstate Monica Jul 11 '20 at 18:24

2 Answers

In assembly you have so-called calling conventions: sets of rules for how parameters are passed. For example, on Linux (x86_64), the first 6 integer arguments are passed in %rdi, %rsi, %rdx, %rcx, %r8, %r9, and the first 8 floating-point arguments in %xmm0-%xmm7. Every other argument is passed on the stack (Reference).
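As a rough illustration of the register order described above, here is a minimal C sketch. The table and the helper name `int_arg_register` are invented for this example and are not part of any real API. Note that a 32-bit `int` argument shows up in the low half of the register, so %rdi appears as %edi and %rsi as %esi in the disassembly.

```c
#include <stddef.h>

/* Integer/pointer argument registers, in the order used on Linux x86_64. */
static const char *const INT_ARG_REGS[] = {
    "%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"
};

/* Hypothetical helper: name of the register holding the integer argument
 * at the given 0-based position, or NULL if that argument is passed on
 * the stack instead. */
const char *int_arg_register(size_t index)
{
    size_t count = sizeof INT_ARG_REGS / sizeof INT_ARG_REGS[0];
    return index < count ? INT_ARG_REGS[index] : NULL;
}
```

With this table, position 0 maps to %rdi and position 1 to %rsi, which matches the two registers that prairieDog reads at +8 and +11.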

"Is there an easy way to identify arguments and variables with assembly language?" You can recognize the arguments to a function when they are read from the matching registers. Variables can be quite difficult: some may be on the stack, some only in registers.

I think that with these two tools you should be able to solve your problem yourself.

JCWasmx86

What you do is analyze the code and look for variables.  Variables occupy storage.

There are two kinds of storage: registers and memory locations.

(It's sometimes a bit tricky because in assembly language / machine code, storage can be repurposed.)

As you're identifying the variables, you split them into two categories:

  • storage that is used/consumed/sourced without or before being defined, and,
  • storage that is defined/set/targeted before being used.

The ones that are being used before defined are the parameter variables (or possibly uninitialized variables, which is a logic error so is seldom seen).

The ones that are being defined before used are local variables.
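The two categories can be sketched as a tiny classifier in C. This is only an illustration under simplifying assumptions: the `struct access` trace format and the `classify` function are made up for this example, the trace has to be written out by hand from the disassembly, and registers the function merely saves and restores (such as %rbp in the prologue) would need to be excluded separately.

```c
#include <string.h>

enum kind { UNSEEN, PARAMETER, LOCAL };

/* One access to a piece of storage, in program order (hypothetical format). */
struct access {
    const char *storage; /* e.g. "%edi" or "-0x14(%rbp)" */
    int is_write;        /* 0 = sourced (used), 1 = targeted (defined) */
};

/* The first access decides: read first means parameter,
 * written first means local. */
enum kind classify(const struct access *trace, size_t n, const char *storage)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(trace[i].storage, storage) == 0)
            return trace[i].is_write ? LOCAL : PARAMETER;
    }
    return UNSEEN;
}
```

Feeding it the accesses from +8 to +14 of the listing would mark %edi and %esi as parameters and the stack slots as locals.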

As your homework is asking about parameters you can ignore that category.

At +8, you see a mov %edi,-0x14(%rbp).  This instruction references two items of storage, and, as this is AT&T syntax, the first operand is sourced (used) and the second targeted (defined).

The first operand, %edi, is a register that is being sourced.  As this is the first mention of the rdi/edi register, this register is storage that is being used without being defined, which makes it a parameter variable that must have been defined (passed) by the caller.

The next storage is -0x14(%rbp), a memory location.  It is targeted (set/defined) here — and as it is being defined before used, that makes it a local variable.  We can also note that the memory location is below (with a negative offset) the frame pointer, %rbp; -0x14(%rbp) is newly allocated, and thus uninitialized (until this instruction), which is also the expected usage pattern for local variables.
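Carrying the same reasoning through the rest of the listing, one plausible C source that would compile to this shape is sketched below. It is a guess, not the original code: the names `first`, `last`, `i` and `sum` are invented, and the body of `meerkat` is a placeholder because its disassembly is not shown. In unoptimized compiler output like this, the slots at -0x14(%rbp) and -0x18(%rbp) are usually just stack copies of the two incoming parameters, while -0x4(%rbp) and -0x8(%rbp) hold the variables the function declares itself.

```c
/* Placeholder only: the real meerkat() is not shown in the question. */
int meerkat(int n)
{
    return n;
}

int prairieDog(int first, int last)  /* %edi -> -0x14(%rbp), %esi -> -0x18(%rbp) */
{
    int sum = 0;                     /* -0x4(%rbp), zeroed at +14 */
    int i;                           /* -0x8(%rbp), set from first at +21..+24 */

    for (i = first; i <= last; i += 2)   /* compare at +49, step of 2 at +42 */
        sum += meerkat(i);               /* call at +34, accumulate at +39 */

    return sum;                      /* loaded into %eax at +54 */
}
```

The actual values that main() passes cannot be read from this listing; you would have to disassemble main() and look at what it moves into %edi and %esi just before the call to prairieDog.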

Erik Eidt