I'm wondering how a struct is returned in something like:

#include <stdint.h>

typedef struct number {
    uint64_t a, b, c, d;
} number;

number get_number(void) {
    number res = {0, 0, 0, 0};
    return res;
}

which disassembles to

0000000000001149 <get_number>:
    1149:   55                      push   rbp
    114a:   48 89 e5                mov    rbp,rsp
    114d:   48 89 7d d8             mov    QWORD PTR [rbp-0x28],rdi
    1151:   48 c7 45 e0 00 00 00    mov    QWORD PTR [rbp-0x20],0x0
    1158:   00
    1159:   48 c7 45 e8 00 00 00    mov    QWORD PTR [rbp-0x18],0x0
    1160:   00
    1161:   48 c7 45 f0 00 00 00    mov    QWORD PTR [rbp-0x10],0x0
    1168:   00
    1169:   48 c7 45 f8 00 00 00    mov    QWORD PTR [rbp-0x8],0x0
    1170:   00
    1171:   48 8b 4d d8             mov    rcx,QWORD PTR [rbp-0x28]
    1175:   48 8b 45 e0             mov    rax,QWORD PTR [rbp-0x20]
    1179:   48 8b 55 e8             mov    rdx,QWORD PTR [rbp-0x18]
    117d:   48 89 01                mov    QWORD PTR [rcx],rax
    1180:   48 89 51 08             mov    QWORD PTR [rcx+0x8],rdx
    1184:   48 8b 45 f0             mov    rax,QWORD PTR [rbp-0x10]
    1188:   48 8b 55 f8             mov    rdx,QWORD PTR [rbp-0x8]
    118c:   48 89 41 10             mov    QWORD PTR [rcx+0x10],rax
    1190:   48 89 51 18             mov    QWORD PTR [rcx+0x18],rdx
    1194:   48 8b 45 d8             mov    rax,QWORD PTR [rbp-0x28]
    1198:   5d                      pop    rbp
    1199:   c3                      ret

From the disassembly, it looks like the required space is allocated on the stack before the function is called, and the function fills in those values.

But in the second part, it looks like rdi is treated as a pointer to a `number` struct, and the values are stored there as well. What is that about?

And when calling a C function from assembly, how do I know where the result is?

Peter Cordes
iaquobe
  • You can find some information at [this question](https://stackoverflow.com/questions/68084306/function-return-value-optimization). – Steve Summit Jun 25 '21 at 16:39
  • Refer to the ABI document of your platform. For you, this seems to be the amd64 SysV ABI supplement. – fuz Jun 25 '21 at 16:44
  • @SteveSummit So it secretly passes a pointer. But why does it write stuff to the stack as well? It stores the struct both on the stack and through the pointer. I used `-O0`; is that why this redundancy was not removed? – iaquobe Jun 25 '21 at 16:45
  • @jaklh With `-O0` you tell the compiler to turn off its brain and generate as stupid code as possible. Why do you expect good code under such circumstances? – fuz Jun 25 '21 at 16:46
  • @fuz I'm not expecting good code. I'm asking if the stupid code is part of the convention or caused by `-O0` – iaquobe Jun 25 '21 at 16:50
  • That code is a lot harder to follow than necessary, because you disabled optimization. With optimization, you'd just see the stores into the return-value object, not initializing the local `res`. (That's the space *this* function is allocating on the stack.) https://godbolt.org/z/M79b6zqrP : gcc -O3 uses two SSE2 stores, gcc -O2 uses four qword stores of immediate 0. – Peter Cordes Jun 25 '21 at 17:23
  • Note that x86-64 System V is well documented, [Where is the x86-64 System V ABI documented?](https://stackoverflow.com/q/18133812). And that smaller structs (which fit in 16 bytes or less) are returned in RDX:RAX or just RAX. [C++ on x86-64: when are structs/classes passed and returned in registers?](https://stackoverflow.com/q/42411819) – Peter Cordes Jun 25 '21 at 17:26
  • @PeterCordes thanks for the links :). And yeah, the reason I used `-O0` is that sometimes functions and loops were optimized out entirely, which makes it hard to see how the same thing would be done in a case that cannot be optimized away. I didn't check on this example though. I'll check with `-O2` and `-O3`. – iaquobe Jun 25 '21 at 18:23
  • Crafting functions that will compile to optimized asm that's interesting to look at usually means using function args and returning a value, so constant propagation can't optimize things away. [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116). `-O1` and `-Og` can be useful sometimes. – Peter Cordes Jun 25 '21 at 18:25
  • @jaklh That's a question you can answer yourself by also compiling with optimisations turned on and then comparing the code. – fuz Jun 25 '21 at 19:23

1 Answer

A calling convention typically does not dictate any specific instructions or instruction sequences. It dictates only state, meaning registers and memory: where parameters and return values go, what state must be preserved across the call (some registers, and stack memory the caller has allocated), and what is scratch (the other registers, and memory below the current stack pointer). It may also dictate things like stack alignment requirements.

The calling convention speaks to state as described above, but only at very specific points in time: the exact boundary when control is transferred from caller to callee, and again when control is transferred back from callee to caller. Thus, the callee expects that the caller has set up all the parameters before the callee's first instruction runs. Likewise, the caller expects that the callee has set up all the return values (and preserved whatever it must preserve) before the first instruction that runs after the call returns.
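
As a rough illustration of that boundary contract for the struct in the question: on x86-64 System V, a 32-byte struct like `number` is returned in memory, so the caller reserves the space and passes its address as a hidden first argument in RDI, and the callee hands the same address back in RAX. The sketch below spells this out in plain C. The names `get_number_lowered` and `caller` are hypothetical, chosen for illustration only; the compiler does not emit anything under those names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct number {
    uint64_t a, b, c, d;
} number;

/* Hand-written model of what `number get_number(void)` turns into.
   `ret` plays the role of the hidden pointer that arrives in RDI. */
number *get_number_lowered(number *ret) {
    number res = {0, 0, 0, 0};  /* local on the callee's stack, as in the -O0 listing */
    *ret = res;                 /* copy into the caller-provided return slot */
    return ret;                 /* the same pointer goes back in RAX */
}

/* Caller side: reserve the return slot, pass its address, read the result. */
uint64_t caller(void) {
    number slot;                            /* space allocated by the caller */
    number *p = get_number_lowered(&slot);
    assert(p == &slot);                     /* callee returned the slot's address */
    return p->a + p->b + p->c + p->d;       /* the result lives in `slot` */
}
```

Read against the disassembly in the question, the stores to `[rbp-0x20]` through `[rbp-0x8]` correspond to `res`, the stores through `rcx` correspond to `*ret = res`, and the final `mov rax, [rbp-0x28]` corresponds to `return ret`.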

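For these purposes, the calling convention does not dictate machine code instructions or even sequences of instructions; it only establishes expectations about values and locations at the points of transfer. In the question's listing, only the stores through the pointer received in RDI, and that pointer being returned in RAX, belong to the convention; the extra copy on the function's own stack is just how `-O0` implements the local `res`.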
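
To make this concrete, here is a minimal sketch of two callee bodies that differ in how they get there but leave the same state at the point of return. It assumes the x86-64 System V rule that a struct this large is returned through a caller-supplied pointer, modelled here as an explicit parameter. The function names `ret_via_local`, `ret_direct` and `check_same_contract` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct number {
    uint64_t a, b, c, d;
} number;

/* Variant 1: build a local, then copy it out (roughly what -O0 does). */
number *ret_via_local(number *ret) {
    number res = {0, 0, 0, 0};
    *ret = res;
    return ret;
}

/* Variant 2: store straight into the return slot (closer to optimized output). */
number *ret_direct(number *ret) {
    ret->a = 0;
    ret->b = 0;
    ret->c = 0;
    ret->d = 0;
    return ret;
}

/* Both variants must leave the same observable state for the caller. */
void check_same_contract(void) {
    number x, y;
    memset(&x, 0xAA, sizeof x);   /* start from different garbage in each slot */
    memset(&y, 0x55, sizeof y);
    assert(ret_via_local(&x) == &x);          /* pointer handed back unchanged */
    assert(ret_direct(&y) == &y);
    assert(memcmp(&x, &y, sizeof x) == 0);    /* identical contents in both slots */
}
```

A caller cannot tell the two variants apart, which is why a compiler is free to pick either one.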

Erik Eidt