
Today, whilst disassembling some binaries I built earlier (clang, x86_64), I came across something seemingly useless:

_baz:                                   ## @baz
    .cfi_startproc
## BB#0:
    pushq   %rax ; What?
Ltmp7:
    .cfi_def_cfa_offset 16
    leaq    (%rsp), %rax
    movq    %rsi, (%rax)
    xorl    %edx, %edx
    movq    %rax, %rsi
    callq   _something
    movq    %rax, %rdi
    callq   _something_else
    movl    (%rax), %eax
    popq    %rcx ; What?
    retq
    .cfi_endproc

I see rax being pushed and later popped into rcx, and I don't see the point of doing that: rax is a scratch register (which doesn't seem to need saving here), and rcx (another scratch register, also used for the 4th register-passed argument) is never otherwise used here.

Mona the Monad
  • This is the shortest way to fix the required 16-alignment of RSP – harold Aug 19 '17 at 20:28
  • That sounds understandable, but why rax and rcx? Why not push and pop just rax? – Mona the Monad Aug 19 '17 at 20:29
  • It does put something in `eax` presumably to return it, so popping into it would be bad. It could have used `rcx` both times though – harold Aug 19 '17 at 20:31
  • What registers does the *C Calling Convention - caller rules* state should be preserved and restored? – David C. Rankin Aug 19 '17 at 20:32
  • @harold Oh I forgot about that. Still, why couldn't all functions just sub/add rsp by multiples of 16? In that case, this function wouldn't really need the extra 8 byte push/pop if the caller pre-aligned the stack. – Mona the Monad Aug 19 '17 at 20:32
  • Then again, there is that return address... – Mona the Monad Aug 19 '17 at 20:37
  • Besides, how does it know when or in which function to do its stack align quirk? – Mona the Monad Aug 19 '17 at 20:44
  • Every non-leaf should align rsp before calling another function. Push is an efficient way to do so if no locals are needed. – Jester Aug 19 '17 at 21:03
  • It doesn't *know*; that's why it will crash if you break the convention by calling it from a custom assembly function without aligning the stack. It just expects that `rsp` was aligned before the `call`, i.e. that it is misaligned by 8 on entry because of the return address, and that this needs fixing if further calls will be made. – Ped7g Aug 20 '17 at 02:52
  • The use of a `push` seems to be better than RSP arithmetic on Intel processors that have special hardware support for optimizing the stack use. https://stackoverflow.com/a/37774474/597607 – Bo Persson Aug 20 '17 at 13:14
  • FYI, `#` is the comment character in x86 GAS syntax (even in `-masm=intel` mode). `;` works the same as in C, separating instructions / directives even if they're on the same line. – Peter Cordes Aug 21 '17 at 18:00
  • Expanded my comments on possible performance downsides/upsides into an answer on the duplicate: https://stackoverflow.com/a/45823778/224132 – Peter Cordes Aug 22 '17 at 17:35
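
To make the alignment argument from the comments concrete, here is a small Python sketch that tracks only the value of `rsp` through `_baz`. The starting address and the helper names (`push`, `pop`, `trace_baz`) are made up for illustration. The one assumption taken from the thread is that `rsp` is 16-byte aligned in the caller immediately before its `callq`.

```python
# Model only the value of RSP (not the stack contents) through _baz.
# The caller's RSP passed in below is an arbitrary, assumed 16-byte-aligned value.

def push(rsp):
    """pushq and callq each move RSP down by 8 bytes."""
    return rsp - 8

def pop(rsp):
    """popq and retq each move RSP up by 8 bytes."""
    return rsp + 8

def trace_baz(caller_rsp):
    """Return a list of (label, rsp, rsp % 16) for the key points in _baz."""
    steps = []
    rsp = push(caller_rsp)   # the caller's callq pushes the return address
    steps.append(("entry to _baz", rsp, rsp % 16))
    rsp = push(rsp)          # pushq %rax: reserves one 8-byte slot
    steps.append(("before callq _something", rsp, rsp % 16))
    steps.append(("before callq _something_else", rsp, rsp % 16))
    rsp = pop(rsp)           # popq %rcx: releases the slot
    steps.append(("before retq", rsp, rsp % 16))
    rsp = pop(rsp)           # retq pops the return address
    steps.append(("back in caller", rsp, rsp % 16))
    return steps

if __name__ == "__main__":
    for label, rsp, mod in trace_baz(0x7FFEE0001000):
        print(f"{label:32s} rsp={rsp:#x}  rsp%16={mod}")
```

Under that assumption, `rsp % 16` goes 8, 0, 0, 8, 0: the stack is off by 8 on entry, and the single `pushq` brings it back to a multiple of 16 before both calls. In this particular listing the pushed slot is not pure padding either: `movq %rsi, (%rax)` stores into it and its address is passed on in `%rsi`, so the same 8 bytes also serve as a local.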
