48

I have some GNU assembler code for the x86_64 architecture, generated by a tool, that contains these instructions:

movq %rsp, %rbp  
leaq str(%rip), %rdi
callq puts
movl $0, %eax

I cannot find any actual documentation on the "callq" instruction.

I have looked at http://support.amd.com/TechDocs/24594.pdf which is "AMD64 Architecture Programmer’s Manual Volume 3: General-Purpose and System Instructions" but they only describe CALL near and far instructions.

I have looked at the documentation for the GNU assembler, https://sourceware.org/binutils/docs/as/index.html, but could not find the section detailing the instructions it supports.

I understand that it's a call to a function, but I would like to know the details. Where can I find them?

Peter Cordes
user10607
  • 3
    @prl: I didn't find any existing questions about `callq` specifically. And there is actually non-obvious stuff to say about default operand-sizes for branches and other stack instructions. This is a newbie question, but actually one that's worth answering for a change, if I stop being grumpy with all the bad "my code doesn't work and I don't know anything" questions. – Peter Cordes Oct 15 '17 at 09:26
  • Some of the answers were TL;DR so maybe someone already mentioned it, but `callq` holds the offset of the next instruction as its first operand. I don't know how it makes use of it though. – vmemmap Sep 09 '22 at 07:36

2 Answers

64

It's just call. Use Intel-syntax disassembly if you want to be able to look up instructions in the Intel/AMD manuals. (objdump -drwC -Mintel, GDB set disassembly-flavor intel, GCC -masm=intel)

The q operand-size suffix does technically apply (it pushes a 64-bit return address and treats RIP as a 64-bit register), but there's no way to override it with instruction prefixes. i.e. calll and callw aren't encodeable in 64-bit mode according to Intel's manual, so it's just annoying that some AT&T syntax tools show it as callq instead of call. This of course applies to retq as well.
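To make "pushes a 64-bit return address" concrete, here is a minimal Python sketch of what a near call rel32 (opcode E8) does in 64-bit mode, as I understand the manuals. The function name call_rel32 and the addresses are made up for illustration. This is a model of the arithmetic, not a real decoder.

```python
import struct

MASK64 = (1 << 64) - 1

def call_rel32(code: bytes, addr: int, rsp: int):
    """Model a near `call rel32` (opcode E8) in 64-bit mode.

    Returns (target_rip, new_rsp, pushed_return_address).
    Hypothetical helper, for illustration only.
    """
    if len(code) < 5 or code[0] != 0xE8:
        raise ValueError("expected E8 followed by a 4-byte displacement")
    # The displacement is a signed little-endian 32-bit integer.
    (rel32,) = struct.unpack_from("<i", code, 1)
    # The return address is the address of the next instruction.
    return_addr = (addr + 5) & MASK64
    # The target is relative to the next instruction, not to the call itself.
    target = (return_addr + rel32) & MASK64
    # A 64-bit (8-byte) return address is pushed, so RSP drops by 8.
    new_rsp = (rsp - 8) & MASK64
    return target, new_rsp, return_addr

# Made-up addresses: a call at 0x401000 whose displacement is -5 (calls itself).
target, rsp, ret = call_rel32(bytes.fromhex("e8fbffffff"), 0x401000, 0x7FFFFFFFE000)
print(hex(target), hex(rsp), hex(ret))
```

In a .o file the displacement is normally a placeholder that the linker fills in via a relocation, which is why disassembling unlinked code shows e8 00 00 00 00.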

Different tools behave differently in 32-bit vs. 64-bit mode. (Godbolt)

  • gcc -S: always call/ret. Nice.

  • clang -S: callq/retq and calll/retl. At least it's consistently annoying.

  • objdump -d: callq/retq (explicit 64-bit) and call/ret (implicit for 32-bit). Inconsistent and kinda dumb because 64-bit has no choice of operand-size, but 32-bit does. (Not a useful choice, though: callw truncates EIP to 16 bits.)

    Although on the other hand, the default operand size (without a REX.W prefix) for most instructions in 64-bit mode is still 32. But add $1, (%rdi) needs an operand-size suffix; the assembler won't pick 32-bit for you if nothing implies one. OTOH, push is implicitly pushq, even though pushw $1 and pushq $1 are both encodeable (and usable in practice) in 64-bit mode.

GAS in 64-bit mode will assemble callw foo / foo: to 66 e8 00 00, but my Skylake CPU single-steps it as a 6-byte instruction, consuming the 2 bytes of 00 after it and changing RSP by 8. In other words, it decodes it as callq with rel32=0, ignoring the 66 operand-size prefix. So even though there's no choice of operand-size, GNU Binutils thinks there is (tested with GAS 2.38). That makes it all the more odd that objdump uses suffixes in 64-bit mode but not 32-bit mode, since Binutils thinks the situation is the same in both modes.

Clang and llvm-objdump -d have the same bug, assembling / disassembling callw in 64-bit mode.

AMD's manual says 64-bit mode can't use 32-bit operand-size, but does not mention any limitation on using 16-bit operand-size. So perhaps GAS and LLVM are correct for AMD CPUs, and there is still the same choice of 66 prefix or not, as in 32-bit mode. (You could test by seeing if RIP = 0x1004 after single-stepping callw foo / foo: in a static executable, instead of 0x401006, with the .text section starting at 0x401000.)
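The two possible outcomes of that experiment can be worked out by hand. Here is a small Python sketch of the arithmetic. The function name and the honors_66_prefix flag are hypothetical, and the sketch assumes the two bytes following the emitted 66 e8 00 00 are also 00, as in the Skylake test above. Whether any real CPU takes the "honors" branch is exactly the open question, so treat that branch as a guess.

```python
def rip_after_callw(start: int, honors_66_prefix: bool) -> int:
    """RIP after single-stepping `callw foo / foo:` located at `start`.

    Hypothetical model, for illustration only.
    """
    code = bytes.fromhex("66e80000")  # what GAS emits for callw foo / foo:
    if honors_66_prefix:
        # Decoded as 66 E8 rel16: a 4-byte instruction, and the
        # resulting instruction pointer is truncated to 16 bits.
        rel16 = int.from_bytes(code[2:4], "little", signed=True)
        return (start + 4 + rel16) & 0xFFFF
    # Prefix ignored: decoded as 66 E8 rel32, a 6-byte instruction.
    # ASSUMPTION: the two bytes after the emitted code are 00 00.
    padded = code + b"\x00\x00"
    rel32 = int.from_bytes(padded[2:6], "little", signed=True)
    return (start + 6 + rel32) & ((1 << 64) - 1)

print(hex(rip_after_callw(0x401000, honors_66_prefix=True)))
print(hex(rip_after_callw(0x401000, honors_66_prefix=False)))
```

With .text starting at 0x401000, the first case gives 0x1004 and the second gives 0x401006, matching the two values mentioned above.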

NASM's ndisasm -b64 assumes that a 66 prefix will be ignored in 64-bit mode, disassembling 66E800000000 as call qword 0x18c (it doesn't understand ELF metadata, so I just padded with nops and found it in disassembly of a .o as if it were a flat binary, hence the unusual address.)


From Intel's instruction-set ref manual (linked above):

For a near call absolute, an absolute offset is specified indirectly in a general-purpose register or a memory location (r/m16, r/m32, or r/m64). The operand-size attribute determines the size of the target operand (16, 32 or 64 bits). When in 64-bit mode, the operand size for near call (and all near branches) is forced to 64-bits.

for rel32 ... As with absolute offsets, the operand-size attribute determines the size of the target operand (16, 32, or 64 bits). In 64-bit mode the target operand will always be 64-bits because the operand size is forced to 64-bits for near branches.

In 32-bit mode, you can encode a 16-bit call rel16 that truncates EIP to 16 bits, or a call r/m16 that uses an absolute 16-bit address. But as the manual says, the operand-size is fixed in 64-bit mode.

This is unlike the situation with push, where it defaults to 64-bit in 64-bit mode, but can be overridden to 16 with an operand-size prefix. (But not to 32 with a REX.W=0). So pushq and pushw are both available, but only callq.
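As a compact summary of that difference, here is a tiny Python sketch of how many bytes each instruction pushes in 64-bit mode, following Intel's description quoted above. The function name is made up, and the call case reflects Intel's documented behaviour only; as discussed earlier, AMD's manual leaves the 16-bit case open.

```python
def stack_bytes_64bit(mnemonic: str, has_66_prefix: bool) -> int:
    """Bytes pushed on the stack in 64-bit mode (Intel's documented behaviour).

    Hypothetical helper, for illustration only.
    """
    if mnemonic == "call":
        # Operand size is forced to 64 bits for near branches,
        # so a 66 prefix makes no difference: always callq.
        return 8
    if mnemonic == "push":
        # pushq by default, pushw with a 66 prefix; there is no 32-bit push.
        return 2 if has_66_prefix else 8
    raise ValueError(f"unsupported mnemonic: {mnemonic}")

for insn in ("call", "push"):
    for prefix in (False, True):
        print(insn, "with 66 prefix" if prefix else "no prefix", stack_bytes_64bit(insn, prefix))
```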

Peter Cordes
-1

callq refers to a relocatable call in shared libraries/dynamic libraries. The idea is to push 0, then push the symbol to search for, then call a function that searches for it on the first call. That first call replaces the entry in the program's relocation table with the actual location of the function. Subsequent calls go through the relocation table that was filled in at run time.

  • 6
    You're describing the PLT, which Unix systems traditionally use for dynamic linking (https://www.macieira.org/blog/2012/01/sorry-state-of-dynamic-libraries-on-linux/). That's optional: `gcc -fno-plt` does early binding, inlining an indirect `callq *puts@GOT(%rip)`. **The `callq` mnemonic doesn't imply use of the PLT**, and is used even for calls between functions in the same file. – Peter Cordes Jun 03 '18 at 04:31