mov to EDX is pointless: the integer return-value register is AL / AX / EAX / RAX / RDX:RAX for widths from 1 byte up to 16 bytes on x86-64. EDX or RDX is only involved for wide return values, too wide to fit in RAX. (Or in 32-bit mode, 64-bit values are returned in the EDX:EAX register pair because there is no RAX.)
This is true for all standard x86 32-bit and x86-64 calling conventions, including the i386 and x86-64 System V ABIs used on GNU/Linux.
If you're writing a main, or any function that you want to call from another file, it needs to be a .globl symbol. (Unless you .include "foo.s" instead of building separately + linking.) That's what makes it visible in the symbol table so the linker can resolve references to it, e.g. from the call main in the already-compiled code for _start, in crt1.o or similar, which you can see gcc linking if you run gcc -v foo.S. (That was an over-simplification; glibc's _start actually passes main's address as an arg to __libc_start_main, which is in libc.so.6, so there is some code from libc proper that runs before main. See "Linux x86 Program Start Up or - How the heck do we get to main()?")
If you're making a static executable without CRT, you'd normally define _start instead of main and make your own exit_group system call. But you can even skip the label: just throw instructions in a file and let the linker (ld) default to the start of the .text section as the ELF entry point when it doesn't find a _start symbol. (Use readelf -a a.out to see info like that.)
If you only plan to run the program under GDB to single-step a couple instructions you're curious about, you can even leave out the exit-cleanly part. (For this, use GDB's starti command to run with a temp breakpoint before the first user-space instruction, so you don't have to set a breakpoint manually by absolute address (because there's no symbol).)
$ cat > foo.S
mov $1 + 2, %edi # do the math at assemble time
mov $231, %eax # __NR_exit_group
syscall
$ gcc -static -no-pie -nostdlib foo.S # like as + ld manually
/usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000
$ ./a.out ; echo $?
3
$ strace ./a.out
execve("./a.out", ["./a.out"], 0x7ffe0706a3c0 /* 54 vars */) = 0
exit_group(3) = ?
+++ exited with 3 +++
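To check where the linker actually put the entry point, here is a minimal sketch that rebuilds the same three instructions under a made-up name (noentry.S) and compares the ELF header's entry address with the start of .text. It assumes x86-64 Linux with gcc and binutils; the exact address depends on your linker's defaults, so none is shown here.

```shell
# Same program as above, with no _start label at all.
cat > noentry.S <<'EOF'
        mov $1 + 2, %edi       # exit status, computed at assemble time
        mov $231, %eax         # __NR_exit_group on x86-64
        syscall
EOF

gcc -static -no-pie -nostdlib noentry.S -o noentry   # ld warns about the missing _start

# Entry point from the ELF header, and the address (VMA) of the .text section.
entry=$(readelf -h noentry | awk '/Entry point address/ {print $NF}')
text=$(objdump -h noentry | awk '$2 == ".text" {print $4}')
echo "entry point:    $entry"
echo ".text starts at: 0x$text"

status=0
./noentry || status=$?
echo "$status"                 # 3
```

The two addresses should match, which is the fallback the ld warning describes.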
If your system is 32-bit, so as defaults to 32-bit mode, use the 32-bit int $0x80 ABI instead. It uses different call numbers and different registers (call number in EAX, first argument in EBX).
Finally, what is the best resource for looking up the ops codes?
I usually leave a browser tab open to https://www.felixcloutier.com/x86/, which is an HTML scrape of Intel's vol.2 manual. The original PDF has some intro chapters on how to read the entries, so check it out if you find any of the notation confusing. There are older scrapes of Intel's manuals that leave out SIMD instructions; those are useless for me, but maybe what you want as a beginner.
Other resources are linked from the x86 tag wiki, including http://ref.x86asm.net/coder64.html which is organized by opcode, not by mnemonic, and has quick-reference columns to remind you whether an instruction reads or modifies FLAGS, and if so which, and stuff like that.