You declare this `int` in C as well? `int` is a 32-bit type in the x86-64 System V ABI, which you're using. But you're operating on qword `long` operands, after telling the compiler the upper 32 bits of the arg didn't matter. So when you pass `-5` as an `int` arg, you get `0x00000000fffffffb` in RDI, which as a 64-bit 2's complement integer is positive: 4294967291. (This assumes the upper bits are zero, e.g. because the caller used `mov $-5, %edi` like a C compiler would with a constant arg. But you could get arbitrary garbage there if the caller cast a `long` to an `int`: the caller will assume that this function ignores high bits as advertised.)
You forgot to `mov %rdi, %rax` for the case where the arg is positive. (Then do a conditional `neg %rax`.) Or actually, `mov %edi, %eax` / `neg %eax`.
So your function leaves RAX unmodified when RDI is positive. (But since the caller was probably un-optimized C from `gcc -O0`, RAX happens to also hold a copy of the arg. That explains the `abs(-5)` => `-5` behaviour, rather than the semi-random garbage you'd expect from a bug like this.)
The x86-64 System V calling convention does not require the caller to sign-extend narrow args or return values to full register width, so your `cmp` / `jl` depends on whatever garbage the caller left in the upper half of RDI. (See: Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI?) There is an undocumented convention of sign- or zero-extending narrow args to 32 bits, but not to 64.
The natural / default size for most operations is 32-bit operand-size, 64-bit address size, with implicit zero-extension to 64-bit when writing a 32-bit register. Use 32-bit operand-size unless you have a specific reason for using 64-bit (e.g. for pointers).
Look at what compilers normally do; they normally use `cmov` for a branchless `abs()`. See compiler output for `long foo(long x) { return std::abs(x); }` from a C++ compiler with optimization enabled, e.g. https://godbolt.org/z/I3NSIZ for `long` and `int` versions.
```
# clang7.0 -O3
foo(int):                       # @foo(int)
        movl    %edi, %eax
        negl    %eax            # neg sets flags according to eax = 0-eax
        cmovll  %edi, %eax      # copy the original if that made it negative
        retq
```

```
# gcc7.3 -O3
foo(int):
        movl    %edi, %edx
        movl    %edi, %eax
        sarl    $31, %edx
        xorl    %edx, %eax      # 2's complement identity hack. Maybe nice with slow cmov
        subl    %edx, %eax
        ret
```
But unfortunately gcc doesn't switch over to `cmov` even with `-march=skylake`, where it's a single uop.