Need help with deciphering 64 bit Intel Assembly instructions to C

Question

I have a function:

long foo(long x, long y, long z);

I'm given this assembly output from GCC and I'm trying to figure out what it does so I can convert it to a C function.

I'm new to assembly and don't know what's going on at all here.

foo:

subq   %rdx %rsi
imulq  %rsi %rdi 
movq   %rsi %rax 
salq   $63  %rax
sarq   $63  %rax
xorq   %rdi %rax

x,y,z are passed into registers %rdi, %rsi, & %rdx respectively

You didn't make clear what it is you need help with. Do you know what `subq` does? If so, were you able to translate the first instruction to equivalent C? If so, what did you get? If not, what was the issue with doing so? — David Schwartz, Oct 13 '16 at 18:42
hint: the shifts broadcast the low bit of rax to every bit in rax. [This answer](http://stackoverflow.com/questions/34407437/what-is-the-efficient-way-to-count-set-bits-at-a-position-or-lower/34410357#34410357) also uses an arithmetic shift as a bit-broadcast, but uses it to conditionally zero the result of something else, not to XOR it. — Peter Cordes, Oct 13 '16 at 18:47
In response to what I need help with, I don't know where to start. — ixbo45, Oct 13 '16 at 19:14
you should start with [Intel 64 and IA-32 Architectures Developer's Manual](http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-manual-325462.html) — agg3l, Oct 13 '16 at 19:49

score 2 · Answer 1 · edited Oct 14 '16 at 06:27

You need to know the x86_64 calling convention for System V ABI conformant systems (e.g. POSIX/Linux) to map the given registers to the arguments in the function prototype. Although the question already gives this mapping, it's helpful to consult a reference that explains it in more detail: http://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64

Also, from the listing itself we can tell:

- This is not Windows, because the code uses %rdi for the first argument and the Windows ABI doesn't use rdi for any of its "passed in register" arguments.
- We're using "AT&T syntax" [because of the % prefix] and not "Intel syntax" ...
- ... so the destination register is the rightmost operand of each instruction.

The "q" suffix on all instructions means "quadword" (i.e. 64 bits). Under Windows, long is 32 bits [go figure :-)], so, again, this is definitely SysV: under Windows, the first instruction would be a 32-bit one, using the 32-bit register names [and the actual registers would be different]:

    subl    %edx %esi
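
If you want to confirm the width of long on your own toolchain, a minimal sketch like the following can help. The helper name `long_bits` is made up for illustration; the only assumption is a hosted C compiler with `<limits.h>`.

```c
#include <limits.h>

/* Width of `long` in bits for whatever target this is compiled for.
 * Hypothetical helper, not part of the original question. */
int long_bits(void)
{
    return (int)(sizeof(long) * CHAR_BIT);
}
```

On a System V x86_64 target this should report 64, which matches the "q" suffixes in the listing; on 64-bit Windows it should report 32.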

Most of the asm instructions should be intuitive.

Note that because the arguments and return value are long, the operations are working on signed 64-bit integers.

Combining all this, here is a sample program:

// x86_64 calling convention:
//   arg 0: %rdi
//   arg 1: %rsi
//   arg 2: %rdx
//   arg 3: %rcx
//   arg 4: %r8
//   arg 5: %r9

long
foo(long x,long y,long z)
// x -- %rdi
// y -- %rsi
// z -- %rdx
{
    long ret;

    // subq   %rdx %rsi
    y -= z;

    // imulq  %rsi %rdi
    x *= y;

    // movq   %rsi %rax
    // (%rsi holds y, which was already reduced by z above)
    ret = y;

    // salq   $63  %rax
    ret <<= 63;

    // sarq   $63  %rax
    ret >>= 63;

    // xorq   %rdi %rax
    ret ^= x;

    return ret;
}

[Prefaced by the movq] the two shift operations are a slight "trick".

The first one salq left shifts bit 0 into bit 63, which is the sign bit. The sarq is an arithmetic right shift that shifts in the sign bit on the left. The net effect is that all bits will be set to the sign bit.
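
The shift pair can be tried on its own. This is only a sketch: the name `broadcast_low_bit` is invented here, fixed-width types stand in for a 64-bit long, and it assumes the compiler implements `>>` on negative signed values as an arithmetic shift (GCC and Clang do; the C standard leaves it implementation-defined).

```c
#include <stdint.h>

/* Copy bit 0 of v into every bit of the result:
 * all ones (-1) when v is odd, all zeros when v is even. */
int64_t broadcast_low_bit(int64_t v)
{
    /* salq $63: done on an unsigned value so that moving a 1 into
     * bit 63 is well defined in C. */
    uint64_t shifted = (uint64_t)v << 63;

    /* sarq $63: assumed to be an arithmetic right shift, which
     * replicates the sign bit into all lower bits. */
    return (int64_t)shifted >> 63;
}
```

For example, `broadcast_low_bit(7)` gives -1 and `broadcast_low_bit(6)` gives 0.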

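So, the two shifts on their own are equivalent to:

xor_mask = (y & 1) ? -1 : 0;

Here y is the value after the subtraction, since that is what the movq copied out of %rsi. The final xorq then combines this mask with the product in %rdi, so the function returns the product unchanged when the mask is 0 and its bitwise NOT when the mask is -1.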
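
Putting the whole listing together, one possible compact C rendering is sketched below. The name `foo_equiv` and the local variable names are invented for illustration, fixed-width types stand in for a 64-bit long, and the arithmetic is done on unsigned values so that wraparound matches the hardware instead of being undefined behaviour. The mask is taken from the difference y - z, because that is the value the movq copies from %rsi into %rax.

```c
#include <stdint.h>

/* Sketch of the listing: returns x * (y - z), bitwise-inverted
 * when (y - z) is odd. Assumes two's complement conversion back
 * to int64_t, as on GCC/Clang. */
int64_t foo_equiv(int64_t x, int64_t y, int64_t z)
{
    uint64_t diff = (uint64_t)y - (uint64_t)z;   /* subq  %rdx, %rsi */
    uint64_t prod = (uint64_t)x * diff;          /* imulq %rsi, %rdi (low 64 bits) */
    uint64_t mask = (diff & 1) ? UINT64_MAX : 0; /* movq + salq $63 + sarq $63 */
    return (int64_t)(prod ^ mask);               /* xorq  %rdi, %rax */
}
```

With x = 3, y = 5, z = 2 the difference is 3 (odd), the product is 9, and the result is ~9, i.e. -10. With y = 6 instead, the difference is even and the result is simply 12.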

edited Oct 14 '16 at 06:27

Peter Cordes

answered Oct 13 '16 at 20:08

Craig Estey

The last line of the question indicates they were given the calling convention arg->register mapping. Also, you might want to be clear that gcc with AT&T syntax does work on Windows, and that it's the `%` that tells us AT&T syntax, and the use of RDI to pass the first arg tells us we're using the everything-but-Windows SystemV ABI. – Peter Cordes Oct 14 '16 at 02:37
Oh also, the Windows x86-64 ABI has 32-bit `long`. But yeah, you don't really need to spend too much time identifying the calling convention, since that's given in the question. – Peter Cordes Oct 14 '16 at 02:38
@PeterCordes I've reworked the answer a bit based on your comments – Craig Estey Oct 14 '16 at 03:17
The SystemV ABI doesn't necessarily imply POSIX. Any random embedded or custom OS can (and probably does) use it. So SysV -> POSIX is only true if you consider only major desktop/server OSes, because all the non-Windows ones are more or less POSIX. Other than that, the first part is now great; would upvote again :) – Peter Cordes Oct 14 '16 at 03:21
Oh BTW, I think you have a mistake in your final line: the return value isn't just 0 or -1. That is XORed with the multiply result, so it's a conditional bitwise-NOT. Modifying your vars (instead of declaring new vars to hold new values) makes the `x&1` in your final line confusing: It refers to `x` after the modifications from your code block, not the function arg. – Peter Cordes Oct 14 '16 at 03:25
@PeterCordes You sure are holding my feet to the fire on this one :-) I was pressed for time when I posted the original. The `x&1` was intended to be _just_ about the shifts to demystify them. It wasn't intended to include the final xor. I've clarified that a bit. P.S. I had thought of you when I saw that the linked document had "the red zone" in it :-) – Craig Estey Oct 14 '16 at 03:40
Indeed, pointing out esoteric details is one of my favourite pass-times. :) The red zone comes up all the time in SO questions about doing something ill-advised with inline asm. – Peter Cordes Oct 14 '16 at 06:23
@PeterCordes _Indeed, pointing out esoteric details is one of my favourite pass-times._ That isn't very Canadian ... Are you a "troublemaker"? :-) – Craig Estey Oct 14 '16 at 06:52
