2

I'm trying to understand assembly in x86 more. I have a mystery function here that I know returns an int and takes an int argument. So it looks like int mystery(int n){}. I can't figure out the function in C however. The assembly is:

mov  %edi, %eax
lea  0x0(,%rdi, 8), %edi
sub  %eax, %edi
add  $0x4, %edi
callq < mystery _util >
repz retq

< mystery _util >
mov  %edi, %eax
shr  %eax
and  $0x1, %edi
and  %edi, %eax
retq

I don't understand what the lea does here and what kind of function it could be.

Michael Petch
  • 46,082
  • 8
  • 107
  • 198
  • 2
    Please don't post pictures of code and code, copy & paste them in your question. – Pablo Mar 18 '18 at 02:46
  • Do not post images of code. Also, which function are you talking about and make sure you show all of it. Describe what you have figured out so far and where you got stuck. – Jester Mar 18 '18 at 02:46
  • Welcome to stackoverflow.com. Please take some time to read [the help pages](http://stackoverflow.com/help), especially the sections named ["What topics can I ask about here?"](http://stackoverflow.com/help/on-topic) and ["What types of questions should I avoid asking?"](http://stackoverflow.com/help/dont-ask). Also please [take the tour](http://stackoverflow.com/tour) and [read about how to ask good questions](http://stackoverflow.com/help/how-to-ask). Lastly please learn how to create a [Minimal, Complete, and Verifiable Example](http://stackoverflow.com/help/mcve). – Some programmer dude Mar 18 '18 at 02:46
  • I always found that syntax extremely confusing. What the hell even generates disassembly that way? – Havenard Mar 18 '18 at 03:08
  • 1
    That's AT&T assembly syntax, and the GNU assembler expects it by default. – lockcmpxchg8b Mar 18 '18 at 03:12
  • @lockcmpxchg8b Good to know the creature has a name. I'm used to WinDbg, always found it's syntax very intuitive, as incredible as it may sound when speaking of assembly. – Havenard Mar 18 '18 at 03:15
  • 1
    WinDbg uses "Intel syntax", which I also prefer. The hardest part for me to keep straight with AT&T is the fact the destination is on the _right_, which just breaks my brain. – lockcmpxchg8b Mar 18 '18 at 03:23
  • 3
    @Havenard: GNU binutils defaults to AT&T syntax. Use `objdump -drwC -Mintel` to get GAS's version of Intel-syntax (which is like MASM, not NASM). – Peter Cordes Mar 18 '18 at 03:38
  • I've amended my answer with a section that removes the bool and uses a prototype of `int mystery(int n)` for function `mystery`. The first part of the answer does it with `bool` which may not be allowed based on information I was given on another question related to this. – Michael Petch Mar 19 '18 at 02:07
  • In `mystery_util` is `and $0x1, %edi` a typo and it meant to be `add $0x1, %edi`? – Michael Petch Mar 19 '18 at 03:13

3 Answers3

9

The assembly code appeared to be computer generated, and something that was probably compiled by GCC since there is a repz retq after an unconditional branch (call). There is also an indication that because there isn't a tail call (jmp) instead of a call when going to mystery_util that the code was compiled with -O1 (higher optimization levels would likely inline the function which didn't happen here). The lack of frame pointers and extra load/stores indicated that it isn't compiled with -O0

Multiplying x by 7 is the same as multiplying x by 8 and subtracting x. That is what the following code is doing:

lea  0x0(,%rdi, 8), %edi
sub  %eax, %edi

LEA can compute addresses but it can be used for simple arithmetic as well. The syntax for a memory operand is displacement(base, index, scale). Scale can be 1, 2, 4, 8. The computation is displacement + base + index * scale. In your case lea 0x0(,%rdi, 8), %edi is effectively EDI = 0x0 + RDI * 8 or EDI = RDI * 8. The full calculation is n * 7 - 4;

The calculation for mystery_util appears to simply be

n &= (n>>1) & 1;

If I take all these factors together we have a function mystery that passes n * 7 - 4 to a function called mystery_util that returns n &= (n>>1) & 1.

Since mystery_util returns a single bit value (0 or 1) it is reasonable that bool is the return type.

I was curious if I could get a particular version of GCC with optimization level 1 (-O1) to reproduce this assembly code. I discovered that GCC 4.9.x will yield this exact assembly code for this given C program:

#include<stdbool.h>

bool mystery_util(unsigned int n)
{
    n &= (n>>1) & 1;
    return n;
}

bool mystery(unsigned int n)
{
    return mystery_util (7*n+4);
}

The assembly output is:

mystery_util:
        movl    %edi, %eax
        shrl    %eax
        andl    $1, %edi
        andl    %edi, %eax
        ret
mystery:
        movl    %edi, %eax
        leal    0(,%rdi,8), %edi
        subl    %eax, %edi
        addl    $4, %edi
        call    mystery_util
        rep ret

You can play with this code on godbolt.


Important Update - Version without bool

I apparently erred in interpreting the question. I assumed the person asking this question determined by themselves that the prototype for mystery was int mystery(int n). I thought I could change that. According to a related question asked on Stackoverflow a day later, it seems int mystery(int n) is given to you as the prototype as part of the assignment. This is important because it means that a modification has to be made.

The change that needs to be made is related to mystery_util. In the code to be reverse engineered are these lines:

mov  %edi, %eax
shr  %eax

EDI is the first parameter. SHR is logical shift right. Compilers would only generate this if EDI was an unsigned int (or equivalent). int is a signed type an would generate SAR (arithmetic shift right). This means that the parameter for mystery_util has to be unsigned int (and it follows that the return value is likely unsigned int. That means the code would look like this:

unsigned int mystery_util(unsigned int n)
{
    n &= (n>>1) & 1;
    return n;
}

int mystery(int n)
{
    return mystery_util (7*n+4);
}

mystery now has the prototype given by your professor (bool is removed) and we use unsigned int for the parameter and return type of mystery_util. In order to generate this code with GCC 4.9.x I found you need to use -O1 -fno-inline. This code can be found on godbolt. The assembly output is the same as the version using bool.

If you use unsigned int mystery_util(int n) you would discover that it doesn't quite output what we want:

mystery_util:
        movl    %edi, %eax
        sarl    %eax          ; <------- SAR (arithmetic shift right) is not SHR
        andl    $1, %edi
        andl    %edi, %eax
        ret
Michael Petch
  • 46,082
  • 8
  • 107
  • 198
  • Unless I'm missing something, `n & (n>>1) & 1` is a funky way to write `n%4 == 3` (`and` / `cmp` / `setz`) , but gcc doesn't know it can optimize one to the other. https://godbolt.org/g/5Twi35 With `__attribute__((noinline))`, we get pretty similar code at -O3, still with the missed-optimization in `mystery` with the wasted `mov`. Well spotted that the lack of tail-call optimization in the code in the question implies `-O1`, not just `noinline` or something. – Peter Cordes Mar 18 '18 at 06:11
  • @PeterCordes Originally when I looked at this question I used a `noinline` as an experiment, but I got pulled back by the thought that this code was probably generated with some basic compiler options by a particular GCC. Had it not been for that I would have never bothered with this question. I was only curious if I could get an exact translation. I was bored lol – Michael Petch Mar 18 '18 at 06:14
2

The LEA is just a left-shift by 3, and truncating the result to 32 bit (i.e. zero-extending EDI into RDI implicilty). x86-64 System V passes the first integer arg in RDI, so all of this is consistent with one int arg. LEA uses memory-operand syntax and machine encoding, but it's really just a shift-and-add instruction. Using it as part of a multiply by a constant is a common compiler optimization for x86.

The compiler that generated this function missed an optimization here; the first mov could have been avoided with

lea  0x0(,%rdi, 8), %eax     # n << 3 = n*8
sub  %edi, %eax              # eax = n*7
lea  4(%rax), %edi           # rdi = 4 + n*7

But instead, the compiler got stuck on generating n*7 in %edi, probably because it applied a peephole optimization for the constant multiply too late to redo register allocation.


mystery_util returns the bitwise AND of the low 2 bits of its arg, in the low bit, so a 0 or 1 integer value, which could also be a bool.

(shr with no count means a count of 1; remember that x86 has a special opcode for shifts with an implicit count of 1. 8086 only has counts of 1 or cl; immediate counts were added later as an extension and the implicit-form opcode is still shorter.)

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
  • Actually I think only one LEA is needed. Like `lea 4(,%rdi,8), %eax` / `sub %edi, %eax` (leaving the result in EAX, not EDI). If the result *is* needed back in EDI, a `mov` is even cheaper than LEA. – Peter Cordes Oct 12 '20 at 07:32
1

The LEA performs an address computation, but instead of dereferencing the address, it stores the computed address into the destination register. In AT&T syntax, lea C(b,c,d), reg means reg = C + b + c*d where C is a constant, and b,c are registers and d is a scalar from {1,2,4,8}. Hence you can see why LEA is popular for simple math operations: it does quite a bit in a single instruction. (*includes correction from prl's comment below)

There are some strange features of this assembly code: the repz prefix is only strictly defined when applied to certain instructions, and retq is not one of them (though the general behavior of the processor is to ignore it). See Michael Petch's comment below with a link for more info. The use of lea (,rdi,8), edi followed by sub eax, edi to compute arg1 * 7 also seemed strange, but makes sense once prl noted the scalar d had to be a constant power of 2. In any case, here's how I read the snippet:

mov  %edi, %eax          ; eax = arg1
lea  0x0(,%rdi, 8), %edi ; edi = arg1 * 8
sub  %eax, %edi          ; edi = (arg1 * 8) - arg1 = arg1 * 7
add  $0x4, %edi          ; edi = (arg1 * 7) + 4
callq < mystery _util >  ; call mystery_util(arg1 * 7 + 4)
repz retq                ; repz prefix on return is de facto nop.


< mystery _util >
mov  %edi, %eax          ; eax = arg1
shr  %eax                ; eax = arg1 >> 1
and  $0x1, %edi          ; edi = 1 iff arg1 was odd, else 0
and  %edi, %eax          ; eax = 1 iff smallest 2 bits of arg1 were both 1.
retq

Note the +4 on the 4th line is entirely spurious. It cannot affect the outcome of mystery_util.

So, overall this ASM snippet computes the boolean (arg1 * 7) % 4 == 3.

lockcmpxchg8b
  • 2,205
  • 10
  • 16
  • @MichaelPetch: that's a fascinating read. I've seen obfuscators use it liberally as well. – lockcmpxchg8b Mar 18 '18 at 04:08
  • @MichaelPetch It has something to do with branch prediction, but as far as pure algorithmic logic goes it has the same effect of a simple `ret`. – Havenard Mar 18 '18 at 04:11
  • 1
    In the addressing mode syntax, `C(b,c,d)`, `d` cannot be a register, it must be a constant with the value 1, 2, 4, or 8. – prl Mar 18 '18 at 04:34
  • Assuming that ignored prefixes will stay ignored on future CPUs is not safe in general, but gcc has been using `rep ret` by default for years, so there are *many* binaries in the wild that CPU vendors won't want to break. That particular workaround for `ret` being a branch target (in this case of the `ret` in `mystery_util`) is pretty much future-proof for at least several decades. See https://stackoverflow.com/questions/20526361/what-does-rep-ret-mean. – Peter Cordes Mar 18 '18 at 05:25
  • 1
    I only got interested in this question based on your insistence that this was likely obfuscated code. I was able to reproduce a _C_ program with GCC 4.9.x that produced the exact code as the question with `-O1` optimization level. I suspect a prof simply took a simple program, compiled it with GCC using `-O1` option, took the assembly output and gave it to students to reverse engineer. – Michael Petch Mar 18 '18 at 06:30
  • With @prl's correction (i.e., why `a*7` is implemented as `a*8-a`, and your info on `repz ret`, there's only one unexplainable aspect of this code (the useless +4) --- and I totally buy that as poor code generation. I'll edit the answer to remove my speculation. – lockcmpxchg8b Mar 18 '18 at 10:12
  • Oh, good point, the `+4` doesn't affect the `n % 4` result. MichaelPetch shows compelling evidence that it was compiled with `-O1`, which is why there's no inlining. With `-O2` or higher, `mystery_util` inlines and the `+4` should optimize away. But obviously it can't if `mystery` doesn't know what `mystery_util` is going to do with its arg. i.e. it's only useless if you consider both functions together, which the compiler doesn't without inlining. (And BTW, you forgot to remove the comment in the code block on the `repz ret`. This is not obfuscated code, it's just normal gcc output!) – Peter Cordes Mar 18 '18 at 13:24