The assembly code appears to have been generated by a compiler, probably GCC, since there is a repz retq after an unconditional branch (call). The code also appears to have been compiled with -O1: mystery_util is reached with a call rather than a tail call (jmp), and higher optimization levels would likely have inlined the function, which didn't happen here. The lack of frame pointers and of extra loads/stores indicates that it wasn't compiled with -O0.
Multiplying x by 7 is the same as multiplying x by 8 and subtracting x. That is what the following code is doing:
lea 0x0(,%rdi, 8), %edi
sub %eax, %edi
LEA can compute addresses, but it can be used for simple arithmetic as well. The syntax for a memory operand is displacement(base, index, scale). Scale can be 1, 2, 4, or 8. The computation is displacement + base + index * scale. In your case lea 0x0(,%rdi, 8), %edi has no base register, so it is effectively EDI = 0x0 + RDI * 8 or EDI = RDI * 8. The sub then removes the original value of n that was saved in EAX, leaving n * 7. With the add of 4 that follows, the full calculation is n * 7 + 4.
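As a rough sketch of the arithmetic above, the instruction sequence can be mirrored step by step in C and compared against the plain expression 7 * x + 4. The function names here (times7_plus4_asm_style, times7_plus4_plain, check_forms_agree) are made up for illustration and are not part of the original program; the instruction comments follow the GCC output shown further down.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the instruction sequence one step at a time.
   Unsigned arithmetic wraps modulo 2^32, just like the 32-bit registers. */
uint32_t times7_plus4_asm_style(uint32_t x)
{
    uint32_t eax = x;        /* movl %edi, %eax        : keep a copy of x   */
    uint32_t edi = x * 8u;   /* leal 0(,%rdi,8), %edi  : index * scale      */
    edi -= eax;              /* subl %eax, %edi        : 8x - x = 7x        */
    edi += 4u;               /* addl $4, %edi          : 7x + 4             */
    return edi;
}

/* The expression a C programmer would most likely have written. */
uint32_t times7_plus4_plain(uint32_t x)
{
    return 7u * x + 4u;
}

/* Hypothetical self-check over a few inputs, including ones that wrap around. */
void check_forms_agree(void)
{
    const uint32_t samples[] = { 0u, 1u, 2u, 3u, 1000u, 0x7fffffffu, 0xffffffffu };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(times7_plus4_asm_style(samples[i]) == times7_plus4_plain(samples[i]));
}
```

Calling check_forms_agree() from a main function should pass silently, since 8x - x and 7x are the same value modulo 2^32.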
The calculation for mystery_util appears to simply be:
n &= (n>>1) & 1;
If I take all these factors together we have a function mystery that passes n * 7 + 4 to a function called mystery_util that returns n &= (n>>1) & 1.
Since mystery_util returns a single bit value (0 or 1) it is reasonable that bool is the return type.
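To see why the result is always 0 or 1, note that (n>>1) & 1 isolates bit 1 of n, and ANDing that with n keeps only bit 0. The sketch below checks this reading against a simpler test on the low two bits. The names mystery_util_sketch, low_two_bits_set and check_equivalence are illustrative assumptions, not names from the assignment.

```c
#include <assert.h>
#include <stdbool.h>

/* Same body as the reconstructed mystery_util. */
bool mystery_util_sketch(unsigned int n)
{
    n &= (n >> 1) & 1;   /* bit 0 of n AND bit 1 of n */
    return n;
}

/* Hypothetical equivalent: true only when bits 0 and 1 of n are both set. */
bool low_two_bits_set(unsigned int n)
{
    return (n & 3u) == 3u;
}

/* Compare the two forms over a small range of inputs. */
void check_equivalence(void)
{
    for (unsigned int n = 0; n < 1024u; ++n)
        assert(mystery_util_sketch(n) == low_two_bits_set(n));
}
```

Under this reading, mystery_util reports whether the two lowest bits of its argument are both 1.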
I was curious if I could get a particular version of GCC with optimization level 1 (-O1) to reproduce this assembly code. I discovered that GCC 4.9.x will yield this exact assembly code for this given C program:
#include <stdbool.h>

bool mystery_util(unsigned int n)
{
    n &= (n>>1) & 1;
    return n;
}

bool mystery(unsigned int n)
{
    return mystery_util (7*n+4);
}
The assembly output is:
mystery_util:
movl %edi, %eax
shrl %eax
andl $1, %edi
andl %edi, %eax
ret
mystery:
movl %edi, %eax
leal 0(,%rdi,8), %edi
subl %eax, %edi
addl $4, %edi
call mystery_util
rep ret
You can play with this code on godbolt.
Important Update - Version without bool
I apparently erred in interpreting the question. I assumed the person asking this question had determined by themselves that the prototype for mystery was int mystery(int n), so I thought I could change that. According to a related question asked on Stackoverflow a day later, it seems int mystery(int n) is given to you as the prototype as part of the assignment. This is important because it means that a modification has to be made.
The change that needs to be made is related to mystery_util. In the code to be reverse engineered are these lines:
mov %edi, %eax
shr %eax
EDI is the first parameter. SHR is logical shift right. Compilers would only generate this if EDI was an unsigned int (or equivalent). int is a signed type and would generate SAR (arithmetic shift right). This means that the parameter for mystery_util has to be unsigned int (and it follows that the return value is likely unsigned int). That means the code would look like this:
unsigned int mystery_util(unsigned int n)
{
    n &= (n>>1) & 1;
    return n;
}

int mystery(int n)
{
    return mystery_util (7*n+4);
}
mystery now has the prototype given by your professor (bool is removed) and we use unsigned int for the parameter and return type of mystery_util. In order to generate this code with GCC 4.9.x I found you need to use -O1 -fno-inline. This code can be found on godbolt. The assembly output is the same as the version using bool.
If you use unsigned int mystery_util(int n) you would discover that it doesn't quite output what we want:
mystery_util:
movl %edi, %eax
sarl %eax ; <------- SAR (arithmetic shift right) is not SHR
andl $1, %edi
andl %edi, %eax
ret
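The SHR versus SAR difference comes straight from the operand type of the >> operator in C. The sketch below is only an illustration with made-up function names (shift_unsigned, shift_signed, check_shifts). It assumes GCC on x86, where right-shifting a negative int is done arithmetically; the C standard leaves that case implementation-defined.

```c
#include <assert.h>

/* Unsigned operand: logical shift, the vacated top bit becomes 0 (SHR). */
unsigned int shift_unsigned(unsigned int n)
{
    return n >> 1;
}

/* Signed operand: implementation-defined for negative values in C.
   GCC on x86 uses an arithmetic shift that copies the sign bit (SAR). */
int shift_signed(int n)
{
    return n >> 1;
}

/* Hypothetical check; the -2 case relies on the GCC/x86 assumption above. */
void check_shifts(void)
{
    assert(shift_unsigned(0x80000000u) == 0x40000000u);
    assert(shift_signed(8) == 4);
    assert(shift_signed(-2) == -1);
}
```

In mystery_util the shifted value is masked with 1 afterwards, so only bit 1 of n survives and both shifts would return the same result. The parameter type still matters for matching the original assembly because it decides which shift instruction the compiler emits.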