I'm to convert the following AT&T x86 assembly into C:

  movl 8(%ebp), %edx
  movl $0, %eax
  movl $0, %ecx
  jmp .L2
.L1:
  shll $1, %eax
  movl %edx, %ebx
  andl $1, %ebx
  orl %ebx, %eax
  shrl $1, %edx
  addl $1, %ecx
.L2:
  cmpl $32, %ecx
  jl   .L1
  leave

But I must adhere to the following skeleton code:

int f(unsigned int x) {
    int val = 0, i = 0;
    while(________) {
        val = ________________;
        x = ________________;
        i++;
    }
    return val;
}

I can tell that the snippet

.L2:
  cmpl $32, %ecx
  jl   .L1

can be interpreted as while(i<32). I also know that x is stored in %edx, val in %eax, and i in %ecx. However, I'm having a hard time converting the assembly within the while/.L1 loop into condensed high-level language that fits into the provided skeleton code. For example, can shll, shrl, orl, and andl simply be written using their direct C equivalents (<<,>>,|,&), or is there some more nuance to it?

Is there a standardized guide/"cheat sheet" for Assembly-to-C conversions?

I understand assembly to high-level conversion is not always clear-cut, but there are certainly patterns in assembly code that can be consistently interpreted as certain C operations.

velkoon
  • 1
    Shifts are not *necessarily* 'bitwise operators'. A well-established use is `x<<1` to stand for "multiply by 2". A C compiler might even translate an explicit multiplication by a power of 2 into a shift. – Jongware Feb 07 '18 at 10:10
  • 1
    [This playground](https://godbolt.org/) may help. – Jabberwocky Feb 07 '18 at 10:12
  • 1
    The answer to [SO: Is there a complete x86 assembly language reference that uses AT&T syntax? (closed)](https://stackoverflow.com/a/1776587/7478597) mentions [x86 Assembly Language Reference Manual](https://docs.oracle.com/cd/E19253-01/817-5477/817-5477.pdf). – Scheff's Cat Feb 07 '18 at 10:13
  • @usr2564301 The 'bitwise operators' is a de facto standard name for the operators `| & ^ << >> ~` (and their compound assignment versions). No matter how they are used by the program. – Lundin Feb 07 '18 at 10:13
  • 1
    "Is there a standardized guide/"cheat sheet" for Assembly-to-C conversions?" Hardly. The usual is the opposite way. :-) – Scheff's Cat Feb 07 '18 at 10:14
  • I see you are asking many questions about decompiling. Are you implementing something like IDA's Hex-Rays? If so, take a look at Avast's RetDec decompiler, which is open source. – arrowd Feb 07 '18 at 10:16
  • @arrowd Yes, I should start learning to use IDA, thanks for the suggestion – velkoon Feb 07 '18 at 10:38
  • @Scheff Thanks, that should prove quite helpful. – velkoon Feb 07 '18 at 10:40
  • @usr2564301 - That's the exact sort of information I was looking for, thanks. I'm sure there's more similar hints on interpreting assembly code out there.. – velkoon Feb 07 '18 at 10:42
  • Why do you keep asking and deleting questions on the same subject? – fuz Feb 07 '18 at 15:16
  • I’d be surprised if you had evidence to back that statement. I wrote 1 question about assembly last week and deleted that after it was downvoted and I was told the question was unconventional (“weird”). That’s the only deleting I have done recently. – velkoon Feb 07 '18 at 20:27

2 Answers

For example, can shll, shrl, orl, and andl simply be written using their direct C equivalents (<<,>>,|,&), or is there some more nuance to it?

They can. Let's examine the loop body step by step:

  shll $1, %eax    // shift eax left by 1: "eax <<= 1", same as "eax *= 2"
  movl %edx, %ebx  // ebx = edx
  andl $1, %ebx    // ebx &= 1
  orl %ebx, %eax   // eax |= ebx
  shrl $1, %edx    // logical shift right of edx by 1: "edx >>= 1", same as "edx /= 2" for unsigned values

gets us to

  %eax *=2
  %ebx = %edx        
  %ebx = %ebx & 1       
  %eax |= %ebx     
  %edx /= 2

The ABI tells us that 8(%ebp) is the first argument, so movl 8(%ebp), %edx puts x in %edx, and %eax (the return value register) is val:

  val *=2
  %ebx = x           // a
  %ebx = %ebx & 1    // b
  val |= %ebx        // c
  x /= 2

combine a,b,c: #1 insert a into b:

  val *=2
  %ebx = (x & 1)  // b
  val |= %ebx     // c
  x /= 2

combine a,b,c: #2 insert b into c:

  val *=2
  val |= (x & 1)
  x /= 2

final step: combine both 'val =' into one

  val = 2*val | (x & 1)
  x /= 2
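
Dropped into the skeleton, the two derived statements can be sketched as below. This is only an illustration under stated assumptions: the name f_sketch is hypothetical, uint32_t stands in for a 32-bit unsigned int, and val is declared unsigned (the skeleton declares it int) so that doubling it never overflows a signed type.

```c
#include <stdint.h>

/* Hypothetical name; mirrors the skeleton's f, but with unsigned val
 * (an assumption made here to keep the arithmetic well-defined). */
uint32_t f_sketch(uint32_t x) {
    uint32_t val = 0;
    int i = 0;
    while (i < 32) {
        val = 2 * val | (x & 1); /* shll, then movl/andl/orl */
        x = x / 2;               /* shrl */
        i++;
    }
    return val;
}
```

With these assumptions, f_sketch(1) yields 0x80000000 and f_sketch(6) yields 0x60000000: the lowest bit of x ends up as the highest bit of val, and so on.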
Tommylee2k
  • Excellent explanation. Quite easy to understand now! I didn't realize it was that simple. Feel free to ignore this next question, but I'm now asked to "Describe in English what function f computes:" and was wondering if you happened to immediately recognize what the function was doing in layman's terms. – velkoon Feb 07 '18 at 14:07
  • 1
    left shifting one register 32 times, while right shifting the other. My guess is this function reverses the order of x's bits – Tommylee2k Feb 07 '18 at 14:30

`while (i < 32) { val = (val << 1) | (x & 1); x = x >> 1; i++; }`, except that val and the return value should be unsigned, and they aren't in your template. The function returns the bits of x in reversed order.

The actual answer to your question is more complicated and is pretty much: no, there is no such guide, and it can't exist, because compilation loses information and you can't recreate that lost information from the assembly. But you can often make a good educated guess.
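
As a small illustration of that information loss, here are two spellings of the same loop step. The function names are hypothetical, and uint32_t is assumed to stand in for a 32-bit unsigned int. Both return the same value for every input, so an optimizing compiler may well emit the same instructions for either, and the assembly alone would not reveal which one the author wrote.

```c
#include <stdint.h>

/* Arithmetic spelling: double val, then add the low bit of x. */
uint32_t step_arith(uint32_t val, uint32_t x) {
    return 2u * val + (x % 2u);
}

/* Bitwise spelling: shift val left, then OR in the low bit of x. */
uint32_t step_bits(uint32_t val, uint32_t x) {
    return (val << 1) | (x & 1u);
}
```

The two agree because doubling an unsigned value always leaves its lowest bit clear, so adding a 0 or 1 there is the same as OR-ing it in.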

Art