How do I convert assembly code to C code?

Question

I need some help with this problem. I am trying to reverse engineer C code from assembly code, but my compiled output keeps being off by one instruction and I am confused as to why. The assembly code I am given (with my notes after #) is this.

   0x00005555555545fa <+0>: push   %rbp
   0x00005555555545fb <+1>: mov    %rsp,%rbp #int x,y
=> 0x00005555555545fe <+4>: movl   $0x2b,-0xc(%rbp) #set x = 43
   0x0000555555554605 <+11>:    movl   $0x2fd,-0x8(%rbp) #set y = 765
   0x000055555555460c <+18>:    mov    -0xc(%rbp),%edx #load x into %edx 
   0x000055555555460f <+21>:    mov    -0x8(%rbp),%eax #load y into %eax
   0x0000555555554612 <+24>:    add    %edx,%eax #add x and y
   0x0000555555554614 <+26>:    mov    %eax,-0x4(%rbp) # y = x + y
   0x0000555555554617 <+29>:    mov    $0x0,%eax
   0x000055555555461c <+34>:    pop    %rbp
   0x000055555555461d <+35>:    retq

Here is my current code and the resulting assembly code

int main(){
        int x,y;
        x = 43;
        y = 765;
        x = y;
        y = y + x;
}

   0x00005555555545fa <+0>: push   %rbp
   0x00005555555545fb <+1>: mov    %rsp,%rbp
=> 0x00005555555545fe <+4>: movl   $0x2b,-0x8(%rbp)
   0x0000555555554605 <+11>:    movl   $0x2fd,-0x4(%rbp)
   0x000055555555460c <+18>:    mov    -0x4(%rbp),%eax
   0x000055555555460f <+21>:    mov    %eax,-0x8(%rbp)
   0x0000555555554612 <+24>:    mov    -0x8(%rbp),%eax
   0x0000555555554615 <+27>:    add    %eax,-0x4(%rbp)
   0x0000555555554618 <+30>:    mov    $0x0,%eax
   0x000055555555461d <+35>:    pop    %rbp
   0x000055555555461e <+36>:    retq

I am having a really hard time figuring out how to 'load x into %edx' and then add %eax and %edx. Any help is appreciated c:

Make an attempt yourself. Consult the instruction set reference about what each instruction does, and edit your question with what you think. Hint: `-x(%rbp)` are local variables. — Jester, Jul 14 '20 at 23:29
You can look up instructions in Intel's manual (e.g. HTML extract here: https://www.felixcloutier.com/x86/idiv). `idivl` is of course AT&T syntax for idiv with dword operand-size. Disassemble as Intel Syntax instead of AT&T if you understand that better. See also [When and why do we sign extend and use cdq with mul/div?](https://stackoverflow.com/q/36464879) — Peter Cordes, Jul 15 '20 at 03:55
A different compiler or compiler version might implement the same logic a different way. 2 loads and an add reg,reg is not meaningfully different from a memory-destination add. Remember that this code does nothing with the results anyway, so would just optimize to `xor %eax,%eax` / `ret` if compiled with optimization enabled. It's also not significant that your compiler output uses different offsets from RBP than the original; both are using space in the red-zone below RSP that's part of this function's stack frame. — Peter Cordes, Jul 16 '20 at 21:51

score 1 · Answer 1 · answered Jul 14 '20 at 23:42

There is no easy-to-use tool to do such a conversion. (Going the other direction is simple: compile some high level code into assembly.)

While there are some examples of translations like the one in your question, in general it is too complex a problem for too little reward. It is somewhat like picking up the debris after a vehicle crash and working out what condition the vehicle(s) were in before the crash. The NTSB frequently does exactly that after a major accident, but the amount of effort is considerable.

wallyk

Actually ghidra (https://ghidra-sre.org/) comes with a decompiler that gives a passable representation of the source code. – thurizas Jul 14 '20 at 23:48
Avast's RetDec (Retargetable Decompiler), which is open source and based on LLVM, also does a decent job of generating reasonable source code. – fpmurphy Jul 17 '20 at 15:53