1

Does anyone know how to convert C++ code to assembly code and then do the reverse? The forward direction is very easy:

g++ -S

I want to analyze the output and see whether it has been compiled correctly (just out of curiosity for now, but it could have some applications). However, my knowledge of assembly is very limited and the output is hard to understand, especially if I use optimizations (-O) or compile with debug info (-g).

Is there a de-assembler for C++ (GCC) to produce C++ code? If not, is there any intermediate representation that I can compile C++ code into and then back from it?

There seem to be some ways of converting C++ to C here. Does GCC have anything for this?

Shayan Pooya
  • 3
    After you compile a C++ program, there is no way to get back the source verbatim from the assembly for all but the most trivial programs (unless your compiler does something weird like including the source code in the binary). You can get a very rough approximation, but that's it. – Seth Carnegie Nov 01 '11 at 02:53
  • 4
    For the PC you can use the `-masm=intel` option to get more readable (conventional) syntax – Cheers and hth. - Alf Nov 01 '11 at 02:54
  • I hope there is some tool that can generate the C++ version of Java bytecodes that can be de-assembled easily. If it is possible for Java then it is possible for C++ – Shayan Pooya Nov 01 '11 at 02:56
  • 4
    What is your purpose in doing this? If it's actually just to verify the compiler works, I wouldn't bother. If it's for any other reason, I suggest you look into learning assembly. – jli Nov 01 '11 at 03:00
  • 1
    @Shayan: "If it is possible for Java then it is possible for C++" No, it's not actually. C++ and Java are not the same thing, and in terms of compilation models are worlds apart. – Nicol Bolas Nov 01 '11 at 03:03

4 Answers

4

De-compiling assembly language back to C++ is possible (e.g., with HexRays), within some constraints -- primarily that although the C++ you get out will reflect the basic algorithms correctly, it probably won't look much like the original source code (though C++ name mangling does help maintain something closer to the original than you usually get with many other languages).
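To illustrate the point about name mangling: the symbol names in the assembly still encode namespaces, class names and parameter types, and they can be decoded. The following is a minimal sketch, assuming GCC or Clang with the Itanium C++ ABI (other compilers mangle differently). `demangle` is a hypothetical helper name; `abi::__cxa_demangle` comes from `<cxxabi.h>`.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (assumption: GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the readable form of a mangled symbol,
// or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);  // the returned buffer is allocated with malloc
    return result;
}
```

For example, `demangle("_Z3addii")` gives `add(int, int)`: the function name and parameter types survive compilation, while the return type, parameter names and the function body do not.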

Jerry Coffin
  • HexRays seems to be a reasonable solution, although the cost starts from $600. – Shayan Pooya Nov 01 '11 at 03:10
  • @Shayan: Yes, compared to free tools, the cost is fairly steep. OTOH, if you're using it professionally (or paying somebody else to), it can pay for itself *very* quickly. – Jerry Coffin Nov 01 '11 at 03:28
2

The latter question ("is there any intermediate representation that I can compile C++ code into and then back from it?") sounds like the AST produced by Clang.
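As a rough sketch of what that looks like in practice, the small function below can be fed to Clang with the commands shown in the comments. The file name `example.cpp` and the function `clamp_to_range` are made up for illustration; the `-ast-dump` and `-ast-print` options are passed through to the Clang front end via `-Xclang`.

```cpp
// example.cpp (hypothetical file name)
//
// Dump the AST as a tree:
//   clang++ -Xclang -ast-dump -fsyntax-only example.cpp
// Pretty-print the AST back as C++ source:
//   clang++ -Xclang -ast-print -fsyntax-only example.cpp
//
// In the dump, this function should show up as a FunctionDecl with three
// ParmVarDecl children and a CompoundStmt holding the if/return statements.
int clamp_to_range(int value, int low, int high) {
    if (value < low) return low;
    if (value > high) return high;
    return value;
}
```

Note that the AST is still a front-end structure built from the source. It is not something you can recover from assembly or from an object file.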

MSalters
1

Perhaps you might be interested in learning more about the internal representations used by GCC, in particular GIMPLE (and Tree-s). If you want to take advantage of GCC's numerous processing passes around GIMPLE, you should consider writing a GCC plugin or a GCC MELT extension (MELT is a high-level domain-specific language for extending GCC easily).

But all the middle-end internal representations of the C++ compilers I know about are quite far from the C++ source code, because the C++ front-end has already done a lot of work, and there is no easy way to go back to some useful C++.
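To see how far GIMPLE is from the source, you can ask GCC to dump it. This is only a sketch: the file name `example.cpp` and the function `sum` are made up, and the exact name of the dump file varies between GCC versions.

```cpp
// example.cpp (hypothetical file name)
//
// Dump the GIMPLE form next to the object file:
//   g++ -c -fdump-tree-gimple example.cpp
// This writes a file named something like example.cpp.*.gimple.
//
// In the dump, the range-based for loop below no longer exists as such;
// it typically appears as explicit iterator calls, temporaries and gotos.
#include <vector>

int sum(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}
```

The dump is readable C-like text, but GCC has no option to compile it back into the C++ you started with.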

Faithful and complete decompilation is in practice nearly impossible, because the assembly code generated by a compiler has lost some knowledge from the original source code.
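A small illustration of that lost knowledge, with made-up function names: the two functions below are written differently but compute the same thing. With optimization enabled, a compiler will typically emit the same or nearly the same instructions for both, so a decompiler has no way to tell which spelling the author used. I have not checked this against any particular GCC version, so compare the `g++ -O2 -S` output yourself.

```cpp
// Two spellings of the same computation: the sum 1 + 2 + ... + n.

int sum_to_for(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

int sum_to_while(int n) {
    int total = 0;
    int i = 1;
    while (i <= n) {
        total += i;
        ++i;
    }
    return total;
}
```

Local variable names, comments, the choice of loop construct, typedefs and templates are all gone by the time the assembly is generated (unless debug info is kept).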

Basile Starynkevitch
0

Use objdump -d to disassemble a compiled object. Other than that you can't get much more information back out of it (and definitely not the original source). I'd trust the compiler if I were you.
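For example, assuming GNU binutils and a file called `example.cpp` (a made-up name, as is the `geometry::area` function), the commands in the comments below produce a disassembly that is somewhat easier to follow:

```cpp
// example.cpp (hypothetical file name)
//
// Compile to an object file, keeping debug info:
//   g++ -g -O1 -c example.cpp -o example.o
// Disassemble with demangled C++ names:
//   objdump -d -C example.o
// Interleave the source lines with the instructions (needs -g):
//   objdump -d -C -S example.o
// On x86, add -M intel for Intel syntax instead of AT&T.
namespace geometry {

int area(int width, int height) {
    return width * height;
}

}  // namespace geometry
```

With `-C`, the function is listed as `geometry::area(int, int)` rather than under its mangled name. The `-S` output only works because the debug info points back at the source file, so it is not a way to recover source you don't already have.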

jli