I want to decompile a DLL that I believe was written in C. How can I do this?
3 Answers
Short answer: you can't.
Long answer: The compilation process for C/C++ is very lossy. The compiler applies many high- and low-level optimizations, and the resulting machine code often bears little resemblance to the original source. Furthermore, there are different compilers on the market (each with several active versions), and each generates slightly different output. Without knowing which compiler was used, decompiling becomes even more hopeless. At best, I've heard of some tools that can give you a partial decompilation, with bits of C code recognized here and there, but you're still going to have to read through a lot of assembly code to make sense of it.
That, by the way, is one of the reasons why copy protection on software is difficult to crack and requires specialized assembly skills.

Note: .NET and Java are completely different beasts, and decompilation is possible to a far greater degree there (though code obfuscators exist on the market that can befuddle decompilers). – Vilx- Dec 22 '11 at 09:06
If I know that the C DLL is accessing certain `.jar` files, then surely these would need to be hardcoded into the DLL and would come up in a decompilation? – Gary Jones Dec 22 '11 at 09:10
Hard to say. It's possible to add arbitrary "resources" to a .DLL/.EXE file. If the file is stored that way, you don't need to decompile anything: there are resource viewer tools that can extract it for you. Then you can decompile the .jar file, which is Java. But there are other homebrew ways of embedding files into .dll files, and if it uses one of those, you're out of luck. Maybe you can grab these files while the program is running, if they are extracted to a temporary folder or something? – Vilx- Dec 22 '11 at 09:16
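As a rough illustration of the comment above: a .jar is a ZIP archive, and each ZIP entry starts with the signature bytes `PK\x03\x04`, so one cheap first check is to scan the DLL's raw bytes for that signature. This is only a sketch. The names `find_zip_signatures` and `make_demo_blob` are made up for the example, the "DLL" here is a fake byte blob built in memory, and a hit only tells you where to look. It will not find a jar that has been compressed, encrypted, or otherwise transformed before embedding.

```python
import io
import zipfile

# Signature that starts every local file header in a ZIP (and therefore JAR) archive.
ZIP_LOCAL_HEADER = b"PK\x03\x04"


def find_zip_signatures(data: bytes) -> list[int]:
    """Return every offset in `data` where a ZIP local file header signature appears."""
    offsets = []
    pos = data.find(ZIP_LOCAL_HEADER)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(ZIP_LOCAL_HEADER, pos + 1)
    return offsets


def make_demo_blob() -> bytes:
    """Build a fake 'DLL' (hypothetical): 102 filler bytes, a tiny in-memory ZIP, trailing bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("Hello.class", b"demo")
    return b"MZ" + b"\x00" * 100 + buf.getvalue() + b"trailer"


if __name__ == "__main__":
    # For a real file you would read its bytes instead, e.g. open(path, "rb").read()
    print(find_zip_signatures(make_demo_blob()))
```

If the scan finds nothing, that does not prove the jar is absent. It may simply be stored in one of the homebrew ways mentioned above.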
It is possible, but extremely difficult, and it will take a ginormous amount of time even if you're well versed in C, assembly, and the intricacies of the operating system this code is supposed to run on.
The problem is that optimization makes compiled code hardly recognizable or understandable for humans.
Further, there will be ambiguities if the disassembler loses information. For example, the same instruction can be encoded in different ways; if the rest of the code depends on a particular encoding, and the disassembler (or its user) fails to take that into account, the resulting disassembly becomes incomplete or incorrect.
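To make the encoding point concrete, here is a minimal sketch of a toy decoder for one tiny corner of x86: the register-to-register `mov` in 32-bit mode with no prefixes. Opcode `0x89` is `MOV r/m32, r32` and opcode `0x8B` is `MOV r32, r/m32`, so the byte pairs `89 D8` and `8B C3` both mean `mov eax, ebx`. The function name `decode_mov_reg_reg` is made up for this example, and it is nowhere near a real disassembler.

```python
# 32-bit general-purpose registers in x86 encoding order (0..7).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_reg_reg(code: bytes) -> str:
    """Decode a 2-byte register-to-register MOV (32-bit mode, no prefixes assumed)."""
    if len(code) != 2:
        raise ValueError("expected exactly two bytes: opcode and ModRM")
    opcode, modrm = code
    mod = modrm >> 6         # top two bits: addressing mode
    reg = (modrm >> 3) & 7   # middle three bits: 'reg' operand
    rm = modrm & 7           # low three bits: 'r/m' operand
    if mod != 3:
        raise ValueError("only register-to-register forms are handled")
    if opcode == 0x89:       # MOV r/m32, r32: destination is r/m, source is reg
        dst, src = rm, reg
    elif opcode == 0x8B:     # MOV r32, r/m32: destination is reg, source is r/m
        dst, src = reg, rm
    else:
        raise ValueError("not a MOV opcode handled by this sketch")
    return f"mov {REG32[dst]}, {REG32[src]}"


if __name__ == "__main__":
    print(decode_mov_reg_reg(bytes.fromhex("89d8")))
    print(decode_mov_reg_reg(bytes.fromhex("8bc3")))
```

Both calls produce the same text, so a listing that only shows `mov eax, ebx` has already discarded which of the two encodings was in the binary. Reassembling that listing may yield different bytes, which matters if anything else in the program depends on the exact byte values.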
Self-modifying code complicates matters as well.
See this question for more on the topic and available tools.

Self-modifying code was hip on the C64. Modern processors will have problems with caching and multithreading when code modifies itself. I'm not sure whether C/C++ compilers can generate self-modifying code (but you still get +1). – LittleFunnyMan Dec 22 '11 at 16:10
You can, but only to a certain extent:
- Optimizations may have changed the code
- Symbols might have been stripped (a DLL allows the functions inside it to be referenced by index instead of by symbol)
- Some instruction combinations might not be convertible to C
- and some other things I might forget...
