Are there any ways to decompile a dll and/or a .pyd file in order to extract source code written in Python?

Thanks in advance

Youssef Imam
  • This is not the same question (a DLL is not a .pyc file)! In my opinion, no, you can't. Maybe you could convert it to assembly code, but how would you follow the data/function tree? – dsgdfg Feb 24 '16 at 14:23
  • Thanks for your assistance. – Youssef Imam Feb 25 '16 at 16:06
  • @Torxed this is indeed not the same question, pyd files are produced by Cython. Unlike pyc/bytecode, the pyd files are binaries that do not contain the original python source. Although it does not protect from reverse-engineering, it does make it a lot harder, and prevents someone from obtaining the exact original sources of your code. – Overdrivr Mar 28 '18 at 15:15

2 Answers

I assume the .pyd/.dll files were created in Cython, not Python?

Anyway, generally it's not possible unless there's a decompiler designed specifically for the language the file was originally compiled from. And while I know about C, C++, Delphi, .NET and some other decompilers, I've yet to hear about a Cython decompiler.

Of course, what Cython does is convert your Python[esque] code into C code first, which means you might have more luck finding a C decompiler and then divining the original Python code based on the decompiled C code. At the very least, this way you'll be dealing with translation from one (relatively) high-level language to another.

Worst-case scenario, you'll have to use a disassembler. However, recreating Python code from a disassembler's output isn't going to be easy (pretty similar to divining the biological functions of a brain from the chemical formulas of the proteins that make up its cells).
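Before reaching for a disassembler, a quick first look at what survives in the binary is to list its printable strings, much like the Unix `strings` tool does. The sketch below is only an illustration: the function name `printable_strings` and the sample bytes are made up for this example, and what you actually find in a given .pyd depends on how it was built. Compiled extension modules typically keep identifier and exception names as string constants, which is not the same thing as keeping the source.

```python
import re


def printable_strings(data, min_len=4):
    """Return runs of printable ASCII characters at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # Hypothetical stand-in for the bytes of a compiled module.
    # For a real file, read it with open(path, "rb").read() instead.
    sample = b"\x00\x01whathappenswhenerror\x00\xff\x10ab\x00ZeroDivisionError\x00"
    for text in printable_strings(sample):
        print(text)
```

Names recovered this way can help you orient yourself in a disassembly, but they do not give you back the logic of the original Python code.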

You might look at this question for ideas and suggestions regarding various decompilers and disassemblers, and continue your investigation from there.

Lav
  • Thanks for the helpful reply, I'll make sure to do some extra research. – Youssef Imam Feb 25 '16 at 16:07
  • @YoussefImam I don't agree with this answer, see my answer http://stackoverflow.com/a/41075212/1422096 – Basj Dec 10 '16 at 11:53
  • @Basj Reading your updated answer, it seems that `.pyd` files compiled by Cython do not have source code embedded into them after all. If you could show it to be otherwise, I would be really interested, but until then, I'll stand by my answer. – Lav Dec 12 '16 at 11:56
  • @Basj, is it possible to reverse engineer the C-code then, there is this software that I need to reverse engineer, it is written in Python and has .pyd files in it. – Krishnan Venkiteswaran Sep 23 '17 at 16:08

I don't agree with the accepted answer: it seems that the content of the source code is accessible even in a .pyd.

Let's see, for example, what happens when an error occurs:

1) Create this file:

whathappenswhenerror.pyx

A = 6 
print 'hello'
print A
print 1/0 # this will generate an error

2) Compile it with python setup.py build:

setup.py

from distutils.core import setup
from Cython.Build import cythonize
setup(ext_modules = cythonize("whathappenswhenerror.pyx"), include_dirs=[])

3) Now import the .pyd file in a standard python file:

testwhathappenswhenerror.py

import whathappenswhenerror

4) Let's run it with python testwhathappenswhenerror.py. Here is the output:

hello 
6 
Traceback (most recent call last):
  File "D:\testwhathappenswhenerror.py", line 1, in <module>
    import whathappenswhenerror
  File "whathappenswhenerror.pyx", line 4, in init whathappenswhenerror (whathappenswhenerror.c:824)
    print 1/0 # this will generate an error 
ZeroDivisionError: integer division or modulo by zero

As you can see, the line of code `print 1/0 # this will generate an error` from the .pyx source file is displayed! Even the comment is displayed!

4 bis) If I delete the original .pyx file (or move it somewhere else) before step 3), then the original code `print 1/0 # this will generate an error` is no longer displayed:

hello
6
Traceback (most recent call last):
  File "D:\testwhathappenswhenerror.py", line 1, in <module>
    import whathappenswhenerror
  File "whathappenswhenerror.pyx", line 4, in init whathappenswhenerror (whathappenswhenerror.c:824)
ZeroDivisionError: integer division or modulo by zero
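The difference between the two outputs is consistent with how Python builds tracebacks: the running code only carries a filename and a line number, and the source text is looked up on disk at display time (through the standard `linecache` module). The sketch below reproduces the effect with plain Python and `compile()` rather than with Cython, so it is an analogy under that assumption, not a trace of what Cython does internally. The names `SOURCE`, `run_and_format`, `demo` and the temporary filename are made up for this example.

```python
import linecache
import os
import tempfile
import traceback

# The code is compiled from this string, never from the file on disk.
SOURCE = "A = 6\nA / 0  # this will generate an error\n"


def run_and_format(path):
    """Run SOURCE compiled under the filename `path`; return the traceback text."""
    code = compile(SOURCE, path, "exec")
    try:
        exec(code, {})
    except ZeroDivisionError:
        linecache.checkcache(path)  # forget cached lines if the file changed or vanished
        return traceback.format_exc()
    return ""


def demo():
    """Return the traceback with the source file present, then with it deleted."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "whathappenswhenerror_demo.py")
        with open(path, "w") as handle:
            handle.write(SOURCE)
        with_file = run_and_format(path)
        os.remove(path)
        without_file = run_and_format(path)
    return with_file, without_file


if __name__ == "__main__":
    with_file, without_file = demo()
    print(with_file)
    print(without_file)
```

In both runs the traceback names the file and line 2, but the text of the failing line only appears while the file exists, which matches the behaviour seen in steps 4 and 4 bis.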

But does this mean it's not included in the .pyd? I'm not sure.
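One way to check this directly is to search the bytes of the compiled module for the text of the source line. The sketch below assumes the text would be stored uncompressed as UTF-8 or UTF-16, which is an assumption and would miss compressed or obfuscated data. The helper names and the example path are hypothetical, and the actual filename of the built module depends on your platform and build.

```python
def bytes_contain_text(data, text, encodings=("utf-8", "utf-16-le")):
    """Return True if `text`, in any of the given encodings, occurs in `data`."""
    return any(text.encode(encoding) in data for encoding in encodings)


def file_contains_text(path, text):
    """Read a binary file and search it for `text`."""
    with open(path, "rb") as handle:
        return bytes_contain_text(handle.read(), text)


# Example call (hypothetical path, adjust to the module you actually built):
# file_contains_text("whathappenswhenerror.pyd", "this will generate an error")
```

If the call returns False for the comment text while the filename `whathappenswhenerror.pyx` is found, that supports the view that the module only stores a reference to the source file and not its contents.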

Basj
  • Moving the `pyx` shows that the Traceback uses some sort of link from the `pyd` (`so` in Linux) to lines in the `pyx`. The code is not in the `pyd`. – hpaulj Dec 10 '16 at 18:49
  • @Basj, if the pyx file is in same location and when you do `import whathappenswhenerror`, how do you know it's importing `.pyd` and not `pyx`? Probably it's importing `pyx` and hence the code display on error. – krsoni May 22 '18 at 10:22
  • You can tell that it is using the `pyx` file by looking at the traceback (`File "whathappenswhenerror.pyx", line 4`) – Minion Jim Mar 09 '19 at 17:43
  • Precisely, this answer is not correct. It seems, it's importing from .pyx – kursun Sep 07 '20 at 09:08
  • [An answer to a different question that convincingly shows that this answer is wrong](https://stackoverflow.com/a/62389335/4657412) – DavidW Dec 09 '20 at 08:59
  • @Basj: Your answer shows that you have never done reverse engineering. – Elmue Jul 13 '23 at 21:13
  • @Elmue I wonder what is the purpose of your comment? Make me feel inferior? I doubt this comment about what I have done or not done in my life is really useful for our community... – Basj Jul 14 '23 at 05:59
  • Your answer is completely wrong. And your answer does not show how to get Python code back. Just displaying the name of a *.py file is NOT code! A stack trace is not code. It is not possible to decompile Python code to the original. If you had ever done REAL reverse engineering, you would have opened a PYD file in a disassembler like IDA Pro and seen that you get nothing of the original code back. – Elmue Jul 15 '23 at 20:46
  • @Elmue Ok, fine, then just downvote my answer. No need to elaborate about "what I have done" or "never done". `It is not possible to decompile Python code to the original`: *this* sentence is wrong. You can, for example with a .pyc file (different context from OP's question). I have already done it. – Basj Jul 21 '23 at 15:08
  • Nobody who sells a Python application will be so stupid as to provide the PYC files. They are not available. And if you read the question correctly, it asks about DLL/PYD files. The question does not ask about PYC files. So your answer stays wrong. You can delete it. – Elmue Jul 24 '23 at 19:15
  • @Elmue This is not true. For your information, many applications have been distributed with their .pyc files included (or in a derivative form). Please read posts about the decompilation of early versions of the Dropbox Windows client (coded in Python, with a modified interpreter; some people still achieved a decompilation). Anyway, this is off topic, so I'll stop the conversation here. – Basj Aug 08 '23 at 08:50