1

I have written a Python script that takes an input data file, processes the data, and writes another data file as output.

I now need to distribute my code, but users should not be able to see the source; they should only be able to provide the input and get the output. I have never done this before.

I would appreciate any advice on how to achieve this in the easiest way.

Thanks a lot in advance

user8224662

5 Answers

3

Python is an interpreted language by design, and it compiles source to bytecode, which does not help with concealment because bytecode is comparatively easy to reverse. So there is no truly secure way to hide your source code so that it cannot be recovered, which is true of any programming language, really.

If you had wanted a language that can't be reversed so easily, you should have chosen a native language that compiles directly to the target architecture's machine code. That is significantly harder to turn back into the original language, let alone read, because of compiler optimizations, the complexity of CISC instruction sets, et cetera.

However, some libraries convert your source code into an executable format (by packing the Python interpreter and the bytecode together), such as:

  • cx_Freeze - for freezing code written for Python 2.7 or later, on any platform, allegedly.
  • PyInstaller - for freezing general-purpose code; it additionally states that it works with third-party libraries.
  • py2exe - for freezing code into a Windows-only executable format.
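As a rough sketch of what using one of these tools looks like, the snippet below builds and runs a PyInstaller command from Python. It assumes PyInstaller has been installed with pip; the script name `process_data.py` and the helper names are hypothetical. `--onefile` and `--name` are standard PyInstaller options.

```python
import subprocess
import sys


def pyinstaller_command(script, name=None, onefile=True):
    """Build the argument list for a PyInstaller run (nothing is executed here)."""
    cmd = [sys.executable, "-m", "PyInstaller"]
    if onefile:
        cmd.append("--onefile")  # bundle everything into a single executable
    if name:
        cmd += ["--name", name]  # name of the generated executable
    cmd.append(script)
    return cmd


def build(script, **options):
    """Run PyInstaller; requires `pip install pyinstaller` beforehand."""
    subprocess.run(pyinstaller_command(script, **options), check=True)


if __name__ == "__main__":
    # Hypothetical script name; only the command is printed here.
    print(" ".join(pyinstaller_command("process_data.py", name="process_data")))
```

Calling `build("process_data.py")` would run the command; by default PyInstaller writes the result to a `dist` folder next to the script.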

Alternatively, you might consider code obfuscation, which still gives the user the source code but makes it nearly impossible to read.

However, one issue with this is that the code becomes harder to extend, since poor obfuscation techniques can effectively make it static. Obfuscated code can also carry overhead from redundant code meant to trick the reader into thinking it does something it does not.

Also, in general, it goes against the open-source practice that the Python community loves and supports.

So, to conclude for anyone who skipped the above: the first thing you did wrong was to choose Python for this, a language that supports open source and is open source itself. To mitigate the issue, either reconsider the language or try the modules listed above, which can help with basic source code concealment.

uspectaculum
  • Does pyinstaller hide the source code then after converting to .exe? – Lightsout Feb 13 '21 at 02:58
  • PyInstaller seems to hide the source code. I could not find any *.py in the output folder. (I created a folder, not a single file.) – Günter Mar 04 '22 at 19:22
2

First, since Python is an interpreted language, I don't think you can completely protect your Python code: .pyc files can be decompiled back into .py files (using uncompyle6, for example).

So the only thing you can do is make it very hard to read.
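To illustrate why a .pyc file protects so little, here is a minimal sketch that compiles a small module to .pyc and reads it back using only the standard library. The module contents and file names are made up for the example, and the 16-byte header size is an assumption that holds for CPython 3.7 and later. This is not a decompiler like uncompyle6; it only shows that names and constants survive compilation.

```python
import dis
import marshal
import os
import py_compile
import tempfile
import types

# Hypothetical module standing in for the code to be distributed.
SOURCE = '''
def scale(values, factor=2.5):
    secret_note = "internal calibration constant"
    return [v * factor for v in values]
'''


def load_code_from_pyc(pyc_path):
    """Return the module code object stored in a .pyc file."""
    with open(pyc_path, "rb") as f:
        f.read(16)  # skip the header (assumed 16 bytes, CPython 3.7+)
        return marshal.load(f)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "process_data.py")
    with open(src, "w") as f:
        f.write(SOURCE)
    pyc = py_compile.compile(
        src, cfile=os.path.join(tmp, "process_data.pyc"), doraise=True
    )
    module_code = load_code_from_pyc(pyc)

# The function body is stored as a nested code object among the constants.
func_code = next(
    c for c in module_code.co_consts
    if isinstance(c, types.CodeType) and c.co_name == "scale"
)

print(func_code.co_varnames)  # local variable names are preserved
print(func_code.co_consts)    # so are string and numeric constants
dis.dis(func_code)            # readable disassembly of the bytecode
```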

I recommend having a look at code obfuscation, which consists of making your code unreadable by renaming variables and functions, removing comments and docstrings, removing unnecessary whitespace, etc. Pyminifier does that kind of thing. You can also write your own obfuscation script.

Then you can also turn your program into a single executable (using PyInstaller, for example). I am pretty sure there is a way to get the .py files back from the executable, but it makes it harder. Also beware of cross-platform compatibility when building an executable.
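As a starting point for a home-made obfuscation script, the sketch below removes docstrings and comments by round-tripping the source through the `ast` module. It assumes Python 3.9 or later (for `ast.unparse`); the class and function names are hypothetical, and this is not how Pyminifier is implemented. Renaming identifiers is left out because it needs much more care to avoid breaking callers.

```python
import ast


class StripDocstrings(ast.NodeTransformer):
    """Remove the leading docstring from modules, classes and functions."""

    def _strip(self, node):
        self.generic_visit(node)  # handle nested definitions first
        body = node.body
        if (
            body
            and isinstance(body[0], ast.Expr)
            and isinstance(body[0].value, ast.Constant)
            and isinstance(body[0].value.value, str)
        ):
            # Keep the body non-empty so the result is still valid Python.
            node.body = body[1:] or [ast.Pass()]
        return node

    visit_Module = visit_ClassDef = _strip
    visit_FunctionDef = visit_AsyncFunctionDef = _strip


def minify(source):
    """Return source without docstrings; comments vanish during parsing."""
    tree = StripDocstrings().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+


SAMPLE = '''
def scale(values, factor):
    """Multiply every value by the calibration factor."""
    # tuned by hand, do not change
    return [v * factor for v in values]
'''

if __name__ == "__main__":
    print(minify(SAMPLE))
```

The output still behaves the same as the input; it is only harder to understand, not impossible.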

Marv
2

Going through the responses above, my understanding is that some of the strategies mentioned may not work if your client wants to execute your protected script along with other unprotected scripts.

Another option is to encrypt your script and then use an interpreter that can decrypt and execute it. This approach has some limitations too.

ipepycrypter is a suite that helps protect Python scripts by hiding their implementation through encryption. The encrypted script is executed by a modified Python interpreter. ipepycrypter consists of the encryption tool ipepycrypt and the Python interpreter ipepython.

More information is available at https://ipencrypter.com/user-guides/ipepycrypter/
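The general encrypt-then-load idea can be sketched in plain Python as below. This is only a toy illustration of the concept and says nothing about how ipepycrypter actually works: the cipher is a simple hash-based XOR stream that should not be treated as secure, and all function names are hypothetical.

```python
import hashlib
from itertools import count


def _keystream(key):
    """Yield an endless stream of bytes derived from the key (toy construction)."""
    for block in count():
        yield from hashlib.sha256(key + block.to_bytes(8, "big")).digest()


def xor_bytes(data, key):
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(b ^ k for b, k in zip(data, _keystream(key)))


def run_protected(blob, key, namespace=None):
    """Decrypt the script in memory and execute it, returning its namespace."""
    namespace = {} if namespace is None else namespace
    source = xor_bytes(blob, key).decode("utf-8")
    exec(compile(source, "<protected>", "exec"), namespace)
    return namespace


if __name__ == "__main__":
    script = b"def double(x):\n    return 2 * x\n"
    blob = xor_bytes(script, b"demo-key")  # what would be shipped
    ns = run_protected(blob, b"demo-key")  # what the loader does
    print(ns["double"](21))
```

In a scheme like this sketch, the loader has to hold the key and the decrypted source ends up in memory, so it raises the effort needed to read the code rather than making it impossible.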

Taher A. Ghaleb
ipe
1

One other option, of course, is to expose the functionality over the web, so that the user can interact through the browser without ever having access to the actual code.
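A minimal sketch of this approach using only the standard library is shown below. The `process` function is a placeholder for the real processing (here it just sums the numbers on each line), and the port and names are arbitrary choices for the example. A real deployment would more likely use a proper web framework and add authentication and input limits.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def process(text):
    """Placeholder for the real processing: sum the numbers on each line."""
    return "\n".join(
        str(sum(float(x) for x in line.split()))
        for line in text.splitlines()
        if line.strip()
    )


class ProcessHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The client uploads the input file contents as the request body.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        try:
            result, status = process(body), 200
        except ValueError as exc:
            result, status = f"bad input: {exc}", 400
        payload = result.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the example quiet


def demo():
    """Start the server on a free local port, send one request, stop it."""
    server = HTTPServer(("127.0.0.1", 0), ProcessHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        request = urllib.request.Request(url, data=b"1 2 3\n0.5 1\n", method="POST")
        # Bypass any configured proxy for this local request.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request) as response:
            return response.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The user only ever sees the request and the response; the source stays on the server.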

greymatter
1

There are several tools which compile Python code into either (a) compiled modules usable with CPython, or (b) a self-contained executable.

https://cython.org/ is the best known and probably the oldest, and it takes only a small amount of effort to prepare a traditional Python package so that it can be compiled with Cython.

http://numba.pydata.org/ and https://pythran.readthedocs.io/ can also be used in this way to produce compiled Python modules, so that the source doesn't need to be distributed and it will be very difficult to decompile the distributable back into usable source code.

https://mypyc.readthedocs.io is a newer player, an offshoot of the mypy toolkit.

Nuitka is the most advanced at creating a self-contained executable. https://github.com/Nuitka/Nuitka/issues/392#issuecomment-833396517 shows that it is very hard to decompile code once it has passed through Nuitka.

https://github.com/indygreg/PyOxidizer is another tool worth considering, as it creates a self-contained executable containing all the needed packages. By default, only basic IP protection is provided, in that the packages inside it are not trivial to inspect, although someone with a bit of knowledge of the tool can easily see the packages enclosed within the binary. It is possible, however, to add custom module loaders so that the "modules" in the binary can be stored in unintelligible formats.

Finally, there are many Python to C/Go/Rust/etc. transpilers. These will very likely only be usable for small subsets of the language (e.g. will 3/0 throw the appropriate exception in the target language?), will likely support only a very limited subset of the standard library, and are unlikely to support imports of packages beyond the standard library. One example is https://github.com/py2many/py2many , but a search for "Python transpiler" will give you many to consider.
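To make the 3/0 point concrete, here is a small sketch of Python behaviours that a transpiled program would have to reproduce to stay faithful. Only the division-by-zero case comes from the text above; the other checks are further illustrations added for this example, and the function name is hypothetical.

```python
def python_semantics_checks():
    """Collect a few results that differ from many compiled languages."""
    results = {}
    try:
        3 / 0
    except ZeroDivisionError as exc:
        # Python raises a catchable exception instead of crashing or returning inf.
        results["3 / 0"] = type(exc).__name__
    results["2 ** 64"] = 2 ** 64  # ints have arbitrary precision, no overflow
    results["-7 // 2"] = -7 // 2  # floor division rounds toward negative infinity
    results["-7 % 2"] = -7 % 2    # remainder takes the sign of the divisor
    return results


if __name__ == "__main__":
    for expression, value in python_semantics_checks().items():
        print(f"{expression} -> {value}")
```

Running the same checks against the output of a transpiler is a quick way to see how much of the language it really covers.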

John Vandenberg