
I'm trying to set up encryption for part of a project that I'm working on. Right now, I'm trying an implementation similar to this. As an extra layer of security, I have all of my Python source code compiled into .pyd files via Cython. What I want to do is decrypt a .pyd module, import it, and then delete the decrypted file immediately afterwards while keeping the module available in memory. How do I do this?


Here is the code I'm using for encryption/decryption:
from hashlib import md5
from Crypto.Cipher import AES
from Crypto import Random

def derive_key_and_iv(password, salt, key_length, iv_length):
    # Chain MD5 digests of (previous digest + password + salt) until there
    # are enough bytes for both the key and the IV.
    d = d_i = b''
    while len(d) < key_length + iv_length:
        d_i = md5(d_i + password.encode() + salt).digest()
        d += d_i
    return d[:key_length], d[key_length:key_length+iv_length]

def encrypt(in_file, out_file, password, key_length=32):
    bs = AES.block_size
    salt = Random.new().read(bs - len('Salted__'))
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    out_file.write('Salted__'.encode() + salt)
    finished = False
    while not finished:
        chunk = in_file.read(1024 * bs)
        if len(chunk) == 0 or len(chunk) % bs != 0:
            padding_length = (bs - len(chunk) % bs) or bs
            chunk += (padding_length * chr(padding_length)).encode()
            finished = True
        out_file.write(cipher.encrypt(chunk))

def decrypt(in_file, out_file, password, key_length=32):
    bs = AES.block_size
    salt = in_file.read(bs)[len('Salted__'):]
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    next_chunk = b''
    finished = False
    while not finished:
        chunk, next_chunk = next_chunk, cipher.decrypt(in_file.read(1024 * bs))
        if len(next_chunk) == 0:
            padding_length = chunk[-1]
            chunk = chunk[:-padding_length]
            finished = True
        out_file.write(chunk)
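
The padding step in encrypt and the stripping step in decrypt follow the PKCS#7 pattern: append N bytes, each with value N, and later read the count back from the last byte. A minimal sketch of that rule in isolation, with no cipher involved, is below. The names pad, unpad and BLOCK_SIZE are illustrative and do not appear in the code above.

```python
BLOCK_SIZE = 16  # AES block size in bytes (assumed constant for this sketch)


def pad(data, block_size=BLOCK_SIZE):
    """Append N bytes of value N so the length is a multiple of block_size."""
    # When the data is already aligned, a full extra block of padding is added,
    # so the last byte always tells the reader how much to remove.
    padding_length = block_size - len(data) % block_size
    return data + bytes([padding_length]) * padding_length


def unpad(data):
    """Strip the padding by reading the count from the last byte."""
    return data[:-data[-1]]


original = b"def square(x): return x * x"
padded = pad(original)
print(len(padded), padded[-1])
print(unpad(padded) == original)
```

This is also why decrypt holds one chunk back in next_chunk: it can only strip the padding once it knows it is looking at the final chunk.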

I am using a test module called test_me_pls with the following code, which is compiled into test_me_pls.pyd:

def square(x):
    return x * x


To test, I have done the following:
>>> import os
>>> origpath = r'C:\test\test_me_pls.pyd'
>>> encpath = r'C:\test\blue.rrr'
>>> decpath = r'C:\test\test_me_pls.pyd'
>>> key = 'hello'
>>> with open(origpath,'rb') as inf, open(encpath,'wb') as outf:
...     encrypt(inf,outf,key)
>>> os.remove(origpath)
>>> import test_me_pls
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'test_me_pls'

So far so good. I got the file encrypted and deleted the unencrypted .pyd file. Just to make sure it worked how I was expecting, I tried to import the deleted/encrypted module and the import failed as expected. Next, I decrypt, import the module, and test it out:

>>> with open(encpath,'rb') as inf, open(decpath,'wb') as outf:
...     decrypt(inf,outf,key)
>>> import test_me_pls
>>> test_me_pls.square(5)
25

Okay, decryption worked. Now that I have the module imported, I want to delete the unencrypted .pyd file:

>>> os.remove('test_me_pls.pyd')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
PermissionError: [WinError 5] Access is denied: 'test_me_pls.pyd'

Okay. Not what I was hoping for. And just to beat a dead horse:

>>> test_me_pls.__file__
'.\\test_me_pls.pyd'
>>> os.remove(test_me_pls.__file__)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
PermissionError: [WinError 5] Access is denied: '.\\test_me_pls.pyd'

Is there a way to do this?


NOTE: I understand that if someone really wants to get at my code, it doesn't matter that I have it compiled and encrypted, because if they know what they're doing they can still get to the unencrypted .pyd file by using a debugger and setting a breakpoint at the appropriate spot, decompiling the .pyd file, and so on (insert reverse-engineering techniques here). What I do want is something that will prevent casual attempts to look at the code in this module. The lock on the front door to my house can easily be picked by someone with the right knowledge and tools, but that doesn't mean I don't want to bother locking my door.

2nd NOTE: If it makes any difference, I'm using Python 3.3

johnny_be
  • *"something that will prevent casual attempts to look at the code in this module"* - you mean like distributing a `.pyd`? – jonrsharpe Jul 15 '15 at 22:05
  • @jonrsharpe Yes. I am doing that, but I want one more extra layer of annoyance that somebody would have to work through to get at the code. – johnny_be Jul 15 '15 at 22:39
  • An interesting approach. A guess, but I think what happens is that you are hitting OS limits: on Windows you can't remove a file if it is opened by some process. I'm pretty sure that this would work fine on Linux. What you could do is a) figure out how you can manually close all references to the `.pyd` file so it can be removed b) see if there is a way around this OS level limitation. – rth Jul 15 '15 at 22:57
  • related: [Dropbox used a modified python interpreter to import encrypted Python code](https://github.com/kholia/dedrop) – jfs Jul 16 '15 at 12:44

1 Answer

I found a workaround that ignores the specific PermissionError problem above but that solves the problem of decrypting a Python module and then running it from within Python without leaving a decrypted copy of the module on the disk for any amount of time.

Using this answer as a starting point, I decided to modify the decrypt function above to return a string rather than write to a file. Then I can import the string as a module and run the code from within Python--all while never saving the decrypted module to disk. Unfortunately, in order to accomplish this, I had to start with an encrypted .py file rather than an encrypted and compiled .pyd file like I originally wanted. (If anyone knows how to import a compiled .pyd from a bytes object, please let me know!)


Anyway, here are the specific changes I made to get it working.

First, the new decrypt function:

def decrypt(in_file, password, key_length=32):
    bs = AES.block_size
    salt = in_file.read(bs)[len('Salted__'):]
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    next_chunk = b''
    b = b''
    finished = False
    while not finished:
        chunk, next_chunk = next_chunk, cipher.decrypt(in_file.read(1024 * bs))
        if len(next_chunk) == 0:
            padding_length = chunk[-1]
            chunk = chunk[:-padding_length]
            finished = True
        b += chunk
    return b.decode() # Note: will only work if unencrypted file is text, not binary

Just to be thorough, let's encrypt the original file, delete the unencrypted file, and then decrypt, returning the source as a string, s:

>>> import os
>>> origpath = r'C:\test\test_me_pls.py'
>>> encpath = r'C:\test\blue.rrr'
>>> key = 'hello'
>>> with open(origpath,'rb') as inf, open(encpath,'wb') as outf:
...     encrypt(inf,outf,key)
>>> os.remove(origpath)
>>> import test_me_pls
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'test_me_pls'
>>> with open(encpath,'rb') as inf:
...     s = decrypt(inf,key)

Great! No unencrypted copy of the source is saved to disk. Now to import the string as a module:

>>> import sys, imp
>>> test_me_pls = imp.new_module('test_me_pls')
>>> exec(s, test_me_pls.__dict__)
>>> test_me_pls.square(7)
49
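
On newer Python versions the imp module is deprecated (and removed in 3.12), so the same steps can be sketched with types.ModuleType instead. This is only a sketch under assumptions: module_from_source is a hypothetical helper name, and the source string below stands in for the text returned by decrypt. Since compile accepts bytes as well as str, the decoded string is not strictly required for Python source.

```python
import sys
import types


def module_from_source(name, source):
    """Build a module object from source text (str or bytes) held in memory."""
    module = types.ModuleType(name)
    # The filename is a placeholder; it only shows up in tracebacks.
    code = compile(source, "<decrypted {}>".format(name), "exec")
    exec(code, module.__dict__)
    # Register the module so later `import name` statements reuse it.
    sys.modules[name] = module
    return module


# Stand-in for the decrypted source returned by decrypt().
source = "def square(x):\n    return x * x\n"
test_me_pls = module_from_source("test_me_pls", source)
print(test_me_pls.square(7))  # prints 49
```

This only covers Python source. It does not load a compiled .pyd from memory, which still has to go through a file on disk.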

Then, for good measure, we can register the module in sys.modules so that subsequent import statements reuse it instead of searching the disk:

>>> sys.modules['test_me_pls'] = test_me_pls

And then because making it more difficult to read the source code was our original goal, let's get rid of the string that contains it:

>>> del s

And finally, just as a sanity check, make sure the module still works after the source string is gone:

>>> test_me_pls.square(3)
9


If anyone has any alternative solutions (especially how to fix that Windows PermissionError problem or import a binary .pyd from a bytes object rather than from a file), please let me know!
johnny_be
  • just wondering if u had any more luck with this issue since 16... any idea if pyd can be now imported from str/byte maybe? I tried lately but I failed badly, then again I'm not very good ^^ TIA – Dariusz Jan 08 '21 at 22:31
  • @Dariusz I never figured out how to do it starting from `.pyd`, unfortunately. – johnny_be Jan 11 '21 at 18:54