Python security: Danger of uncollected variables out of scope

Question

I have a method in a class which decrypts a variable, and returns it. I remove the returned variable with "del" after use.

What is the danger of these garbage values being accessed...and how can I best protect myself from them?

Here is the code:

import decrypter
import gc

# mangled variable names used
def decrypt(__var):
    __cleartext = decrypter.removeencryption(__var)
    return __cleartext

__p_var = "<512 encrypted password text>"
__p_cleartext = decrypt(__p_var)
# ... do login with __p_cleartext ...
del __p_var, __p_cleartext
gc.collect()

Could any of the variables, including __var and __cleartext be exploited at this point?

Thanks!

I've done a little more googling. Before I spend a few hours going down the wrong path...what I'm hearing is:

Store the password as a salted hash on the system (which it is doing now).
The salt for the hash should be entered by the user at suite start (being done now).
However, the salt should be held in a C process and not in Python.
The Python script should pass the hash to the C process for decryption.

The python script is handling the login for a mysql database, and the password is needed to open the DB connection.

If the code were along the lines of...

# MySQLdb.connect(host, user, password, database)
mysql_host = 'localhost'
mysql_db = 'myFunDatabase'
hashed_user = '\xghjd\xhjiw\xhjiw\x783\xjkgd6\xcdw8'
hashed_password = 'ghjkde\xhu78\x8y9tyk\x89g\x5de56x\xhyu8'
db = MySQLdb.connect(mysql_host, <call_c(hashed_user)>, <call_c(hashed_password)>, mysql_db)

Would this resolve (at least) the issue of python leaving garbage all over?

P.s. I also found the post about memset (Mark data as sensitive in python) but I'm assuming if I use C to decrypt the hash, this is not helpful.

P.P.S. The decrypter is currently a Python script. If I were to add memset to the script and then "compile" it using py2exe or pyinstaller...would this actually do anything to help protect the password? My instincts say no, since all pyinstaller does is package up the normal interpreter and the same bytecode the local interpreter creates...but I don't know enough about it.

So...following Aya's suggestion of making the encryption module in C, how much of a discernible memory footprint would the following setup leave? Part of the big issue is that the ability to decrypt the password must remain available throughout the run of the program, as it will be called repeatedly...it's not a one-time thing.

Make a C object which is started when the user logs in. It contains the decryption routine and holds a copy of the salt entered by the user at login. The stored salt is obscured in the running object (in memory) by having been hashed by its own encryption routine using a randomly generated salt.

The randomly generated salt would still have to be held in a variable in the object too. This is not really to secure the salt, but just to try and obfuscate the memory footprint if someone should take a peek at it (making the salt hard to identify). I.e. c-obj

mlock() /*to keep the code memory resident (no swap)*/

char encrypt(data, salt){ 
    (...) 
    return encrypted_data
}

char decrypt(data, salt){ 
    (...) 
    return decrypted_data
}

stream_callback(stream_data){
    return decrypt(stream_data, decrypt(s-gdhen, jhgtdyuwj))
}

void main{ 
    char jhgtdyuwj=rand();
    s-gdhen = encrypt(<raw_user_input>, jhgtdyuwj);
}

Then, the python script calls the C object directly, which passes the unencrypted result right into the MySQLdb call without storing any returns in any variable. I.e.

#!/usr/bin/python
encrypted_username = 'feh9876\xhu378\x&457(oy\x'
encrypted_password = 'dee\x\xhuie\xhjfirihy\x^\xhjfkekl'
# MySQLdb.connect(host, username, password, database)
db = MySQLdb.connect(self.mysql_host,
                     c-obj.stream_callback(encrypted_username),
                     c-obj.stream_callback(encrypted_password),
                     self.mysql_database)

What kind of memory footprint might this leave which could be snooped?

It's worth noting that removing the name doesn't guarantee the object will be removed - other names to it could still exist. — Gareth Latty, May 27 '13 at 16:37
If your concern is unauthorized access to the cleartext password, there are probably many ways to access it after it's supposedly been gc'd. If you're particularly paranoid, you may want to do that part in C, and overwrite the RAM with random data afterwards. — Aya, May 27 '13 at 16:52
Regarding your edits: even if you implemented the whole thing in C, and zeroed the process address space when you were done, there's no guarantee that the OS hasn't paged out the section of the address space which contained the cleartext password, so it could potentially remain on disk for a very long time. And even if you disabled virtual memory, if someone can access the memory space of the process, there will always be some opportunity to grab the cleartext password. TBH, I wouldn't worry about it - if the password is just for MySQL, there are much easier ways to bypass its security. — Aya, May 28 '13 at 12:38
Is there some reason you're particularly concerned about password security for this? Is it likely that random people will have access to the system which is running this Python code? — Aya, May 28 '13 at 12:59
@aya: OK. Not sure how much info you'll need. I am mostly concerned about people ending up on the system, although it is a hardened centos6. For reasons I won't go into, a process will need to log in and out of a database (MySQL) repeatedly. The username AND password for the database are held in a file as 512bit encrypted text. To un-encrypt, you need the same "salt" used to encrypt the text. The salt is not kept on the system, but has to be entered by the user into the script at startup. Thus it does have to be held in the script, and the user/pass decrypted each login. — RightmireM, May 29 '13 at 08:28
@aya: My biggest concern is: if a user gains access (even root) to the system...there's nothing in a file that will help them. I don't even think the MySQLdb exploit you used would work, because the script won't pass the login credentials without the salt...which has to be passed in when the script is started. BUT, the salt will live in a variable in memory. As far as I can tell, the only way to get this variable would be to sniff out or dump out the memory (or check the swap file...but using memlock in C should prevent it from being swapped). So, that's what I'm trying to protect. — RightmireM, May 29 '13 at 08:28

Raymond Hettinger · Answer 1 · 2013-05-27T17:01:29.263

If no other references to the value exist, then your gc.collect() call normally destroys the object.

However, something as simple as string interning or caching may keep an unexpected reference, leaving the value alive in memory. Python has a number of implementations (CPython, Jython, PyPy) that do different things internally. The language itself makes very few guarantees about whether or when the value would actually get erased from memory.
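To illustrate the point about unexpected references, here is a minimal sketch. The `Secret` class and the `cache` list are hypothetical stand-ins (a plain `str` cannot be weakly referenced), and the second result relies on CPython's reference counting; other implementations may free the object later.

```python
import gc
import weakref


class Secret:
    """Hypothetical stand-in for a sensitive value."""

    def __init__(self, value):
        self.value = value


def demo():
    secret = Secret("hunter2")
    cache = [secret]             # a second, easily forgotten reference
    probe = weakref.ref(secret)  # lets us observe whether the object is alive

    del secret                   # removes the name only
    gc.collect()
    alive_after_del = probe() is not None    # the cache still holds the object

    cache.clear()                # drop the last strong reference
    gc.collect()
    alive_after_clear = probe() is not None  # freed on CPython at this point

    return alive_after_del, alive_after_clear


result = demo()
```

Even once the object has been freed, nothing here overwrites the bytes it occupied; freeing only makes the memory available for reuse.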

In your example, you also use name mangling. Because the mangling is easily reproduced by hand, this doesn't add any security at all.
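As a small sketch of why mangling adds nothing: the `Vault` class and its `__token` attribute below are made up for illustration. Note also that mangling only happens inside a class body, so module-level names such as `__p_var` in the question are not mangled at all.

```python
class Vault:
    def __init__(self):
        # Inside a class body, __token is rewritten to _Vault__token.
        self.__token = "s3cret"


def read_mangled(obj):
    # Anyone can rebuild the mangled name by hand from the class name.
    return getattr(obj, "_" + type(obj).__name__ + "__token")


vault = Vault()
leaked = read_mangled(vault)

# At module level the name is stored exactly as written; no mangling occurs.
__module_level = "not mangled"
```

The attribute also shows up under its mangled name in `vars(vault)`, so it can be found without knowing the name in advance.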

One further thought: it isn't clear what your security model is. If the attacker can call your decrypt function and run arbitrary code in the same process, what would prevent them from wrapping decrypt to keep a copy of the inputs and outputs?
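A rough sketch of that kind of wrapping, assuming the attacker can already run code in the same process. The `decrypt` function here is a trivial placeholder (it just reverses its input), not the real routine, and `spy` and `captured` are names invented for the example.

```python
import functools


def decrypt(ciphertext):
    # Placeholder for the real decryption routine (assumption for this sketch).
    return ciphertext[::-1]


captured = []  # everything that passes through decrypt ends up here


def spy(func):
    """Wrap func so that its arguments and return value are recorded."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        captured.append((args, kwargs, result))
        return result
    return wrapper


# Rebinding the name is all it takes; callers cannot tell the difference.
decrypt = spy(decrypt)
password = decrypt("drowssap")
```

After the call, `captured` holds both the encrypted input and the cleartext output, regardless of how carefully the caller deletes its own variables.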

edited May 27 '13 at 17:01

answered May 27 '13 at 16:47

The decrypt actually uses a salt-key which is held in another running process. This process is started when the user logs in and the salt has to be actively passed in from the terminal (it doesn't live on a file in the server). Since the variable in this process must remain alive, it's probably at more risk to be discovered...but it's inevitable that all the components be available at one time at some point. I'm just hoping by spreading them around a bit, I obfuscate it. – RightmireM May 27 '13 at 17:36
I chose mangled variables mostly to prevent the variable name being guessed. AKA I assumed "password = decrypt(encoded_password)" is easier to sniff out than "__hidden_p_varble_weird_name__ = decrypt(__inputpasswordstringvarfromlocarea51)". But I'm new to writing secure python...so, maybe this means nothing. – RightmireM May 27 '13 at 17:36
@BurningKrome It does mean nothing. Name mangling is of zero security value. The purpose of name mangling is to provide a standard pattern for creating class local references (a reference inside a class that can be presumed to not be overridden by subclasses). – Raymond Hettinger May 27 '13 at 18:02
@Aya: Please see my edits above. Am I on the right learning track here? Thanks! – RightmireM May 28 '13 at 13:40
@BurningKrome Can you reply to my comments on the question, rather than on this answer, otherwise I don't get notified. – Aya May 28 '13 at 16:30

Radiance Wei Qi Ong · Answer 2 · 2013-05-27T16:59:40.590

Even if you call gc.collect and those strings are deallocated, they might still remain in memory. Also, strings are immutable, which means you have no (standard) way of overwriting them. Also note that if you have performed operations on those strings some copies of them might be lying around.

So don't use strings if possible.

You need to overwrite the memory (and even then, the memory might be dumped somewhere, like into a page file). Use a bytearray and overwrite the memory when you're done.
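A minimal sketch of the bytearray approach; the `wipe` helper and the sample value are made up for illustration.

```python
def wipe(buf):
    """Overwrite a mutable buffer in place with zero bytes."""
    for i in range(len(buf)):
        buf[i] = 0


secret = bytearray(b"hunter2")  # hypothetical cleartext in a mutable buffer
try:
    # ... use the secret here, without converting it to str or bytes ...
    length_used = len(secret)
finally:
    wipe(secret)  # runs even if the code above raises
```

This only clears this one buffer. Any `str` or `bytes` copy made along the way (for example, to pass the password to a library that expects a string) is immutable and cannot be wiped like this, and resizing the bytearray may leave an old copy behind.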

Aya · Accepted Answer · 2013-05-29T13:28:30.963

Any security system is only as strong as its weakest link.

It's difficult to tell what the weakest link is in your current system, since you haven't really given any details on the overall architecture, but if you're actually using Python code like you posted in the question (let's call this myscript.py)...

#!/usr/bin/python
encrypted_username = 'feh9876\xhu378\x&457(oy\x'
encrypted_password = 'dee\x\xhuie\xhjfirihy\x^\xhjfkekl'
# MySQLdb.connect(host, username, password, database)
db = MySQLdb.connect(self.mysql_host,
                     c-obj.stream_callback(encrypted_username),
                     c-obj.stream_callback(encrypted_password),
                     self.mysql_database)

...then regardless of how or where you decrypt the password, any user can come along and run a script like this...

import MySQLdb

def my_connect(*args, **kwargs):
    print args, kwargs
    return MySQLdb.real_connect(*args, **kwargs)

MySQLdb.real_connect = MySQLdb.connect
MySQLdb.connect = my_connect
execfile('/path/to/myscript.py')

...which will print out the plaintext password, so implementing the decryption in C is like putting ten deadbolts on the front door, but leaving the window wide open.

If you want a good answer on how to secure your system, you'll have to provide some more information on the overall architecture, and what attack vectors you're trying to prevent.

If someone manages to hack root, you're pretty much screwed, but there are better ways to conceal the password from non-root users.

However, if you're satisfied that the machine you're running this code on is secure (in the sense that it can't be accessed by any 'unauthorized' users), then none of this password obfuscation stuff is necessary - you may as well just put the cleartext password directly into the Python source code.

Update

Regarding architecture, I meant, how many separate servers are you running, what responsibilities do they have, and how are they meant to communicate with each other, and/or the outside world?

Assuming the primary goal is to prevent unauthorized access to the MySQL server, and assuming MySQL runs on a different server to the Python script, then why are you more concerned about someone gaining access to the server running the Python script, and getting the password for the MySQL server, rather than gaining access to the MySQL server directly?

If you're using a 'salt' as a decryption key for the encrypted MySQL password, then how does an authorized user pass that value to the system? Do they have to log in to the server via, say, ssh, and run the script from the command line, or is this something accessible via, say, a webserver?

Either way, if someone does compromise the system running the Python script, they merely have to wait until the next authorized user comes along, and 'sniff' the 'salt' they enter.

OK. Not sure how much info you'll need. I am mostly concerned about people ending up on the system, although it is a hardened centos6. For reasons I won't go into, a process will need to log in and out of a database (MySQL) repeatedly. The username AND password for the database are held in a file as 512bit encrypted text. To un-encrypt, you need the same "salt" used to encrypt the text. The salt is not kept on the system, but has to be entered by the user into the script at startup. Thus it does have to be held in the script, and the user/pass decrypted each login. — RightmireM, May 28 '13 at 20:25
My biggest concern is: if a user gains access (even root) to the system...there's nothing in a file that will help them. I don't even think the MySQLdb exploit you used would work, because the script won't pass the login credentials without the salt...which has to be passed in when the script is started. BUT, the salt will live in a variable in memory. As far as I can tell, the only way to get this variable would be to sniff out or dump out the memory (or check the swap file...but using memlock in C should prevent it from being swapped). So, that's what I'm trying to protect. — RightmireM, May 28 '13 at 20:31
Thanks. Good points :-) The server is standalone LAMP. Running the MySQL database, and a web server...along with the python script. It does not communicate with other servers. The plan is for user to be able to login via web. Still working on that security issue :-) Honestly, there's two thoughts behind this question: 1. Although this is a real server handling real (and very important) user data...this is also an educational experience for me in running secure scripting. — RightmireM, May 29 '13 at 15:50
2. My security philosophy is; no single point of entry is ever secure (sometimes even if you shut the server off :D). BUT...the more levels of security there are; the better. — RightmireM, May 29 '13 at 15:54
I.e. first the cracker has to get through the door (onto the server), then try to crack the encrypted files (and discover no useful data there), then try to crack the memory (hopefully fails), then wait for a user to log in so s/he can sniff the datastream. Just like a real house, security lies not in building Fort Knox...but in making it so difficult, frustrating, and time-consuming to get to the jewels...that by the time s/he has, they have given up or gotten caught :-) Does that make sense? — RightmireM, May 29 '13 at 15:55
@BurningKrome Well, if MySQL runs on the same server as the webserver, then it's always going to be vulnerable if the user can crack root, regardless of the password, since they can just access the DB data directly via `/var/lib/mysql` or wherever you keep it. Given that an unknown 'salt' (which effectively becomes the password) is required to access MySQL, a SQL injection attack is unlikely, although I wonder how you plan to authenticate users without accessing a MySQL DB? Assuming that's the case, and given everything other than port 80 is firewalled, the most likely point of entry... — Aya, May 29 '13 at 16:05
@BurningKrome ...would be either another form of injection attack (like an unchecked call to `os.system()`), or a buffer overflow attack, with the end goal of being able to execute arbitrary code as the webserver UID. What they do from there would depend on how the system authenticates users, and what they're actually after. — Aya, May 29 '13 at 16:11
