53

How do you store a password entered by the user in memory and erase it securely once it is no longer needed?

To elaborate, currently we have the following code:

import getpass
import imaplib

username = raw_input('User name: ')
password = getpass.getpass()
mail = imaplib.IMAP4(MAIL_HOST)
mail.login(username, password)

After calling the login method, what do we need to do to fill the area of memory that contains password with garbled characters so that someone cannot recover the password by doing a core dump?

There is a similar question, but it concerns Java and its solution uses character arrays: How does one store password hashes securely in memory, when creating accounts?

Can this be done in Python?

maxyfc
  • 3
    Near the bottom of this [IBM article](http://www.ibm.com/developerworks/library/s-data.html?n-s-311), they talk about using a mutable data structure instead of an immutable string. – Emile Cormier Apr 07 '15 at 05:33
  • 2
    The link to the IBM article in the comment above doesn't work anymore, use an [archived page](https://web.archive.org/web/20150308000622/http://www.ibm.com/developerworks/library/s-data.html). – Yaroslav Nikitenko Jan 05 '17 at 17:33
  • I was trying to achieve something similar and came across this : https://www.sjoerdlangkemper.nl/2016/06/09/clearing-memory-in-python/ – mr_pool_404 Aug 10 '19 at 07:43

7 Answers

56

Python doesn't have that low of a level of control over memory. Accept it, and move on. The best you can do is to del password after calling mail.login so that no references to the password string object remain. Any solution that purports to be able to do more than that is only giving you a false sense of security.

Python string objects are immutable; there's no direct way to change the contents of a string after it is created. Even if you were able to somehow overwrite the contents of the string referred to by password (which is technically possible with stupid ctypes tricks), there would still be other copies of the password that have been created in various string operations:

  • by the getpass module when it strips the trailing newline off of the inputted password
  • by the imaplib module when it quotes the password and then creates the complete IMAP command before passing it off to the socket

You would somehow have to get references to all of those strings and overwrite their memory as well.
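The point about copies and references can be illustrated with a minimal sketch. The helper `strip_newline` and the sample value are hypothetical stand-ins, not the actual getpass or imaplib code; the sketch only shows that trimming produces a separate string object and that `del` drops a name without touching the characters in memory.

```python
def strip_newline(raw):
    """Hypothetical stand-in for a library trimming its input."""
    return raw.rstrip("\n")

# Build the string at run time so it is a fresh object, not a shared constant.
raw_password = "".join(["hunter2", "\n"])
password = strip_newline(raw_password)

# The stripped value is a separate object holding its own copy of the characters.
is_separate_copy = password is not raw_password

# A second name bound to the same object keeps it alive after `del`.
alias = password
del password

# `del` removed only the name; the characters were never overwritten.
still_readable = alias
```

Even when no other reference exists, `del` only lets the interpreter free the object; nothing guarantees that the freed memory is cleared.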

Miles
  • 16
    Not to mention the possibility that the OS will swap your whole memory page out to disk, where it could sit for months. – JasonSmith Apr 08 '09 at 03:22
  • 3
    The swap issue is not python specific ofc, but here is a discussion about that part: https://security.stackexchange.com/questions/29350/swap-file-may-contain-sensitive-data – Zitrax Jan 19 '18 at 10:13
  • If they can read the page file, your problems are way bigger than a password in the heap. – doug65536 Oct 22 '22 at 11:41
22

There actually -is- a way to securely erase strings in Python: use the memset C function, as described in the question "Mark data as sensitive in python".

Edited to add, long after the post was made: here's a deeper dive into string interning. There are some circumstances (primarily involving non-constant strings) where interning does not happen, making cleanup of the string value slightly more explicit, based on CPython reference counting GC. (Though still not a "scrubbing" / "sanitizing" cleanup.)
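The linked post applies memset to the string object's own memory, which depends on CPython internals. As a less brittle illustration of the same call, here is a sketch that applies `ctypes.memset` to a buffer owned by ctypes instead of to a Python string. This is a swapped-in variation, not the code from the linked post, and the literal used to fill the buffer is only there to make the sketch runnable; a literal in source code is itself a copy of the secret.

```python
import ctypes

# create_string_buffer allocates a mutable C char array owned by ctypes.
# The literal is a placeholder; real code would need to get the secret
# into the buffer without first building a Python str or bytes object.
buf = ctypes.create_string_buffer(b"hunter2")

# sizeof counts every byte in the array, including the trailing NUL.
size = ctypes.sizeof(buf)

# Overwrite the whole buffer in place with zero bytes.
ctypes.memset(buf, 0, size)

# .raw returns the buffer contents as bytes, now all zeros.
wiped = buf.raw
```

This clears only the one buffer. Any str or bytes copy made along the way, for example by passing `buf.value` to a library, is unaffected.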

amcgregor
  • 2
    Note that this is OS-dependent. Windows and Linux code is given in the linked post. – Luc Nov 09 '16 at 19:11
  • 1
    It's also highly dependent on internal interpreter details such as: id having the same value as the object pointer, the offset of string data from the object pointer, etc. Incredibly brittle; do not recommend. – Conrad Meyer Dec 21 '18 at 23:40
  • 1
    @ConradMeyer Of course it is. While this may be abstractly considered "brittle", and certainly no-one is recommending it, it does answer the question of "is this possible" better than the currently accepted answer beginning with "Python doesn't have that low of a level of control over memory. Accept it, and move on." which is absolutely false and unhelpful, as immediately demonstrated by the existence of `ctypes`. This solution is actually even worse than you might be suggesting; you would be modifying hashed data values application-wide and destroying the ability to represent certain strings. – amcgregor Dec 24 '18 at 03:07
  • 1
    I find the argument this answers "is it possible" better than the accepted answer pretty silly. As you mention, it totally breaks the interpreter; and additionally, it doesn't work with any other regular Python string functionality or libraries that make copies or temporary values. And it relies on something with even weaker type safety / warnings / errors than regular C. So you're better off just using C in the first place. I wouldn't characterize that as "possible in Python." I'm also not happy that the first answer is the correct one, but unfortunately, it is. – Conrad Meyer Dec 25 '18 at 04:11
  • @ConradMeyer "Just use C in the first place." No. Honestly, memory scrubbing has never actually come up in my career developing Python web applications and devops systems, despite the fact that Heartbleed was actively wielded against my infrastructure. (Edited to add: all things are possible. Not all things are reasonable. "Can this be done?" is a yup. Next up: "Should this be done?" Probably not unless you have very explicit needs. **That's a different question.**) – amcgregor Oct 17 '22 at 11:55
5

The correct solution is to use a bytearray() ... which is mutable, and you can safely clear keys and sensitive material from RAM.

However, some libraries, notably the Python "cryptography" library, prevent "bytearray" from being used. This is problematic: ideally, these cryptographic libraries would ensure that only mutable types are used for key material.

There is SecureString which is a pip module that allows you to fully remove a key from memory...(I refactored it a bit and called it SecureBytes). I wrote some unit tests that demonstrate that the key is fully removed.

But there is a big caveat: if someone's password is "type", then the word "type" will get wiped from all of python... including in function definitions and object attributes.

In other words... mutating immutable types is a terrible idea, and unless you're extremely careful, can immediately crash any running program.

The right solution is: never use immutable types for key material, passwords, etc. Anyone building a cryptographic library or routine like "getpass" should be working with a "bytearray" instead of python strings.
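A minimal sketch of the bytearray approach, assuming the secret can be kept in a bytearray for its whole lifetime. The helper name `wipe` and the sample value are hypothetical.

```python
def wipe(buf):
    """Overwrite every byte of a bytearray in place with zero."""
    for i in range(len(buf)):
        buf[i] = 0

# Placeholder secret; real code would read it straight into the bytearray.
secret = bytearray(b"correct horse")
length_before = len(secret)

wipe(secret)
```

The overwrite happens in the existing buffer, so no new object is created. Converting the bytearray to str or bytes at any point (for example with `bytes(secret)` or `secret.decode()`) creates an immutable copy that this cannot reach, and growing the bytearray may cause it to be reallocated, leaving the old contents behind.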

Erik Aronesty
  • As a follow up to this I ported the SecureString to work with integers and bytes (called SecureBytes). Both are horribly unsafe unless you are careful to work with cryptographic key material... and not immutable things that could propagate to the rest of python. Tested on win/mac/linux. – Erik Aronesty Jan 08 '19 at 15:46
4

If you don't need the mail object to persist once you are done with it, I think your best bet is to perform the mailing work in a subprocess (see the subprocess module.) That way, when the subprocess dies, so goes your password.
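A rough sketch of the subprocess idea follows. The child script and the helper `run_in_child` are hypothetical; in real use the child itself would call getpass.getpass() and imaplib so the parent never holds the password. Here the child reads a stand-in secret from stdin and reports only its length, so the sketch can run unattended.

```python
import subprocess
import sys

# Code executed by the child interpreter. It stands in for the mail work:
# the secret exists only in the child's address space, which the OS
# reclaims when the child exits.
CHILD_CODE = r"""
import sys
secret = sys.stdin.readline().rstrip("\n")
print(len(secret))
"""

def run_in_child(stdin_text):
    """Run CHILD_CODE in a separate Python process and return its output."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()

output = run_in_child("hunter2\n")
```

The parent should receive only non-sensitive results from the child. Note that process exit returns the memory to the operating system but does not by itself guarantee the pages are overwritten immediately.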

zdan
  • 1
    Not unless actively scrubbed within that subprocess, or extremely luckily reallocated by the system to another process and overwritten rapidly enough, …and even then, in some circumstances through nearby memory cell inference — the value would persist and be reachable through things like spectre, heartbleed, and so forth. – amcgregor Aug 14 '20 at 06:13
0

This could be done using numpy chararray:

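import imaplib
import numpy as np

username = raw_input('User name: ')
mail = imaplib.IMAP4(MAIL_HOST)
# Fixed-size character array; passwords longer than 20 characters will not fit
x = np.chararray((20,))
# Pad the input on the right to exactly 20 characters and copy it in
x[:] = list("{:<20}".format(raw_input('Password: ')))
mail.login(username, x.tobytes().strip())
# Overwrite the array contents in place
x[:] = ''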

You would have to determine the maximum size of password, but this should remove the data when it is overwritten.

heplat
  • 5
    Unfortunately, you've already lost when raw_input() returns. And again when tobytes() is invoked. You've maybe erased one copy, but not either of those other copies. – Conrad Meyer Dec 21 '18 at 23:42
-4

EDIT: removed the bad advice...

You can also use arrays like the Java example if you like, but just overwriting it should be enough.

http://docs.python.org/library/array.html

Trey Stout
  • 3
    All password = "somethingelse" does is remove the reference to the old password one line earlier. It doesn't actually overwrite anything. – Miles Apr 08 '09 at 02:06
-5

Store the password in a list, and if you just set the list to None, the memory of the array stored in the list is automatically freed.

AlbertoPL
  • 11
    The level of indirection of storing the string in a list offers zero protection. – Miles Apr 08 '09 at 02:08
  • 3
    Also, there is no specification to clear the memory after being freed. The memory will remain intact and will be vulnerable to being imaged or swapped to disk over time. – drifter Dec 09 '11 at 21:05
  • 1
    There is a nice article on why this doesn't work properly: http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm – mkind Mar 22 '16 at 13:39