41

I'm in a discussion at work over how to secure sensitive information (e.g. passwords) stored in a Java program. Per security requirements, memory containing sensitive information is cleared, e.g. by setting the values of the bytes to all zeroes. The concern is that an attacker can observe the memory associated with the application process, and so we want to limit as much as possible the window of time such sensitive information hangs around. Previously, projects involved C++, so a memset() sufficed.

(Incidentally, the use of memset() has been called into question because some compilers are known to optimize its use out of the resulting binary, on the assumption that, since the memory is not used later, there is no need to zero it in the first place. This blurb is a disclaimer for those who Google for "memset" and "clear memory", etc.)

Now we have on our hands a Java project being pressed against this requirement.

For Java objects, my understanding is that:

  • a nulled reference only changes the value of the reference; the memory on the heap for the object still contains data
  • an immutable object like String would not be able to have its data modified (or at least not easily, within the confines of a VM with an appropriately enabled security manager)
  • the generational garbage collectors may make copies of objects all over the place (as noted here)

And for primitives, my understanding is that:

  • a primitive-type variable in a local method would get allocated on the stack, and:
  • when you change its value, you modify it directly in memory (as opposed to using a reference to handle an object on the heap).
  • copies can/would be made "behind the scenes" in some situations, such as passing it as an argument into methods or boxing (auto- or not) creating instances of the wrappers which contain another primitive variable holding the same value.

My coworker claims that Java primitives are immutable and that there is documentation from both the NSA and Oracle regarding the lack of support in Java for this requirement.

My position is that primitives can (at least in some situations) be zeroed by setting the value to zero (or boolean to false), and the memory is cleared that way.

I'm trying to verify if there's language in the JLS or other "official" documentation about the required behavior of JVMs when it comes to memory management with respect to primitives. The closest I could find was a "Secure Coding Guidelines for the Java Programming Language" on Oracle's site which mentions clearing char arrays after use.
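For reference, the char-array clearing that the guideline describes could look roughly like the sketch below. The class and method names (PasswordWipeSketch, authenticate, loginAndWipe) are made up for illustration, and the wipe only covers the one array instance it is handed; it says nothing about copies the GC or other code may have made.

```java
import java.util.Arrays;

final class PasswordWipeSketch {

    /** Hypothetical stand-in for whatever actually consumes the password. */
    static boolean authenticate(char[] password) {
        return password.length > 0;
    }

    /** Uses the password, then overwrites the caller's array in place. */
    static boolean loginAndWipe(char[] password) {
        try {
            return authenticate(password);
        } finally {
            // Overwrites this array only; copies made earlier by the GC or by
            // other code are not affected.
            Arrays.fill(password, '\0');
        }
    }

    public static void main(String[] args) {
        char[] password = {'s', '3', 'c', 'r', 'e', 't'};
        boolean ok = loginAndWipe(password);
        System.out.println("authenticated: " + ok);
        System.out.println("first char code after wipe: " + (int) password[0]);
    }
}
```

The try/finally is there so the wipe also runs when the consumer throws.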

I'd quibble over definitions when my coworker called primitives immutable, but I'm pretty sure he meant "memory cannot be appropriately zeroed" - let's not worry about that. We did not discuss whether he meant final variables - from context we were talking in general.

Are there any definitive answers or references on this? I'd appreciate anything that could show me where I'm wrong or confirm that I'm right.

Edit: After further discussion, I've been able to clarify that my coworker was thinking of the primitive wrappers, not the primitives themselves. So we are left with the original problem of how to clear memory securely, preferably of objects. Also, to clarify, the sensitive information is not just passwords, but also things like IP addresses or encryption keys.

Are there any commercial JVMs which offer a feature like priority handling of certain objects? (I imagine this would actually violate the Java spec, but I thought I'd ask just in case I'm wrong.)

weiji
  • 1
    Your coworker is wrong, variables of primitive type are absolutely not immutable in Java. – Jesper Jun 24 '11 at 20:28
  • 5
    To correct what your co-worker said, primitive wrappers are immutable. Integer i = 2; creates a new Integer every time, but int i = 2; and then i = 3; just updates the value on the stack. – Amir Raminfar Jun 24 '11 at 20:28
  • AFAIK you cannot accomplish this goal in Java, even for primitives. The primitives will be on the stack, there's no guarantee that their region of the stack gets overwritten. – Spike Gronim Jun 24 '11 at 20:31
  • 3
    @Amir - this borders on extreme pedantry, but the language spec dictates that autoboxing of ints and bytes with values -128 through 127 always returns the same object http://java.sun.com/docs/books/jls/third_edition/html/conversions.html#5.1.7 . So Integer i = 2 never creates a new Integer. – whaley Jun 24 '11 at 20:37
  • Thanks r.e. the "primitives are not immutable" comments - I agree completely. Let's not focus on that part - my coworker is not well-versed in Java and we're mostly concerned about the memory part. – weiji Jun 24 '11 at 21:06
  • 1
    @weiji, the mentioned guide is written by some incompetent dude; clearing the byte[]/char[] doesn't mean anything, both because the compiler can decide it is a no-op and because the GC may already have copied the area. – bestsss Jun 24 '11 at 21:09

6 Answers

4

Edit: Actually I just had three ideas that may indeed work - for different values of "work" at least.

The first that is more or less documented would be ByteBuffer.allocateDirect! As I understand it allocateDirect allocates the buffer outside the usual java heap so won't be copied around. I can't find any hard guarantees about it not getting copied in all situations though - but for the current Hotspot VM that is actually the case (ie it's allocated in an extra heap) and I assume this will stay that way.
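A minimal sketch of the direct-buffer idea, assuming the secret arrives as a byte[] (which itself lives on the normal heap and has to be wiped separately). DirectSecretSketch, store and wipe are illustrative names, not an established API:

```java
import java.nio.ByteBuffer;

final class DirectSecretSketch {

    /** Copies the secret into a freshly allocated direct (off-heap) buffer. */
    static ByteBuffer store(byte[] secret) {
        ByteBuffer buf = ByteBuffer.allocateDirect(secret.length);
        buf.put(secret);
        buf.flip(); // position = 0, limit = secret.length, ready for reading
        return buf;
    }

    /** Overwrites the whole capacity of the buffer with zeroes. */
    static void wipe(ByteBuffer buf) {
        buf.clear(); // resets position and limit only; it does not erase content
        while (buf.hasRemaining()) {
            buf.put((byte) 0);
        }
        buf.clear();
    }

    public static void main(String[] args) {
        byte[] secret = {1, 2, 3, 4};
        ByteBuffer buf = store(secret);
        java.util.Arrays.fill(secret, (byte) 0); // the heap copy needs wiping too
        System.out.println("direct: " + buf.isDirect());
        wipe(buf);
        System.out.println("first byte after wipe: " + buf.get(0));
    }
}
```

Note that Buffer.clear() only resets the indices, which is why the wipe writes the zeroes explicitly.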

The second one is using the sun.misc.Unsafe class - which as the name says has some rather obvious problems, but at least that would be pretty much independent of the VM used - either it's supported (and it works) or it's not (and you get linking errors). The problem is that the code to use that stuff will get horribly complicated pretty fast (just obtaining an Unsafe instance is non-trivial).

The third one would be to allocate a much, much, MUCH larger size than is actually needed, so that the object gets allocated in the old generation heap to begin:

-XX:PretenureSizeThreshold= that can be set to limit the size of allocations in the young generation. Any allocation larger than this will not be attempted in the young generation and so will be allocated out of the old generation.

Well the drawback of THAT solution is obvious I think (default size seems to be about 64kb).

. .

Anyway, here is the old answer:

Yep as I see it you pretty much cannot guarantee that the data stored on the heap is 100% removed without leaving a copy (that's even true if you don't want a general solution but one that'll work with say the current Hotspot VM and its default garbage collectors).

As said in your linked post (here), the garbage collector pretty much makes this impossible to guarantee. Actually contrary to what the post says the problem here isn't the generational GC, but the fact that the Hotspot VM (and now we're implementation specific) is using some kind of Stop & Copy gc for its young generation per default.

This means that as soon as a garbage collection happens between storing the password in the char array and zeroing it out you'll get a copy of the data that will be overwritten only as soon as the next GC happens. Note that tenuring an object will have exactly the same effect, but instead of copying it to to-space it's copied to the old generation heap - we end up with a copy of the data in from space that isn't overwritten.

To avoid this problem we'd pretty much need some way to guarantee that either NO garbage collection is happening between storing the password and zeroing it OR that the char array is stored from the get-go in the old generation heap. Also note that this relies on the internals of the Hotspot VM, which may very well change (actually there are different garbage collectors where many more copies can be generated; iirc the Hotspot VM supports a concurrent GC using a train algorithm). "Luckily" it's impossible to guarantee either one of those (afaik every method call/return introduces a safe point!), so you don't even get tempted to try (especially considering that I don't see any way to make sure the JIT doesn't optimize the zeroing out away) ;)

Seems like the only way to guarantee that the data is stored only in one location is to use the JNI for it.

PS: Note that while the above is only true for the Heap, you can't guarantee anything more for the stack (the JIT will likely optimize writes without reads to the stack away, so when you return from the function the data will still be on the stack)
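On the worry that the JIT might drop a write that is never read: one workaround is to overwrite with a value that is only known at run time and then read the array back, so the stores have an observable result. This is only a sketch of that idea; nothing in the JLS promises it defeats every optimizer, and ObservableWipeSketch and overwrite are illustrative names.

```java
final class ObservableWipeSketch {

    /**
     * Overwrites every element with a value only known at run time, then reads
     * the array back and returns the sum so the stores have a visible effect.
     */
    static int overwrite(char[] data) {
        char filler = (char) (System.currentTimeMillis() & 0xF); // 0..15
        for (int i = 0; i < data.length; i++) {
            data[i] = filler;
        }
        int sum = 0;
        for (char c : data) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        char[] password = {'s', '3', 'c', 'r', 'e', 't'};
        int checksum = overwrite(password);
        System.out.println("checksum of overwritten array: " + checksum);
    }
}
```

The array ends up holding the filler value rather than zeroes; the point is only that the original content is gone and that the result depends on the writes.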

Voo
  • @Voo, after inlining the number of safe points can go down. And you can just use a DirectByteBuffer (I've proposed that once), but again, you can't clean any socket buffers. – bestsss Jun 24 '11 at 21:38
  • about zeroing you can sum the array after inserting not zero but some stuff like System.currentTimeMillis()&0xf (for instance), stupid way but it will trick the compiler for sure. – bestsss Jun 24 '11 at 21:45
  • @bestsss Yeah I had the same idea about DirectByteBuffers. Since you posted your comment way before I had finished my edit, I'd love to link to your answer so that it doesn't look like I "stole" your idea ;-) PS: Good idea about getting rid of the clearing problem. – Voo Jun 24 '11 at 21:51
  • sry for the multiple edits: DirectBuffer.address() is guaranteed not to change, so the address can be retained by JNI code. It's ok, I don't claim any originality (it was probably a comment; I rarely write complete answers, just like now) – bestsss Jun 24 '11 at 21:52
  • @bestsss Yeah I assumed it's guaranteed (Hotspot has an extra heap for it) but I can't find anything about it in the API. So is it also guaranteed in the absence of any actual JNI code? – Voo Jun 24 '11 at 21:55
  • @Voo, wow what you say is actually true, I know how exactly direct buffers are implemented, I know address() is public but part of sun.misc... and yet I failed to realize it's just not truly public and only the publicly documented code to work on the address is JNI. Hmm, that'd be weird to have moveable address(). Need to check more (I tend not to read docs at all) – bestsss Jun 24 '11 at 22:11
2

Weird, never thought of anything like this.

My first idea would be to make a char[100] to store your password in. Put that in there, use it for whatever, and then do a loop to set every char to blank.

The problem is, the password would at some point turn into a String inside of the database driver, which could live in memory for 0 to infinity seconds.

My second idea would be to have all authentication done through some kind of JNI call to C, but that would be really hard if you are trying to use something like JDBC....

bwawok
  • This question has popped up multiple times. I did try to answer it; it's fruitless and pointless to attempt. Tracing everything back to the socket buffer (over which you have no control) and then forward to any intermediate storage, and then making sure the zeroing will happen (the compiler is free to optimize it away) and that the GC will not copy the memory... is well-nigh impossible. I still wonder what mastermind wrote the article at Sun. – bestsss Jun 24 '11 at 21:14
  • The database driver shouldn't be storing parameters as strings; it doesn't scale well. It _might_ do if it's inserting the parameter into a string for transfer, but who in their right mind would do that? The advantage of parameterized calls (i.e., prepared statements) is that you _don't_ do that! Everything else about them – speed, resistance to SQL injection, etc. – follows from that. – Donal Fellows Jun 25 '11 at 20:22
2

Tell your co-workers that this is a hopeless cause. What about the kernel socket buffers, just for a start.

If you cannot prevent unwanted programs from spying on memory on your machine, the passwords are compromised. Period.

bmargulies
  • 10
    The concern here is risk mitigation - the intent of the requirement is to minimize the window of time that this information is present. Let's not jump to zealous yes-or-no responses. – weiji Jun 27 '11 at 18:16
  • 2
    It's a bad use of effort to try to reduce risk in this particular area. – bmargulies Jun 28 '11 at 00:02
  • See cold boot attack. Someone may have entered a password months ago, then the attacker gets their hands on the running system and reads out the RAM. Better if secrets are gone. I also got passwords out of a web server with Heartbleed back in the day, which would not have happened if they had been wiped after validation. There's still a timing window, but it requires getting *very* lucky because the sole admin wasn't logging in constantly, and exploitation was possible for a few hours to days. As it was, one-time exploitation sufficed to get the admin panel password. – Luc Jun 29 '23 at 08:43
1

Had this topic recently.

From the requirement that all sensitive data have a controlled lifecycle, the design measure follows naturally: every object shall be a Destroyable.

From this you see immediately, that Java is not designed for security.

Sol.1: What you can do is define explicit destructors, as you suggested, which overwrite the used memory, as here for byte arrays:

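public static void destroy(byte[] a) // requires java.util.Arrays
{
    if (a != null)
    {
        Arrays.fill(a, (byte) 0);
    }
}
public static void destroy(char[] a) // overload for char[] data, as used in Sol.2
{
    if (a != null)
    {
        Arrays.fill(a, '\0');
    }
}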

Sol.2: With this you soon run into the limits of access to private class members. There is still a stopgap solution using reflection, as here for JPasswordField:

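// Requires: java.lang.reflect.Field, javax.security.auth.DestroyFailedException,
// javax.swing.text.Segment
public static void destroy(javax.swing.JPasswordField pwfd) throws DestroyFailedException
{
    try
    {
        if (pwfd == null)
        {
            return;
        }
        javax.swing.text.Document doc = pwfd.getDocument();
        Segment txt = new Segment();
        doc.getText(0, doc.getLength(), txt); // use the non-String API
        // Segment.array happens to be a public field; reflection is shown here
        // as the general technique for reaching non-public members.
        Field fd_array = Segment.class.getDeclaredField("array");
        fd_array.setAccessible(true);
        char[] sga = (char[]) fd_array.get(txt);
        fd_array.setAccessible(false);
        // Depending on the Document implementation, the Segment may expose the
        // internal buffer or only a copy of it.
        destroy(sga); // destroy(char[]) overload, analogous to destroy(byte[])
    }
    catch (Throwable th)
    {
        th.printStackTrace();
        throw new DestroyFailedException(th.toString());
    }
}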

This way you can clean up any object with sensitive data. However, this solution is a stopgap, as newer Java runtimes warn about unauthorized reflection operations and future runtimes will even forbid them; so an insecure Java design will be protected in the name of security ;)

Sol.3: The long-term solution will be to replicate parts of the Java runtime with Destroyable-s, or to implement the sensitive parts entirely in native modules (JNI), where you have additional platform-specific options for memory protection.
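As a rough illustration of what such a Destroyable could look like, here is a small holder built on the standard javax.security.auth.Destroyable interface. The class name SecretBytes and its accessors are hypothetical; it only controls its own internal array and gives no guarantee about copies the JVM makes.

```java
import java.util.Arrays;
import javax.security.auth.Destroyable;

final class SecretBytes implements Destroyable, AutoCloseable {

    private final byte[] value;
    private boolean destroyed;

    /** Takes its own copy; the caller remains responsible for wiping the source. */
    SecretBytes(byte[] source) {
        this.value = source.clone();
    }

    int length() {
        checkAlive();
        return value.length;
    }

    /** Element access without handing out further copies of the array. */
    byte get(int index) {
        checkAlive();
        return value[index];
    }

    @Override
    public void destroy() {
        Arrays.fill(value, (byte) 0);
        destroyed = true;
    }

    @Override
    public boolean isDestroyed() {
        return destroyed;
    }

    /** Lets the holder be used in try-with-resources. */
    @Override
    public void close() {
        destroy();
    }

    private void checkAlive() {
        if (destroyed) {
            throw new IllegalStateException("secret has been destroyed");
        }
    }
}
```

Used as try (SecretBytes key = new SecretBytes(raw)) { ... }, the content is overwritten as soon as the block is left, including on exceptions.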

So, in a current project I replaced BigInteger operations with native GMP, where you can override the memory management and assert that every single byte is cleaned up. The next step would be authentication policies for each function ...

Sam Ginrich
0

Just an aside: in some places the Java core security libraries use char[] so that it can be zeroed. I imagine you don't get a guarantee, though.
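One example from the JDK is javax.crypto.spec.PBEKeySpec, which takes the password as a char[], keeps its own copy, and offers clearPassword() to overwrite that copy. The sketch below only shows the clearing part; KeySpecWipeSketch and clearedAfterUse are made-up names.

```java
import java.util.Arrays;
import javax.crypto.spec.PBEKeySpec;

final class KeySpecWipeSketch {

    /** Returns true if the spec refuses to hand out the password after clearing. */
    static boolean clearedAfterUse(char[] password) {
        PBEKeySpec spec = new PBEKeySpec(password); // the spec keeps its own copy
        Arrays.fill(password, '\0');                // wipe the caller's array
        try {
            // ... the spec would be passed to a SecretKeyFactory here ...
        } finally {
            spec.clearPassword();                   // wipe the spec's internal copy
        }
        try {
            spec.getPassword();
            return false;
        } catch (IllegalStateException expected) {
            // getPassword() is documented to throw once the password is cleared
            return true;
        }
    }

    public static void main(String[] args) {
        char[] password = {'s', '3', 'c', 'r', 'e', 't'};
        System.out.println("cleared: " + clearedAfterUse(password));
    }
}
```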

teknopaul
-1

I have been trying to work out some similar issues with credentials.

So far, my only answer is "don't use strings at all for secrets". Strings are convenient to use and store in human terms, but computers work just as well with byte arrays. Even the encryption primitives work with byte[].

When you don't need the password anymore, just fill the array with zeroes and don't let the GC invent new ways to reuse your secrets.
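If the secret starts out as a char[] (for instance from a password prompt), it can be turned into bytes without ever building a String. A sketch under that assumption, with illustrative names; the intermediate buffer created by the encoder is wiped as well, where its backing array is accessible:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SecretEncodingSketch {

    /** Encodes chars as UTF-8 bytes without creating an intermediate String. */
    static byte[] toUtf8(char[] chars) {
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(CharBuffer.wrap(chars));
        byte[] out = new byte[encoded.remaining()];
        encoded.get(out);
        if (encoded.hasArray()) {
            // Wipe the encoder's temporary buffer, which also held the secret.
            Arrays.fill(encoded.array(), (byte) 0);
        }
        return out;
    }

    public static void main(String[] args) {
        char[] password = {'s', '3', 'c', 'r', 'e', 't'};
        byte[] bytes = toUtf8(password);
        System.out.println("encoded length: " + bytes.length);
        // Once both arrays are no longer needed, overwrite them.
        Arrays.fill(password, '\0');
        Arrays.fill(bytes, (byte) 0);
    }
}
```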

In another thread (Why can't strings be mutable in Java and .NET?) they make an assumption that I find very short-sighted: that strings are immutable for security reasons. What was not considered is that operational problems are not the only ones that exist, and that security sometimes needs some flexibility and/or support to be effective, support that does not exist natively in Java.

To add to that: how could we read a password without using strings? Well ... be creative, and don't use things like the Android EditText with input type password; that just is not secure enough and forces you to go through strings.

user2583872