11
String secret="foo";
WhatILookFor.securelyWipe(secret);

And I need to know that it will not be removed by the Java optimizer.

Nerd

4 Answers

9

A String cannot be "wiped". It is immutable, and short of some really dirty and dangerous tricks (using reflection, for example) you cannot alter that.

So the safest solution is to not put the data into a String in the first place. Use a StringBuilder or an array of characters instead, or some other representation that is not immutable. (And then clear it when you are done. And don't use toString() etc on the StringBuilder or you have created the String you were trying to avoid.)
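A minimal sketch of the character-array approach, assuming the secret never passes through a String. The class name `CharArraySecret` and the helper `wipe` are hypothetical, and the sketch makes no claim about what the JIT compiler does with the writes:

```java
import java.util.Arrays;

public class CharArraySecret {
    // Hypothetical helper: overwrites every element of the array in place.
    public static void wipe(char[] secret) {
        Arrays.fill(secret, '\0');
    }

    public static void main(String[] args) {
        // Keep the secret in a mutable char[] rather than a String.
        char[] secret = {'f', 'o', 'o'};
        try {
            // ... use the secret here ...
        } finally {
            // Clear the contents even if an exception is thrown.
            wipe(secret);
        }
    }
}
```

The `finally` block is there so that the array is cleared on every exit path, not only on the successful one.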

Of course, the "safest" solution is rather impractical. Many Java SE and 3rd party APIs require text data to be provided as String objects, or return it in that form.


For the record, there are a couple of ways that you can change the contents of a String's backing array. For example, you can use reflection to fish out a reference to the String object's backing array, and overwrite its contents. However, this involves doing things that the JLS states have unspecified behavior so you cannot guarantee that the optimizer won't do something unexpected. And the code to do this will be non-portable, since it depends on internal details of String that have changed over time and may change again.


My personal take on this is that you are better off locking down your application platform so that unauthorized people can't gain access to the memory / memory dump in the first place. After all, if the platform is not properly secured, the "bad guys" may be able to get hold of the string contents before you erase it. Steps like this might be warranted for small amounts of security critical state, but if you've got a lot of "confidential" information to process, it is going to be a major hassle to not be able to use normal strings and string handling.

Stephen C
  • 1
    The JVM will move objects in memory as part of garbage collection, and it won't make any attempt to zero-out the old memory afterward. This means that your "confidential" data will still be readable. – kdgregory Aug 17 '12 at 17:02
  • 1
    @kdgregory - 1) The GC >>does<< zero the old memory. 2) You are missing my point ... secure the platform so that the bad guys can't read the memory whether or not it has been zeroed. Because if they can read it after it has been zeroed, they can probably also read it before it has been zeroed. – Stephen C Aug 18 '12 at 04:07
  • 1
    Re #1: Really? I assume that you can back up this statement with either explicit documentation, or links to the OpenJDK source that show an explicit clear as part of heap copy/compaction? And have equivalent evidence for other JVMs such as JRockit? Personally, I wouldn't expect the JVM to waste the time, but if you've actually worked with the source, I'll believe you. – kdgregory Aug 18 '12 at 11:04
  • 1
    Re #2: yes, I missed your point. You have three paragraphs that indicate that the OP can indeed clear the string's memory, and one closing paragraph that is "My personal take." To be honest, I stopped after the second paragraph. – kdgregory Aug 18 '12 at 11:07
  • Re No 1. OK. What can I do in case of byte[] instead of string. If I fill it with zeroes, how can I be sure Java compiler will not 'optimize' it out? Re No 3. Of course wiping the confidential data is not a universal protection against bad guys but a step worth doing. But here shall we concentrate only on wiping. – Nerd Aug 21 '12 at 16:05
  • @Nerd - in theory (from the JLS perspective) there is no way to guarantee that some other unrelated thread won't see the old contents of the `byte[]`. In practice, *I suspect* that writing to an unrelated volatile field after zeroing would be sufficient to trigger a flush of any delayed writes. But whether this would be sufficient to prevent the compiler from optimizing them away is a different matter. In theory, it isn't. The only way to be sure is to capture the JIT compiled native code and analyse it. Good luck with that! – Stephen C Oct 09 '12 at 09:41
  • 1
    @kdgregory - it is not wasting its time. The semantics of Java require that all heap nodes are default initialized to zero. Given that, it is MORE efficient for the GC to do a large-scale zeroing than it is for the allocator to do a small-scale initialization of nodes as they are allocated. And my understanding is that that is how it works. – Stephen C Oct 09 '12 at 09:44
  • 1
    @StephenC When the GC copies an object to a new location, it doesn't zero the old memory - at least not until that part of the memory is used again for a new allocation. So even if you zero it out, a copy may have been created by the GC before you did that, and for a certain length of time, the old bytes will be readable. The semantics of Java only require zeroing when an object is allocated, which is not the case when the GC copies the object (I know it's an old post - someone linked to it) – Erwin Bolwidt May 26 '15 at 06:16
  • 4
    @ErwinBolwidt -- Well. Recently, I downloaded the Java 8 source code, and I took a look. I was wrong. It appears that (in Java 8 at least) zeroing is deferred until the point at which the JVM attempts to allocate a heap object. – Stephen C May 26 '15 at 10:56
7

You would need direct access to the memory.

You really wouldn't be able to do this with String, since you don't have reliable access to the string, and don't know if it's been interned somewhere, or if an object was created that you don't know about.

If you really needed to do this, you'd have to do something like

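public class SecureString implements CharSequence {
    private final char[] data;
    public SecureString(char[] data) { this.data = data.clone(); }
    public int length() { return data.length; }
    public char charAt(int i) { return data[i]; }
    public CharSequence subSequence(int s, int e) { return new SecureString(java.util.Arrays.copyOfRange(data, s, e)); }
    // Overwrite the backing array in place; '.' is an arbitrary filler character.
    public void wipe() {
       for(int i = 0; i < data.length; i++) data[i] = '.';
    }
}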

That being said, if you're worried about data still being in memory, you have to realize that if it was ever in memory at one point, then an attacker probably already got it. The only thing you realistically protect yourself from is a core dump being flushed to a log file.

Regarding the optimizer, I very much doubt it will optimize away the operation. If you really needed to be sure, you could do something like this:

public int wipe() {
    // wipe the array to a random value ('rand' is assumed to be a java.util.Random field)
    java.util.Arrays.fill(data, (char) rand.nextInt(60000));
    // compute hash to force optimizer to do the wipe
    int hash = 0;
    for(int i = 0; i < data.length; i++) {
        hash = hash * 31 + (int)data[i];
    }
    return hash;
}

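This gives the wipe an observable result, so the optimizer has a reason to perform it, as long as the caller actually uses the returned hash. It makes the method take roughly twice as long to run, but it's a pretty fast operation as it is, and it doesn't increase the order of complexity.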

corsiKa
  • Thank you. But how could it be guaranteed that the optimizer will not notice that the result of this computation is not being used, so there is no point running it? – Nerd Aug 17 '12 at 15:35
  • I updated the post with a way to ensure the wipe. But I still wouldn't recommend relying on this - rather your best bet is to keep people out of your system memory to start with. – corsiKa Aug 17 '12 at 17:10
4

Store the data off-heap using the "Unsafe" methods. You can then zero over it when done and be certain that it won't be pushed around the heap by the JVM.

Here is a good post on Unsafe:

http://highlyscalable.wordpress.com/2012/02/02/direct-memory-access-in-java/
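As a rough illustration of the off-heap idea, here is a sketch that uses a direct `ByteBuffer` instead of `Unsafe`. This is a swapped-in technique, not the one from the linked post, and the class name `OffHeapSecret` is hypothetical. The contents of a direct buffer typically reside outside the garbage-collected heap, so the collector has no reason to copy them around:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class OffHeapSecret {
    private final ByteBuffer buffer;

    // Copies the secret into a direct (off-heap) buffer.
    public OffHeapSecret(byte[] secret) {
        buffer = ByteBuffer.allocateDirect(secret.length);
        buffer.put(secret);
        buffer.flip();
    }

    // Reads a single byte without copying the whole secret back onto the heap.
    public byte get(int index) {
        return buffer.get(index);
    }

    public int length() {
        return buffer.limit();
    }

    // Overwrites the off-heap contents with zeros using absolute puts.
    public void wipe() {
        for (int i = 0; i < buffer.capacity(); i++) {
            buffer.put(i, (byte) 0);
        }
    }

    public static void main(String[] args) {
        // Demo only: a real secret should not start life as a String literal.
        byte[] input = "foo".getBytes(StandardCharsets.US_ASCII);
        OffHeapSecret secret = new OffHeapSecret(input);
        Arrays.fill(input, (byte) 0); // the on-heap copy has to be cleared as well
        // ... use secret.get(i) here ...
        secret.wipe();
    }
}
```

Any on-heap array used to load the buffer still needs its own wipe, and this sketch does nothing about the operating system swapping the pages to disk.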

Dan
  • 2
    +1 for the point that the object has to be off-heap. However, if we assume that the attacker can access process memory, then off-heap storage just minimizes the window during which an attack can occur. And there's still the possibility that the OS will swap your pages to disk, leaving the "confidential" data hanging around for a very long time. – kdgregory Aug 17 '12 at 17:01
0

If you're going to use a String then I think you are worried about it appearing in a memory dump. I suggest using String.replace() on key-characters so that when the String is used at run-time it will change and then go out of scope after it is used and won't appear correctly in a memory dump. However, I strongly recommend that you not use a String for sensitive data.

Mitch Connor
  • 5
    Actually, if you read the Javadoc you'll notice the original memory space will not be overwritten. Instead, a new memory space with the modified string will be returned: http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#replace(char,%20char) – asieira Apr 29 '14 at 13:41
  • Well, I am referring to using the String.replace() on a local scope string so it won't appear in memory as the original String. – Mitch Connor Apr 30 '14 at 23:48
  • Doesn't change things a bit. Remember that in Java what will be stored on the local scope variable is a *reference* to the String object. Also, Strings are immutable. So no operations on class String will ever update an existing string memory space. – asieira May 01 '14 at 17:39
  • Exactly right sir, which is why I am saying form a new string from the old local scope String reference using String.replace(). – Mitch Connor May 03 '14 at 13:56
  • 1
    Agreed, re-reading your text that is clearly what you meant. In particular, the key part of your response is the recommendation to not use String for sensitive data. And I'll add that this is a severe limitation in using Java for security software. – asieira May 06 '14 at 00:36
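
The point made in the comments above, that String.replace() builds a new String and leaves the original untouched, can be checked with a short sketch. The class name `ReplaceDemo` and the helper `mask` are hypothetical:

```java
public class ReplaceDemo {
    // Hypothetical helper: returns a masked copy; the argument itself is not modified.
    public static String mask(String secret) {
        return secret.replace('o', '*');
    }

    public static void main(String[] args) {
        String secret = "foo";
        String masked = mask(secret);
        System.out.println(secret);           // prints "foo": the original is unchanged
        System.out.println(masked);           // prints "f**"
        System.out.println(secret == masked); // prints "false": two distinct objects
    }
}
```

Both strings stay in memory until they are garbage collected, so the original contents remain available to anything that can read the heap.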