24

In .NET, we have the SecureString class, which is all very well until you come to try and use it, as to (for example) hash the string, you need the plaintext. I've had a go here at writing a function that will hash a SecureString, given a hash function that takes a byte array and outputs a byte array.

private static byte[] HashSecureString(SecureString ss, Func<byte[], byte[]> hash)
{
    // Convert the SecureString to a BSTR
    IntPtr bstr = Marshal.SecureStringToBSTR(ss);

    // BSTR contains the length of the string in bytes in an
    // Int32 stored in the 4 bytes prior to the BSTR pointer
    int length = Marshal.ReadInt32(bstr, -4);

    // Allocate a byte array to copy the string into
    byte[] bytes = new byte[length];

    // Copy the BSTR to the byte array
    Marshal.Copy(bstr, bytes, 0, length);

    // Immediately destroy the BSTR as we don't need it any more
    Marshal.ZeroFreeBSTR(bstr);

    // Hash the byte array
    byte[] hashed = hash(bytes);

    // Destroy the plaintext copy in the byte array
    for (int i = 0; i < length; i++) { bytes[i] = 0; }

    // Return the hash
    return hashed;
}

I believe this will correctly hash the string, and will correctly scrub any copies of the plaintext from memory by the time the function returns, assuming the provided hash function is well behaved and doesn't make any copies of the input that it doesn't scrub itself. Have I missed anything here?

Mark Raymond
  • 906
  • 8
  • 22
  • 1
    Note, that SecureString might be overkill. If an attacker can read your memory, you have 100% lost. – usr Jan 12 '13 at 13:15
  • 1
    @usr SecureString uses Protected Memory, as such only the calling process can decrypt the memory location. This is especially useful if you want to create a minidump upon application crash and send it in to the devs: They get the whole context, stack trace, etc. except for your password – M.Stramm Apr 17 '15 at 09:30
  • @M.Stramm yes, useful for "cold boot" style attacks but not with a running system (which is 99% of the attack surface). An attacker who can read memory can often read keystrokes and data and so on. There are valid use cases. I grant you that. – usr Apr 17 '15 at 12:18
  • @usr There are ways to design against keyloggers (for example have the user click on an on-screen keyboard with randomized layout). `SecureString` is not supposed to make attacks on a running process impossible, only on memory dumps (without a dump of the system memory). Still, even for a running process an attacker would need execute privileges for the process under attack to retrieve the unencrypted string - instead of just read privileges – M.Stramm Apr 17 '15 at 12:35
  • @M.Stramm the attacker can read the chars from the stack as they come in as window messages. Clearly, there *are* ways to design against keyloggers. SecureString has nothing to do with that however. – usr Apr 17 '15 at 13:09
  • @M.Stramm what's also important to realize is that you somehow have to process the password that was entered. Either hash it or send it to some other system. That forces decoding. SecureString therefore only protects against memory dumps which are not an attack scenario that 99.999% of the software cares about. – usr Apr 17 '15 at 13:11

4 Answers4

14

Have I missed anything here?

Yes, you have, a rather fundamental one at that. You cannot scrub the copy of the array left behind when the garbage collector compacts the heap. Marshal.SecureStringToBSTR(ss) is okay because a BSTR is allocated in unmanaged memory so will have a reliable pointer that won't change. In other words, no problem scrubbing that one.

Your byte[] bytes array however contains the copy of the string and is allocated on the GC heap. You make it likely to induce a garbage collection with the hashed[] array. Easily avoided but of course you have little control over other threads in your process allocating memory and inducing a collection. Or for that matter a background GC that was already in progress when your code started running.

The point of SecureString is to never have a cleartext copy of the string in garbage collected memory. Copying it into a managed array violated that guarantee. If you want to make this code secure then you are going to have to write a hash() method that takes the IntPtr and only reads through that pointer.

Beware that if your hash needs to match a hash computed on another machine then you cannot ignore the Encoding that machine would use to turn the string into bytes.

Hans Passant
  • 922,412
  • 146
  • 1,693
  • 2,536
  • Hmm. Wouldn’t it make more sense to implement a stateful hash callback that is initialised and is then fed one byte at a time so that the hash callback itself only needs to deal with managed types? – Konrad Rudolph Jan 12 '13 at 13:04
  • Tricky; the problem is that hashing functions like Rfc2898DeriveBytes.GetBytes, which is what I was planning on using, only take managed types. Would you have any suggestions of an alternative that uses only unmanaged memory? – Mark Raymond Jan 12 '13 at 14:34
  • Regarding encoding - BSTRs are UTF-16, so AFAIK that will be consistent across machines. – Mark Raymond Jan 12 '13 at 14:53
  • With regards to calling `DeriveBytes.GetBytes`, could you not just use the result of Hans/Konrad's hash functions as your parameter to `GetBytes`? I know it is technically double-hashing (which may have its own weaknesses) but at least only the hash is now in a managed `byte[]`... – jimbobmcgee Jan 08 '14 at 22:32
5

There's always the possibility of using the unmanaged CryptoApi or CNG functions. Bear in mind that SecureString was designed with an unmanaged consumer which has full control over memory management in mind.

If you want to stick to C#, you should pin the temporary array to prevent the GC from moving it around before you get a chance to scrub it:

private static byte[] HashSecureString(SecureString input, Func<byte[], byte[]> hash)
{
    var bstr = Marshal.SecureStringToBSTR(input);
    var length = Marshal.ReadInt32(bstr, -4);
    var bytes = new byte[length];

    var bytesPin = GCHandle.Alloc(bytes, GCHandleType.Pinned);
    try {
        Marshal.Copy(bstr, bytes, 0, length);
        Marshal.ZeroFreeBSTR(bstr);

        return hash(bytes);
    } finally {
        for (var i = 0; i < bytes.Length; i++) { 
            bytes[i] = 0; 
        }

        bytesPin.Free();
    }
}
M.Stramm
  • 1,289
  • 15
  • 29
  • 2
    Now I understand your comment, and as far as I can see this solution should indeed work. The only caveat is that the client might be tempted to implement a `hash` function which reuses the `byte[]` argument as the return value (computing an in-place hash). This would obviously fail dramatically. I don’t think this is a very real problem though. – Konrad Rudolph Apr 21 '15 at 13:02
  • 2
    A much more real problem is that the hash function's implementer should not make a copy of that `byte[]` argument. Sadly most of the hashers in the Crypto namespace seem to do exactly that for performance reasons. Oh well, another problem for another question :) – M.Stramm Apr 21 '15 at 14:42
3

As a complement to Hans’ answer here’s a suggestion how to implement the hasher. Hans suggests passing the pointer to the unmanaged string to the hash function but that means that client code (= the hash function) needs to deal with unmanaged memory. That’s not ideal.

On the other hand, you can replace the callback by an instance of the following interface:

interface Hasher {
    void Reinitialize();
    void AddByte(byte b);
    byte[] Result { get; }
}

That way the hasher (although it becomes slightly more complex) can be implemented wholly in managed land without leaking secure information. Your HashSecureString would then look as follows:

private static byte[] HashSecureString(SecureString ss, Hasher hasher) {
    IntPtr bstr = Marshal.SecureStringToBSTR(ss);
    try {
        int length = Marshal.ReadInt32(bstr, -4);

        hasher.Reinitialize();

        for (int i = 0; i < length; i++)
            hasher.AddByte(Marshal.ReadByte(bstr, i));

        return hasher.Result;
    }
    finally {
        Marshal.ZeroFreeBSTR(bstr);
    }
}

Note the finally block to make sure that the unmanaged memory is zeroed, no matter what shenanigans the hasher instance does.

Here’s a simple (and not very useful) Hasher implementation to illustrate the interface:

sealed class SingleByteXor : Hasher {
    private readonly byte[] data = new byte[1];

    public void Reinitialize() {
        data[0] = 0;
    }

    public void AddByte(byte b) {
        data[0] ^= b;
    }

    public byte[] Result {
        get { return data; }
    }
}
Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
  • 1
    Nitpicking here: This potentially leaks the first key byte unencrypted if the GC moves the byte[] just at the right moment. +1 anyway as using SecureString is a mistake in the first place. One cannot expect too much from such a system. – usr Jan 12 '13 at 13:34
  • @usr True, and deeply unsatisfactory as far as I’m concerned. Unfortunately, I don’t see a way around that and it’s a fundamental problem that also befalls Hans’ suggested code, unless the whole hashing computation is done in unmanaged memory. Maybe that’s actually the only safe way. – Konrad Rudolph Jan 12 '13 at 13:38
  • 1
    @usr / @KonradRudolph - Could this be mitigated with `GCHandle`? e.g. `byte[] bytes; GCHandle handle = GCHandle.Alloc(bytes = new byte[1], GCHandleType.Pinned); handle.Free();` – jimbobmcgee Jan 08 '14 at 22:38
  • @jimbobmcgee Yes but that once again means that the client code (the hasher) has to deal with unmanaged memory, which is what I wanted to avoid in the first place. – Konrad Rudolph Jan 09 '14 at 10:39
  • @KonradRudolph - Have `Hasher` be abstract, with the `GCHandle.Alloc` in a protected constructor, `Free` in a Dispose and supply protected access to the pinned `byte[]`, and document the strong recommendation that Hasher implementors only use this array for transient storage? (In fairness, I'm just bouncing around ideas to try and understand this better -- I still haven't worked out *why* it leaks the first byte yet!!) – jimbobmcgee Jan 09 '14 at 17:18
  • @jimbobmcgee Yes, that’d work. As to why the first byte *may* be leaked: if a garbage collection is triggered right after the first call to `AddByte`, the `data` array held by the hasher may be moved around in memory (that’s how the compaction step of the GC works). As a consequence, the array may be put at some other memory location, and *there* it will be overwritten by subsequent calls to `AddByte` – but the original memory location remains untouched, and could be read by a malicious attacker. – Konrad Rudolph Jan 09 '14 at 17:26
  • @KonradRudolph -- Thaks. I think I get it now. Just to clarify, this is an issue with your trivial `SingleByteXor` implementation and not with the overall concept? – jimbobmcgee Jan 13 '14 at 15:10
  • @jimbobmcgee It’s an issue with my trivial implementation, but also with *any* purely managed solution. – Konrad Rudolph Jan 13 '14 at 15:25
  • @KonradRudolph Nonono, Using `GCHandle.Alloc` does not mean the hasher has to deal with unmanaged memory. It only prevents the GC from moving the pinned object around, so the pinned object can still be used normally (additionally it can be used from unmanaged code, since it's address doesn't change) – M.Stramm Apr 17 '15 at 12:19
  • @jimbobmcgee Creating that abstract Hasher just moves the problem – M.Stramm Apr 17 '15 at 12:21
  • @M.Stramm You manually need to `Free` the memory. That’s what I mean (and what is usually meant) by manually managed memory. That’s admittedly *not* quite the same as “unmanaged” but the distinction is subtle, and not relevant here. – Konrad Rudolph Apr 17 '15 at 17:37
  • @KonradRudolph The distinction is very relevant here. As whether there's a `GCHandle` for the array or not is transparent to the consumer of the object for which the `GCHandle` was allocated. Also `Free` does not free the object, it frees the handle, allowing the GC to move around and collect the object. On the other hand, If the memory was unmanaged, the consumer of the array would have to consume a `byte*` or `IntPtr`, yuck! – M.Stramm Apr 20 '15 at 11:44
  • @M.Stramm See, that’s *why* the distinction is irrelevant: I’m *not* talking about the types the consumer sees — although I agree that this would indeed be even worse, if only marginally — I’m talking about the much more relevant requirement that the user has to take care of managing (in the general, not the .NET specific sense) memory by calling `Free`. That’s much, much worse from an interface design point of view. — So just to end this debate, when I carelessly talked about “unmanaged”, I didn’t mean the .NET sense. – Konrad Rudolph Apr 20 '15 at 11:47
  • @KonradRudolph I fail to see your point. As I have outlined in my answer below, the consumer (just to be clear, when I say consumer I mean the pluggable hash Func) does *not* have to call `Free` or anything else (in my answer, `Free` is called in the finally block). Control over and responsibility for the GCHandle never leaves the HashSecureString method. – M.Stramm Apr 21 '15 at 12:47
2

As a further complement, could you not wrap the logic @KonradRudolph and @HansPassant supplied into a custom Stream implementation?

This would allow you to use the HashAlgorithm.ComputeHash(Stream) method, which would keep the interface managed (although it would be down to you to dispose the stream in good time).

Of course, you are at the mercy of the HashAlgorithm implementation as to how much data ends up in memory at a time (but, of course, that's what the reference source is for!)

Just an idea...

public class SecureStringStream : Stream
{
    public override bool CanRead { get { return true; } }
    public override bool CanWrite { get { return false; } }
    public override bool CanSeek { get { return false; } }

    public override long Position
    {
        get { return _pos; }
        set { throw new NotSupportedException(); }
    }

    public override void Flush() { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }

    private readonly IntPtr _bstr = IntPtr.Zero;
    private readonly int _length;
    private int _pos;

    public SecureStringStream(SecureString str)
    {
        if (str == null) throw new ArgumentNullException("str");
        _bstr = Marshal.SecureStringToBSTR(str);

        try
        {
            _length = Marshal.ReadInt32(_bstr, -4);
            _pos = 0;
        }
        catch
        {
            if (_bstr != IntPtr.Zero) Marshal.ZeroFreeBSTR(_bstr);
            throw;
        }
    }

    public override long Length { get { return _length; } }

    public override int Read(byte[] buffer, int offset, int count)
    {
        if (buffer == null) throw new ArgumentNullException("buffer");
        if (offset < 0) throw new ArgumentOutOfRangeException("offset");
        if (count < 0) throw new ArgumentOutOfRangeException("count");
        if (offset + count > buffer.Length) throw new ArgumentException("offset + count > buffer");

        if (count > 0 && _pos++ < _length) 
        {
            buffer[offset] = Marshal.ReadByte(_bstr, _pos++);
            return 1;
        }
        else return 0;
    }

    protected override void Dispose(bool disposing)
    {
        try { if (_bstr != IntPtr.Zero) Marshal.ZeroFreeBSTR(_bstr); }
        finally { base.Dispose(disposing); }
    }
}

void RunMe()
{
    using (SecureString s = new SecureString())
    {
        foreach (char c in "jimbobmcgee") s.AppendChar(c);
        s.MakeReadOnly();

        using (SecureStringStream ss = new SecureStringStream(s))
        using (HashAlgorithm h = MD5.Create())
        {
            Console.WriteLine(Convert.ToBase64String(h.ComputeHash(ss)));
        }
    }
}
jimbobmcgee
  • 1,561
  • 11
  • 34
  • 1
    It is the same problem as `HansPassant` bring up, you still get a copy of the string in the buffer. You could have `Read` only return one byte at a time, it is within your rights as a implimeter of `Read` to do so, however you have no guarantee that `ComputeHash(Stream)` will not just store the passed in buffer in to managed memory too. – Scott Chamberlain Jan 08 '14 at 21:48
  • @ScottChamberlain - I like the idea of `Read` only returning 1 byte at a time (and have modified the example to show this). I leave my original comment *"at the mercy of the HashAlgorithm implementation"* in place, which already agrees with your *"no guarantee...not just store"* ;-) – jimbobmcgee Jan 08 '14 at 22:04
  • 1
    Not that important but you have a small glitch in your new example, you don't handle `count == 0`, you could just make your if statement in to `if (count > 0 && _pos++ < _length)` – Scott Chamberlain Jan 08 '14 at 22:08