I was confused by the following (simplified) piece of code. During user registration it hashes the password, converts the hash to a string, and saves it in the database. When the user later tries to log in, the code reads the stored string from the database, gets its bytes, and compares them with the hash of the password the user entered.

static void Main(string[] args)
{
    //User registration
    byte[] passwordBytes = Encoding.Unicode.GetBytes("P@ssword");
    byte[] hashBytes = GetHash(passwordBytes);
    string stringFieldInDb = Encoding.Unicode.GetString(hashBytes); //password hash is being stored in database


    //Check password
    byte[] hashBytesInDb = Encoding.Unicode.GetBytes(stringFieldInDb); //was read from database

    byte[] enteredPasswordBytes = Encoding.Unicode.GetBytes("P@ssword");
    byte[] enteredPasswordHash = GetHash(enteredPasswordBytes);

    //is false
    var isPasswordValid = hashBytesInDb.SequenceEqual(enteredPasswordHash);

    //this way is true
    var isPasswordValid2 = stringFieldInDb == Encoding.Unicode.GetString(enteredPasswordHash);
}

private static byte[] GetHash(byte[] data)
{
    return new SHA512CryptoServiceProvider().ComputeHash(data);
}

The hashes differ slightly. Bytes of the hash string read from the database:

161, 127, 0, 49, 27, 146, **253, 255**, 109, 214, **253, 255**, 113, 75, 226, ...

Bytes of the hash generated from the password entered at login:

161, 127, 0, 49, 27, 146, **74, 219**, 109, 214, **65, 220**, 113, 75, 226, ...

I reduced the example above to three lines. What is the reason for this result?

byte[] someCharBytes = new byte[] { 74, 219 };
string someChar = Encoding.Unicode.GetString(someCharBytes);
byte[] differentSomeCharBytes = Encoding.Unicode.GetBytes(someChar); //returns { 253, 255 }
Vasyl Senko
  • As a side-note, don't forget to salt your passwords. In fact it's better to just use the built-in `Rfc2898DeriveBytes` class for hashing passwords. Check this out: http://stackoverflow.com/a/10402129/227267 – Matti Virkkunen Mar 05 '16 at 01:42

1 Answer

What you're doing is trying to interpret hash data (essentially random bytes) as valid UTF-16 data. That cannot work, because not every byte sequence is valid UTF-16. The specific bytes 253, 255 you're getting are the UTF-16 representation of U+FFFD REPLACEMENT CHARACTER, which is the character used to signal an invalid byte sequence.
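To make the failure concrete, here is a minimal sketch using the bytes from the question. Read as little-endian UTF-16, 74, 219 form the code unit 0xDB4A, which lies in the high-surrogate range (0xD800 to 0xDBFF). A high surrogate is only valid when a low surrogate follows it, so on its own the decoder substitutes U+FFFD. The class and method names (`SurrogateDemo`, `RoundTrip`, `ToCodeUnit`) are made up for illustration.

```csharp
using System;
using System.Text;

static class SurrogateDemo
{
    // Decodes bytes as UTF-16LE and encodes the result back,
    // which is what the question's code does with the hash.
    public static byte[] RoundTrip(byte[] bytes)
    {
        string decoded = Encoding.Unicode.GetString(bytes);
        return Encoding.Unicode.GetBytes(decoded);
    }

    // Combines two little-endian bytes into one UTF-16 code unit.
    public static char ToCodeUnit(byte low, byte high)
    {
        return (char)(low | (high << 8));
    }
}
```

With these helpers, `char.IsHighSurrogate(SurrogateDemo.ToCodeUnit(74, 219))` is true, and `SurrogateDemo.RoundTrip(new byte[] { 74, 219 })` yields `{ 253, 255 }`, matching the result in the question. A valid sequence such as `{ 65, 0 }` (the letter A) survives the round trip unchanged.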

If you need to convert a byte array into a string for storage, base64 encoding is pretty popular. Check out `Convert.ToBase64String` and `Convert.FromBase64String`.
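A rough sketch of how the question's registration and login steps might look with Base64 storage. The names `HashStorage`, `HashToStorableString`, and `Matches` are hypothetical, and the plain unsalted SHA-512 only mirrors the question's code so the comparison stays focused on the encoding; it is not a recommendation for password storage.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

static class HashStorage
{
    // Registration: hash the password and return a string that is safe to store.
    // Base64 maps every possible byte sequence to text and back without loss.
    public static string HashToStorableString(string password)
    {
        byte[] passwordBytes = Encoding.Unicode.GetBytes(password);
        using (SHA512 sha = SHA512.Create())
        {
            return Convert.ToBase64String(sha.ComputeHash(passwordBytes));
        }
    }

    // Login: decode the stored string back to the original hash bytes
    // and compare them with the hash of the entered password.
    public static bool Matches(string storedBase64, string enteredPassword)
    {
        byte[] storedHash = Convert.FromBase64String(storedBase64);
        byte[] enteredHash = Convert.FromBase64String(HashToStorableString(enteredPassword));
        return storedHash.SequenceEqual(enteredHash);
    }
}
```

Here `HashStorage.Matches(HashStorage.HashToStorableString("P@ssword"), "P@ssword")` returns true, because the bytes recovered from the stored string are exactly the bytes that were hashed at registration.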

Matti Virkkunen