When using SQL Server’s varchar data type the .Net encoding to go with is UTF8 since it’s the fastest and most optimized of the three (ASCII, UTF7, UTF8). When using the nvarchar data type to go is Unicode (UTF16) but we also have to know the texts endianness to create correct hashes.
Examples:
DotNet:
string source = "Sample string with more than 8000 Characters";
using (MD5 md5Hash = MD5.Create())
{
Console.WriteLine(GetMd5Hash(md5Hash, source, System.Text.Encoding.ASCII)); <br/>Console.WriteLine(GetMd5Hash(md5Hash, source, System.Text.Encoding.UTF7));
Console.WriteLine(GetMd5Hash(md5Hash, source, System.Text.Encoding.UTF8));
Console.WriteLine(GetMd5Hash(md5Hash, source,System.Text.Encoding.Unicode));
Console.WriteLine(GetMd5Hash(md5Hash, source, System.Text.Encoding.UTF32));
}
DotNet Image
Database:
-- note the size 15 of all datatypes
declare @val1 varchar(50),
@val2 nvarchar(50),
@val3 char(50),
@val4 nchar(50)
-- example of 1 byte/char text
-- all variables are of same length
select @val1 = 'Sample string with more than 8000 Characters'
@val2 = N'Sample string with more than 8000 Characters',
@val3 = 'Sample string with more than 8000 Characters',
@val4 = N'Sample string with more than 8000 Characters'
-- all 4 return different results
select HASHBYTES('md5', @val1) as MD5_varchar,
-- result = 0xAE1C585474D90965ED832A7E588D4AF4
-- just to show that collation doesnot change the hash
HASHBYTES('md5', @val1 collate Cyrillic_General_BIN2) as MD5_varchar_collation,
-- result = 0xAE1C585474D90965ED832A7E588D4AF4
HASHBYTES('md5', @val2) as MD5_Nvarchar,
-- result = 0x880632484491D1283818B7A3AE3D2AFC
HASHBYTES('md5', @val3) as MD5_char,
-- result = 0x77D2A2CF22998C4CD5AD5550664BF931
HASHBYTES('md5', @val4) as MD5_Nchar
-- result = 0x9B55DF00EAFE01D764BB944592C27521
Database Image