12

I have a table in SQL Server 2008 R2 that contain two field (WordHash, Word). This Hash field generated in C# and I need regenerate hash code for Word field in sql server.

But my problem is that generated MD5 hash in sql server and C# are different. I found below code to resolve this problem but still I have same problem.

SQL code:

CONVERT(NVARCHAR(32),HASHBYTES('MD5', 'some word'), 2)

After putting this code block to my query, I saw some wired result! This is my result:
My Query:

SELECT 
    [WordHash],
    convert(nvarchar(32),HASHBYTES('MD5', 'Analytics'),2) AS TestHash,
    convert(nvarchar(32),HASHBYTES('MD5', [Word]),2) AS SqlHash
FROM myTable

Result:

WordHash: A768CAA988605A2846599CF7E2D0C26A
TestHash: A768CAA988605A2846599CF7E2D0C26A
SqlHash F4AFA5FEF805F7F5163EC6402BAF61FF

Note that the 'Analytics' is one of records data in database.
Why TestHash & SqlHash are different while they generated from same code!?

Mojtaba
  • 1,470
  • 4
  • 18
  • 25
  • 4
    If you do `N'some word'` instead of just `'some word'` do the results become the same? – Scott Chamberlain Mar 12 '14 at 13:50
  • This is just a guess, but I suspect that the `'Analytics'` literal and the `[Word]` value are treated with different encodings, which may indeed make a difference in the hash output. One may be unicode while the other is not. – David Mar 12 '14 at 13:51
  • @ScottChamberlain you r absolutely right :). Can you post some link in order to understand why that happens? – kostas ch. Mar 12 '14 at 13:57
  • @ScottChamberlain - This question was asked earlier than the other question that you marked as duplicate. I suppose the other question should be marked duplicate than this one. – Abhishek kumar Jul 18 '16 at 17:26
  • 1
    @Abhishekkumar That is not how this site works, it is not "The oldest one wins" it is "The best one wins". I found the other question and I remembered answering the same question here, I marked it duplicate to point future visitors to the better, newer, question and answer. See the meta discussion "[Should I vote to close a duplicate question, even though it's much newer, and has more up to date answers?](http://meta.stackexchange.com/questions/147643/should-i-vote-to-close-a-duplicate-question-even-though-its-much-newer-and-ha)", the consus was: No, you should close the older one. – Scott Chamberlain Jul 18 '16 at 17:55
  • @ScottChamberlain- Understand now. Thanks – Abhishek kumar Jul 18 '16 at 19:04

1 Answers1

18

The issue is NVARCHAR and VARCHAR get hashed to different values. Both HASHBYTES('MD5', 'Analytics'), and [WordHash] are hashes of VARCHAR values but [Word] is a NVARCHAR.

select HASHBYTES('MD5',  'Analytics'), 'varchar'
union
select HASHBYTES('MD5', N'Analytics'), 'nvarchar'

--outputs
------------------------------------- --------
0xA768CAA988605A2846599CF7E2D0C26A    varchar
0xF4AFA5FEF805F7F5163EC6402BAF61FF    nvarchar

To fix this you must either change [Word] to be VARCHAR or re-compute [WordHash] using NVARCHAR values.

Some useful further reading: Comparing SQL Server HASHBYTES function and .Net hashing

Scott Chamberlain
  • 124,994
  • 33
  • 282
  • 431