9

I am debating which datatype to use when storing a SHA256 hash in SQL Server. Should it be CHAR(64) or BINARY(32) ... The column will be part of a unique clustered index. I know that I'm probably splitting hairs at this point, however I want to do this right the first time and I know that at times primitive data types are faster and other times the newer more exotic types perform better. ( yes I know char(64) isn't all that new, but it's newer than byte storage )

I've looked around and can't find anything about the performance of one vs. the other in terms of search, etc.

virtualadrian
  • 4,688
  • 4
  • 30
  • 22
  • 2
    Does this help? http://stackoverflow.com/questions/252156/sql-server-2005-what-data-type-to-use-to-store-passwords-hashed-by-sha-256-algo – dcaswell Sep 06 '13 at 23:48
  • 2
    I would use binary(32) simply because there is no reason why I can argue for char(64) - the only reason why I would argue the latter is if all work was done almost exclusively via the CLI (but then conversion functions can be used anyway for human interaction). Programmatically, treating the data raw is just as easy, if not easier. It also results in slightly less space (-> ever so smaller indexes / ever so quicker comparing) and since I can't argue for char(64), might as well just take the free benefit. – user2246674 Sep 06 '13 at 23:52
  • 1
    From what I am seeing binary(32) has a bit of an edge on char(64) and I will only use that column on the SQL side. The other side of the equation as was mentioned here is usability in code, however I'd rather gain that slight edge and just eat the extra dev complexity if any truly exists. So.. I am going to use BINARY(32). http://stackoverflow.com/a/16993150/128795 – virtualadrian Sep 07 '13 at 00:00

1 Answers1

-3

you do know that by using CHAR(64), each row will occupy 64 bits even if your key is "A"? I am not going to discuss the fact that you are using a string as a clustered index, I just assume you have a good reason for that, but using CHAR instead of VARCHAR? Are you planing to update the value? Because that would be the only reason I see to use char instead of varchar

Diego
  • 34,802
  • 21
  • 91
  • 134
  • 2
    A SHA256 hash will always take up 64 characters. Thus CHAR vs VARCHAR. From the comments above I believe BINARY(32) is the way to go. The reason I am using that hash is because I have bits of text that need to be unique ( 2048 in len ) and I can more quickly compare a hash than the string representation and given the low probability of a SHA256 collision I am getting a good performance boost. This table is not highly transactional, or won't be after initial data population, but will be read often. Your thoughts are appreciated so feel free to respond/critique. – virtualadrian Sep 07 '13 at 00:38
  • 1
    @kroolk SHA256 is massively overkill for your purposes then. You could go with a faster noncryptographic hash instead if you don't mind searching around for something non-builtin. (There's a .NET CityHash library on GitHub but it does involve a native DLL: https://github.com/gmarz/CityHash) – millimoose Sep 07 '13 at 01:07
  • 1
    This answer seems to miss or ignore what the OP was actually doing. Suggesting VARCHAR when the OP was asking for a comparison between CHAR and BINARY is sorely out of place. For fixed-length data, fixed-length types will always be superior, and the OP described explicitly fixed-length algorithms. – Daniel Jul 03 '20 at 16:06