4

I have an existing database in SQL Server 2008 that performs user authentication via stored procedure for an existing PHP web application. The web application sends the stored procedure a string, however the stored procedure stores, and checks the value with SQL Checksum (http://msdn.microsoft.com/en-us/library/ms189788.aspx). The Stored Procedure casts the string as NVARCHAR(50), and stores the CHECKSUM as int in the user table.

I am now writing a new Java application on top of the existing database, and I'm writing a custom spring authentication manager. I would like to re-implement the CHECKSUM algorithm in Java so I do not need to call a stored procedure to perform the conversion, however I can not find any documentation on how SQL CHECKSUM works.

I tried the following code, with the guess that it was CRC32, however it fails to return the same value as SQL CHECKSUM:

String pass = "foobar";
CRC32 crc32 = new CRC32();
crc32.update(pass.getBytes("UTF-16")); //This is due to the stored procedure casting as nvarchar
crc32.getValue();

Can anyone point me to the algorithm that SQL CHECKSUM uses so I can re-implement it in Java?

The question also isn't which algorithm provides the best hash for security. Security is outside of the requirements in this particular instance, as we are not prepared to force a system wide password reset. The question is what algorithm is used by T-SQL CHECKSUM, so that it could be re-implemented. This particular use case is for auth, however there is potential for this being necessary in many different applications.

aweigold
  • 6,532
  • 2
  • 32
  • 46
  • Can you not just use some existing Java function and update all the data with the new hash? Perhaps as a new column to preserve the current one (just in case). This may help too http://stackoverflow.com/q/331157/27535 – gbn Jul 31 '11 at 20:59
  • 2
    That may end up having to be a final option. However that would require passwords on existing accounts to be reset, as the original password is not stored in the database. – aweigold Jul 31 '11 at 21:01
  • 2
    @aweigold: +1 for handling passwords in the correct way :-) – Aasmund Eldhuset Jul 31 '11 at 21:05
  • 1
    @aweigold - you should update your passwords anyway, as the checksum function is not cryptographically secure (not even close). The whole point of checksum is that it is fast, which is the exact opposite of what you want in a hash algorithm for security purposes. And don't use md5, either (per your comment to an answer). Go bycrypt or scrypt, and if those really aren't an option go sha1. – Joel Coehoorn Jul 31 '11 at 21:30
  • Are you using the Checksum function to "encrypt" the passwords? Is not really recommended, maybe you should take the chance to go for a more robust algorithm – jasalguero Jul 31 '11 at 21:55
  • A previous contractor set up the database to "encrypt" the passwords via CHECKSUM. It was an awful design choice. I'm just stuck consuming the current database. I would like to transition to SHA1 with a salt, however to keep moving forward in the short term, it would be nice to re-implement CHECKSUM. – aweigold Jul 31 '11 at 21:59
  • The only good news here is that one of the attributes of checksum that makes it insecure actually works in your favor here: because it's a more simplistic hash, it's easier to reverse the "hash" to the wrong value. – Joel Coehoorn Aug 01 '11 at 00:01
  • I'm currently working around the problem by writing a custom spring security password encoder that exercises a stored procedure that just returns the CHECKSUM of a raw password. I really don't like this, but it appears to be my only option other than forcing a system wide password reset, or re-implementing the algorithm. – aweigold Aug 01 '11 at 02:48

2 Answers2

3

On SQL Server Forum, at this page, it's stated:

The built-in CHECKUM function in SQL Server is built on a series of 4 bit left rotational xor operations. See this post for more explanation.

I was able to port the BINARY_CHECKSUM to c# (sorry I don't have a Java compiler at hand) and it seems to be working... I'll be looking at the plain CHECKSUM later...

private int SQLBinaryChecksum(string text)
{
    long sum = 0;
    byte overflow;
    for (int i = 0; i < text.Length; i++)
    {
        sum = (long)((16 * sum) ^ Convert.ToUInt32(text[i]));
        overflow = (byte)(sum / 4294967296);
        sum = sum - overflow * 4294967296;
        sum = sum ^ overflow;
    }

    if (sum > 2147483647)
        sum = sum - 4294967296;
    else if (sum >= 32768 && sum <= 65535)
        sum = sum - 65536;
    else if (sum >= 128 && sum <= 255)
        sum = sum - 256;

    return (int)sum;
}
Dan
  • 1,060
  • 13
  • 39
  • This looks promising. I will check it out when I have time and accept the answer if it works. – aweigold Sep 18 '13 at 21:34
  • 1
    Sorry, been busy, but I tried this out. And it works for BINARY_CHECKSUM. I will see if I can find anything on CHECKSUM later. If you are interested, I have a groovy gist here https://gist.github.com/aweigold/7144091#file-binarychecksum-groovy – aweigold Oct 24 '13 at 20:09
-1

I would suggest you to replace TSQL CHECKSUM by HASBYTES with MD5 algorithm so you easily find MD5 implementation in Java.

One more benefit you will get from HASHBYTES is more stable behaviour. CHECKSUM could generate the same hash for different inputs.

See how to use HASHBYTES with MD5:

HASHBYTES('MD5', @input)

EDIT: Answer to comments regarding MD5

HASHBYTES also support both SHA and SHA1 algorithms so no problems if you won't use MD5

sll
  • 61,540
  • 22
  • 104
  • 156
  • I absolutely agree with this. However I would like to maintain the existing user accounts, as such, this option would require users to reset their passwords. – aweigold Jul 31 '11 at 21:09
  • 1
    Deal lord, no. Do not use md5 for authentication. – Joel Coehoorn Jul 31 '11 at 21:28
  • 1
    This still requires a stored proc or SQL call: OP wants to remove this. And requires updating all existing values – gbn Jul 31 '11 at 22:01
  • @Joel Coehoorn: I've updated my answer, btw MD5 better than CRC32 anyway ;) – sll Aug 01 '11 at 09:18
  • Remove any hint of md5 from your answer and I'll remove my downvote. – Joel Coehoorn Aug 01 '11 at 13:33
  • 1
    The question isn't which algorithm provides the best hash for security. Security is outside of the requirements in this particular instance. The question is what algorithm is used by T-SQL CHECKSUM, so that it could be re-implemented. This particular use case is for auth, however there is potential for this being necessary in many different applications. – aweigold Aug 01 '11 at 14:38