12

I'm using the chechsum function in sql server 2008 R2 and I would like to get the same int values in a C# app. Is there any equivalent method in c# that returns the values like the sql checksum function? Thanx

Andres
  • 667
  • 1
  • 6
  • 14
  • Similar question here http://stackoverflow.com/questions/6892138/implementing-sql-checksum-in-java – Dan Sep 18 '13 at 16:57
  • Please see my answer, I found the algorithm on SQL Server Forums. It's a series of 4 bit left rotational xor operations. – Dan Sep 18 '13 at 17:12
  • 1
    According to Microsoft, "You cannot call this function directly", however it can "appear within a LINQ to Entities query" within a C# app, [link] (http://msdn.microsoft.com/en-us/library/dd466233(v=vs.100).aspx) – JM Hicks Feb 04 '14 at 10:54

6 Answers6

15

On SQL Server Forum, at this page, it's stated:

The built-in CHECKUM function in SQL Server is built on a series of 4 bit left rotational xor operations. See this post for more explanation.

I was able to port the BINARY_CHECKSUM to c# and it seems to be working... I'll be looking at the plain CHECKSUM later...

private int SQLBinaryChecksum(string text)
{
    long sum = 0;
    byte overflow;
    for (int i = 0; i < text.Length; i++)
    {
        sum = (long)((16 * sum) ^ Convert.ToUInt32(text[i]));
        overflow = (byte)(sum / 4294967296);
        sum = sum - overflow * 4294967296;
        sum = sum ^ overflow;
    }

    if (sum > 2147483647)
        sum = sum - 4294967296;
    else if (sum >= 32768 && sum <= 65535)
        sum = sum - 65536;
    else if (sum >= 128 && sum <= 255)
        sum = sum - 256;

    return (int)sum;
}
Dan
  • 1,060
  • 13
  • 39
  • This is fantastic and seems to work great with text. However, I have a varbinary(max) that stores files and I changed this function to take a byte array instead of a string. It works fine for text files but doesn't return the same results for binary files. Any ideas? – codeConcussion May 31 '17 at 14:24
  • 1
    how to implement it to work with few string like CHECKSUM(col1, col2, col3) ? – mr R Jan 31 '19 at 15:14
  • Be careful with this solution. If BINARY_CHECKSUM() is reading from a variable width column then it only reads the first 255 characters. Of course you can simulate that in the above code, but your code would have to be aware of this whenever it is trying to match SQL's values (https://learn.microsoft.com/en-us/sql/t-sql/functions/binary-checksum-transact-sql). – Agendum Oct 14 '19 at 18:35
2

The T-SQL documentation does not specify what algorithm is used by checksum() outside of this:

CHECKSUM computes a hash value, called the checksum, over its list of arguments. The hash value is intended for use in building hash indexes. If the arguments to CHECKSUM are columns, and an index is built over the computed CHECKSUM value, the result is a hash index. This can be used for equality searches over the columns.

It's unlikely to compute an MD5 hash, since its return value (the computed hash) is a 32-bit integer; an MD5 hash is 128 bits in length.

Nicholas Carey
  • 71,308
  • 16
  • 93
  • 135
  • 2
    Actually SQLServer returns an int for the checksum function. MD5 hashing is done through HASHBYTES function and returns a varbinary (wich it should be the return value of your C# function). I'm looking for the C# functions that also returns an int. – Andres Sep 18 '13 at 16:57
  • You're right, I double read the MSDN page and it doesn't says anything about it... It's suggested to use HASHBYTES instead which implements various hashing algorithms. – Dan Sep 18 '13 at 16:57
2

CHECKSUM docs don't disclose how it computes the hash. If you want a hash you can use in T-SQL and C#, pick from the algorithms supported in HashBytes

Matt Stephenson
  • 8,442
  • 1
  • 19
  • 19
  • 1
    That's too bad, I was looking for an efficient way to hash a string to an int (despite the collisions). – Andres Sep 18 '13 at 16:58
  • @Andres - String.GetHashCode() will hash a string to an int, and you could expose it from a SQLCLR function to use in your database – Matt Stephenson Sep 20 '13 at 13:39
2

In case you need to do a checksum on a GUID, change dna2's answer to this:

private int SQLBinaryChecksum(byte[] text)

With a byte array, the value from SQL will match the value from C#. To test:

var a = Guid.Parse("DEAA5789-6B51-4EED-B370-36F347A0E8E4").ToByteArray();
Console.WriteLine(SQLBinaryChecksum(a));

vs SQL:

select BINARY_CHECKSUM(CONVERT(uniqueidentifier,'DEAA5789-6B51-4EED-B370-36F347A0E8E4'))

both answers will be -1897092103.

2

@Dan's implementation of BinaryChecksum can be greatly simplified down in c# down to

int SqlBinaryChecksum(string text)
{
    uint accumulator = 0;
    for (int i = 0; i < text.Length; i++)
    {
        var leftRotate4bit = (accumulator << 4) | (accumulator >> -4);
        accumulator = leftRotate4bit ^ text[i];
    }
    return (int)accumulator;
}

This also makes it clearer what the algorithm is doing. For each character, a 4 bit circular shift then an xor with character's byte

Alex
  • 3,245
  • 2
  • 19
  • 17
  • This solution is definitely an improvement over @Dan's. In my tests, both returned the same results. Please, note that ReSharper throws a warning: "Suspicious shift count -4 for left operand of type uint: actual shift count will be 28". Also, if you use int for accumulator, no conversion is needed upon return. – Mike O. Jun 30 '22 at 20:54
1

Based on other answers and comments, I made a version with some fixes for:

  • null entry
  • trailing spaces
  • length > 255
  • unchecked conversion

C# >= 11 (note the Unsigned right-shift operator >>>)

[DbFunction("BINARY_CHECKSUM", IsBuiltIn = true, IsNullable = false)] // server-evaluation BINARY_CHECKSUM(text)
public static int BinaryChecksum(string text)
{
    // client-evaluation implementation (same result as BINARY_CHECKSUM(text))
    if (text is null)
    {
        return int.MaxValue;
    }

    var result = 0;
    foreach (var c in text.TrimEnd(' ', ' ').Take(255))
    {
        var circularShift4BitsToLeft = (result << 4) | (result >>> 28);
        result = circularShift4BitsToLeft ^ c;
    }

    return result;
}

C# < 11

[DbFunction("BINARY_CHECKSUM", IsBuiltIn = true, IsNullable = false)] // server-evaluation BINARY_CHECKSUM(text)
public static int BinaryChecksum(string text)
{
    // client-evaluation implementation (same result as BINARY_CHECKSUM(text))
    if (text is null)
    {
        return int.MaxValue;
    }

    var result = 0u;
    foreach (var c in text.TrimEnd(' ', ' ').Take(255))
    {
        var circularShift4BitsToLeft = (result << 4) | (result >> 28);
        result = circularShift4BitsToLeft ^ c;
    }

    return unchecked((int)result);
}
upcu
  • 11
  • 3