0

I need to calculate a checksum for a resultset in Oracle, SQL Server and C#, because i want to check if they are all equal.

Sample Data:

CREATE TABLE [dbo].[TestTable](
    [ParentArticleNumber] [nvarchar](50) NOT NULL,
    [ArticleNumber] [nvarchar](50) NOT NULL,
    [Amount] [decimal](18, 4) NOT NULL
)
GO
Insert Into TestTable (
    ParentArticleNumber
    , ArticleNumber
    , Amount) 
values 
    ('Art1', 'Art3', 10.5)
    , ('Art1', 'Art2', 20)
    , ('Art4', 'Art2', 22.35)
Go

In C# the fields are of type string and decimal

Sample Query:

Select 
    ArticleNumber
    , Amount
From
    TestTable
Where
    ParentArticleNumber = 'Art1'

Any ideas how to calculate a kind of checksum which gets the same result on each system?

VSDekar
  • 1,741
  • 2
  • 21
  • 36
  • Why do you want to do that? This is a **very** unusual request, that just won't scale to more than a (very) few thousand rows. What is the *actual* problem you are trying to solve? Simply detect changes? Find changed rows? Both databases have their own mechansisms for change tracking that can scale to many millions of rows – Panagiotis Kanavos Mar 21 '16 at 08:29
  • Normally the result set has a size of about 40 rows. It doesn't have to scale to thousand of rows. The actual problem is the very low bandwith i have to check at a runtime of a program if the resultset in DB A is the same as the resultset in DB B. Which is the case in about 98% The Idea of the checksum is the small amount of data i will have to transmit around the globe. – VSDekar Mar 21 '16 at 08:58
  • That's why you need to explain your *actual* issue, not your attempt to solve it. Sounds like you are trying to sync data between ships using SQL Express with HQ data on an Oracle DB over satellite. There are several approaches to this 1) calculate the hash using C# both on the ship and HQ and transmit the hashes between them. Less bandwidth, far higher latency for the round trips. You could calculate the hash only when data changed, and cache the value. 2) Push the data from HQ to the ships anyway. This saves you a round trip and ... – Panagiotis Kanavos Mar 21 '16 at 09:35
  • 3) maintain version numbers instead of hashes, and send only changes up/down since the last sync. Compare the versions uploaded/downloaded to see whether the ship needs to request more changes. There are many sync protocols that address such situations, don't try to roll your own. Eg git uses a variation of this to sync multiple repositories – Panagiotis Kanavos Mar 21 '16 at 09:38

1 Answers1

1

Use a common hash function, for example MD5.

Note: This assumes you want to do a very simple checksum, if you are doing security related hashes there are some other considerations. Specifically: If this input from untrusted users and they should not be able to trigger hash collisions, use a more secure algorithm than MD5.

I will not post examples as this is basically just searching for MD5 + whatever platform you want to do the hash for.

SQL: HashBytes

Generate MD5 hash string with T-SQL

C#: System.Security.Cryptography.MD5.Create().ComputeHash

calculate a MD5 hash from a string

Oracle: DBMS_CRYPTO

How to call Oracle MD5 hash function?

Community
  • 1
  • 1
PMBjornerud
  • 829
  • 1
  • 8
  • 17