1

I am looking into created an MD5 checksum persisted computed value for SQL Server 2008. There is gigabytes of data in this particular table(not my design) and I would like to know if I create the new computed column will this lock the entire table until the computation has finished?

Will the new column update all entries in the table or update them only when they are selected/updated via a SQL command?

What is the recommended practice for created MD5 computed values in SQL Server?

Jeremy
  • 3,880
  • 3
  • 35
  • 42
  • MD5 is no longer considered secure. I recommend using SHA-1 at the least. See http://stackoverflow.com/questions/770900/is-md5-less-secure-than-sha-et-al-in-a-practical-sense – Richard Schneider Jul 04 '11 at 02:34
  • @Richard Schneider: it strongly depends on what md5 is used for. If for fast duplicates lookup or as a control sum - it still perfectly works and isn't worse than sha1 – zerkms Jul 04 '11 at 03:11
  • @zerkms I am using it for quick and dirty duplicate lookup. Basically some back office software has been entering duplicates of <1mb files for 8 years. Essentially there are gigabytes of duplicates, and I want to go all ninja on them. – Jeremy Jul 04 '11 at 03:39

1 Answers1

2

After kicking up an exact copy of the database on a virtual machine I experimented with adding the persisted column.

There were about 1 million rows and it took 2 hours to compute the MD5 hash column. In this time the entire table was locked for select, update and insert. On a production server you would have to factor in other issues.

Jeremy
  • 3,880
  • 3
  • 35
  • 42