2

Up until a few days ago, T-SQL code we have running in SQL Azure that generates an MD5 for temp table values was fast (a few seconds per hundred thousand on a Business Critical Gen 5, 2 VCores DB). Recently we noticed from our logs that on multiple client DBs and in our tests there is tremendous variance leading to frequent timeouts. Here is sample code shown generating MD5 and also for SHA2 (though MD5 is our production requirement).

begin transaction;

/********************************/
-- create temp table
/********************************/
CREATE TABLE #tmp (
  Id int PRIMARY KEY,
  UriValue nvarchar(256),
  Md5Value binary(16)
);

/********************************/
-- generate 1 million records into temp table
-- source: https://dba.stackexchange.com/questions/130392/generate-and-insert-1-million-rows-into-simple-table
/********************************/
  WITH
    L0   AS (SELECT c FROM (SELECT 1 UNION ALL SELECT 1) AS D(c)), -- 2^1
    L1   AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),       -- 2^2
    L2   AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),       -- 2^4
    L3   AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),       -- 2^8
    L4   AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),       -- 2^16
    L5   AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),       -- 2^32
    Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS k FROM L5
  )
  INSERT INTO #tmp ( Id, UriValue )
  SELECT k as Id, 
         'http://sample.com/' + CONVERT(nvarchar(50), NEWID()) as UriValue
  FROM nums
  WHERE k <= 100000


/********************************/
-- test creation of SHA2 value
/********************************/
  DECLARE @dt datetime;   -- timer
  SET @dt = GETUTCDATE();

  UPDATE #tmp
  SET Md5Value = CAST(HashBytes('SHA2_256', CONVERT(VARCHAR(MAX), LOWER(UriValue)) COLLATE Latin1_General_100_CI_AS_SC_UTF8) as binary(16));
  
  SELECT DATEDIFF(ms, @dt, GETUTCDATE()) AS Sha2ElapsedMs;
  
/********************************/
-- test creation of MD5 value
-- note: our requirement is for a match to http://md5-hash-online.waraxe.us/
/********************************/
  SET @dt = GETUTCDATE();

  UPDATE #tmp
  SET Md5Value = CAST(HashBytes('MD5', CONVERT(VARCHAR(MAX), LOWER(UriValue)) COLLATE Latin1_General_100_CI_AS_SC_UTF8) as binary(16));
  
  SELECT DATEDIFF(ms, @dt, GETUTCDATE()) AS Md5ElapsedMs;
  
rollback transaction;

Note the use of NEWID() to generate a GUID, so I don't think caching is the issue in our tests.

Here are some test runs: Run 1: SHA2 45 seconds, MD5 39 seconds Run 2: SHA2 4 seconds, MD5 3 seconds Run 3: SHA2 34 seconds, MD5 38 seconds Run 4: SHA2 9 seconds, MD5 9 seconds

No other processes are running against the DB during these tests. I see spikes in the Azure portal for CPU (90%) and SQL Server Process (99%) that seem to correlate.

Any idea what might have changed here? Or other ideas for how we can get more predictable performance? Thank you.

Vince
  • 583
  • 5
  • 20
  • Might want to try a memory optimized table instead (https://learn.microsoft.com/en-us/sql/relational-databases/in-memory-oltp/faster-temp-table-and-table-variable-by-using-memory-optimization?view=sql-server-ver15) – Brian May 07 '21 at 19:42
  • The Azure Portal shows it's CPU-bound as opposed to log I/O bound, which is what I always thought MOT helps with most. But maybe, and I'll take a look there too, thanks. – Vince May 07 '21 at 20:10
  • I tested with MOT just now and it's actually slower (though as before, variable). i.e. CREATE TABLE tester ( Id int PRIMARY KEY NONCLUSTERED, UriValue nvarchar(256), Md5Value binary(16) ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA); – Vince May 07 '21 at 20:17
  • Variablity in a VM performance though is not unusual is it since you're not exclusively using the bare metal, you're sharing it with other guest operating systems. – Stu May 08 '21 at 18:55

0 Answers0