Up until a few days ago, T-SQL code we have running in SQL Azure that generates an MD5 for temp table values was fast (a few seconds per hundred thousand on a Business Critical Gen 5, 2 VCores DB). Recently we noticed from our logs that on multiple client DBs and in our tests there is tremendous variance leading to frequent timeouts. Here is sample code shown generating MD5 and also for SHA2 (though MD5 is our production requirement).
begin transaction;
/********************************/
-- create temp table
/********************************/
CREATE TABLE #tmp (
Id int PRIMARY KEY,
UriValue nvarchar(256),
Md5Value binary(16)
);
/********************************/
-- generate 1 million records into temp table
-- source: https://dba.stackexchange.com/questions/130392/generate-and-insert-1-million-rows-into-simple-table
/********************************/
WITH
L0 AS (SELECT c FROM (SELECT 1 UNION ALL SELECT 1) AS D(c)), -- 2^1
L1 AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), -- 2^2
L2 AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), -- 2^4
L3 AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), -- 2^8
L4 AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), -- 2^16
L5 AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), -- 2^32
Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS k FROM L5
)
INSERT INTO #tmp ( Id, UriValue )
SELECT k as Id,
'http://sample.com/' + CONVERT(nvarchar(50), NEWID()) as UriValue
FROM nums
WHERE k <= 100000
/********************************/
-- test creation of SHA2 value
/********************************/
DECLARE @dt datetime; -- timer
SET @dt = GETUTCDATE();
UPDATE #tmp
SET Md5Value = CAST(HashBytes('SHA2_256', CONVERT(VARCHAR(MAX), LOWER(UriValue)) COLLATE Latin1_General_100_CI_AS_SC_UTF8) as binary(16));
SELECT DATEDIFF(ms, @dt, GETUTCDATE()) AS Sha2ElapsedMs;
/********************************/
-- test creation of MD5 value
-- note: our requirement is for a match to http://md5-hash-online.waraxe.us/
/********************************/
SET @dt = GETUTCDATE();
UPDATE #tmp
SET Md5Value = CAST(HashBytes('MD5', CONVERT(VARCHAR(MAX), LOWER(UriValue)) COLLATE Latin1_General_100_CI_AS_SC_UTF8) as binary(16));
SELECT DATEDIFF(ms, @dt, GETUTCDATE()) AS Md5ElapsedMs;
rollback transaction;
Note the use of NEWID() to generate a GUID, so I don't think caching is the issue in our tests.
Here are some test runs: Run 1: SHA2 45 seconds, MD5 39 seconds Run 2: SHA2 4 seconds, MD5 3 seconds Run 3: SHA2 34 seconds, MD5 38 seconds Run 4: SHA2 9 seconds, MD5 9 seconds
No other processes are running against the DB during these tests. I see spikes in the Azure portal for CPU (90%) and SQL Server Process (99%) that seem to correlate.
Any idea what might have changed here? Or other ideas for how we can get more predictable performance? Thank you.