I'm running a query over a table variable that holds 22 227 rows. The query used to take 2-3 seconds to complete (which I still think is too slow), but since I added another field to the ORDER BY clause in DENSE_RANK() it now completes in 4.5 minutes!
If I include [t2].[aisdt], with or without [t2].[aiID], the execution plan shows that it's scanning 494 039 529 rows, which is 22 227 squared. The following query generates the correct results, just much too slowly to be useful.
SELECT MAX([t].[SetNum]) OVER (PARTITION BY NULL) AS [MaxSet]
    ,*
FROM (
    SELECT DENSE_RANK() OVER (ORDER BY [t2].[aisdt], [t2].[aiID]) AS [SetNum]
        ,[t2].*
    FROM (
        SELECT [aiID]
            ,COUNT(DISTINCT [acID]) AS [noac]
        FROM @Temp
        GROUP BY [aiID]
    ) [t1]
    JOIN @Temp [t2]
        ON [t2].[aiID] = [t1].[aiID]
    WHERE [t1].[noac] < [t2].[asm]
) [t]
Just to be clear, the culprit is the [t2].[aisdt] field in DENSE_RANK() OVER (ORDER BY [t2].[aisdt], [t2].[aiID]). Removing this field (which needs to remain) drops the execution time back down to 2-3 seconds. I think it might have something to do with JOINing the table to itself on [aiID] but not [aisdt].
How can I speed this query up to complete in the same time as before, or less?
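One experiment that might help narrow this down, sketched under an assumption that the plan above does not confirm: that the optimizer is working from a low fixed row estimate for the table variable rather than its real size. OPTION (RECOMPILE) is a standard SQL Server query hint that compiles the statement at execution time, when the table variable's actual row count is visible. The sketch below is the same query with only the hint appended. It has to run in the same batch that declares and fills @Temp, and whether it changes the plan in this case is untested.

```sql
-- Same query as above; the only change is the trailing hint.
-- Assumption: the slowdown is related to the row estimate for @Temp.
SELECT MAX([t].[SetNum]) OVER (PARTITION BY NULL) AS [MaxSet]
    ,*
FROM (
    SELECT DENSE_RANK() OVER (ORDER BY [t2].[aisdt], [t2].[aiID]) AS [SetNum]
        ,[t2].*
    FROM (
        SELECT [aiID]
            ,COUNT(DISTINCT [acID]) AS [noac]
        FROM @Temp
        GROUP BY [aiID]
    ) [t1]
    JOIN @Temp [t2]
        ON [t2].[aiID] = [t1].[aiID]
    WHERE [t1].[noac] < [t2].[asm]
) [t]
OPTION (RECOMPILE); -- compile with the actual row count of @Temp
```

Comparing the actual execution plan with and without the hint would show whether the estimated row counts on the @Temp scans and the join strategy differ.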
EDIT
Table definition:
DECLARE @Temp TABLE (
[aiID] INT NOT NULL INDEX [IX_Temp_aiID] -- not unique
,[aisdt] DATETIME NOT NULL INDEX [IX_Temp_aisdt] -- not unique
,[asm] INT NOT NULL
,[cpcID] INT NULL
,[cpce] VARCHAR(10) NULL
,[acID] INT NULL
,[ctvID] INT NULL
,[ct] VARCHAR(100) NULL
,[_36_other_non_matched_fields_] VARCHAR(MAX)
,UNIQUE ([aiID], [cpcID], [cpce], [acID], [ctvID], [ct])
)
[aisdt] is unique per [aiID], but there can be multiple [aiID]s with the same [aisdt].
INSERT INTO @Temp
VALUES (64, '2017-03-23 10:00:00', 1, 17, '', NULL, NULL, NULL, 'blah')
,(64, '2017-03-23 10:00:00', 1, 34, '', NULL, NULL, NULL, 'blah')
,(99, '2017-04-08 09:00:00', 1, 25, 'Y', NULL, NULL, NULL, 'blah')
,(99, '2017-04-08 09:00:00', 1, 16, 'Y', NULL, NULL, NULL, 'blah')
,(99, '2017-04-08 09:00:00', 1, 76, 'Y', NULL, NULL, NULL, 'blah')
,(99, '2017-04-08 09:00:00', 1, 82, 'Y', NULL, NULL, NULL, 'blah')
,(42, '2017-04-14 16:00:00', 2, 32, '', 32, NULL, NULL, 'blah')
,(42, '2017-04-14 16:00:00', 2, 32, '', 47, NULL, NULL, 'blah')
,(42, '2017-04-14 16:00:00', 2, 47, '', 32, NULL, NULL, 'blah')
,(42, '2017-04-14 16:00:00', 2, 47, '', 47, NULL, NULL, 'blah')
,(54, '2017-03-23 10:00:00', 1, 17, '', NULL, NULL, NULL, 'blah')
,(54, '2017-03-23 10:00:00', 1, 34, '', NULL, NULL, NULL, 'blah')
,(89, '2017-04-08 09:00:00', 1, 25, 'Y', NULL, NULL, NULL, 'blah')
,(89, '2017-04-08 09:00:00', 1, 16, 'Y', NULL, NULL, NULL, 'blah')
,(89, '2017-04-08 09:00:00', 1, 76, 'Y', NULL, NULL, NULL, 'blah')
,(89, '2017-04-08 09:00:00', 1, 82, 'Y', NULL, NULL, NULL, 'blah')
,(32, '2017-04-14 16:00:00', 3, 32, '', 32, NULL, NULL, 'blah')
,(32, '2017-04-14 16:00:00', 3, 32, '', 47, NULL, NULL, 'blah')
,(32, '2017-04-14 16:00:00', 3, 47, '', 32, NULL, NULL, 'blah')
,(32, '2017-04-14 16:00:00', 3, 47, '', 47, NULL, NULL, 'blah')
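The statement that each [aiID] carries a single [aisdt] can be checked directly against the data. The query below is a minimal sketch of such a check, not part of the original query. The column alias [DistinctStartTimes] is a made-up name, and the statement has to run in the same batch as the DECLARE and INSERT above.

```sql
-- Lists every [aiID] that has more than one distinct [aisdt].
-- An empty result means [aisdt] really is unique per [aiID].
SELECT [aiID]
    ,COUNT(DISTINCT [aisdt]) AS [DistinctStartTimes] -- hypothetical alias
FROM @Temp
GROUP BY [aiID]
HAVING COUNT(DISTINCT [aisdt]) > 1;
```

On the sample rows above, every [aiID] has exactly one [aisdt], so this check should come back empty.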
It must be sorted by [aisdt] (datetime) first, then [aiID], then numbered into sets based on [aiID].
I want to see:
5, 1, 54, '2017-03-23 10:00:00', 1, 17, '', NULL, NULL, NULL, 'blah'
5, 1, 54, '2017-03-23 10:00:00', 1, 34, '', NULL, NULL, NULL, 'blah'
5, 2, 64, '2017-03-23 10:00:00', 1, 17, '', NULL, NULL, NULL, 'blah'
5, 2, 64, '2017-03-23 10:00:00', 1, 34, '', NULL, NULL, NULL, 'blah'
5, 3, 89, '2017-04-08 09:00:00', 1, 25, 'Y', NULL, NULL, NULL, 'blah'
5, 3, 89, '2017-04-08 09:00:00', 1, 16, 'Y', NULL, NULL, NULL, 'blah'
5, 3, 89, '2017-04-08 09:00:00', 1, 76, 'Y', NULL, NULL, NULL, 'blah'
5, 3, 89, '2017-04-08 09:00:00', 1, 82, 'Y', NULL, NULL, NULL, 'blah'
5, 4, 99, '2017-04-08 09:00:00', 1, 25, 'Y', NULL, NULL, NULL, 'blah'
5, 4, 99, '2017-04-08 09:00:00', 1, 16, 'Y', NULL, NULL, NULL, 'blah'
5, 4, 99, '2017-04-08 09:00:00', 1, 76, 'Y', NULL, NULL, NULL, 'blah'
5, 4, 99, '2017-04-08 09:00:00', 1, 82, 'Y', NULL, NULL, NULL, 'blah'
5, 5, 32, '2017-04-14 16:00:00', 3, 32, '', 32, NULL, NULL, 'blah'
5, 5, 32, '2017-04-14 16:00:00', 3, 32, '', 47, NULL, NULL, 'blah'
5, 5, 32, '2017-04-14 16:00:00', 3, 47, '', 32, NULL, NULL, 'blah'
5, 5, 32, '2017-04-14 16:00:00', 3, 47, '', 47, NULL, NULL, 'blah'