I've experienced a T-SQL query taking a significant amount of time. If I split it up into 2 queries, it's faster, but I don't know why.
The query is meant to fetch data from tables C, G, Y and insert it into X - but only if it doesn't already exist in X.
Here's the slow query. It slows down significant when the X-table grows:
INSERT INTO X (COLUMN_A, COLUMN_B, COLUMN_C)
(SELECT a1.COLUMN_A, a1.COLUMN_B, 'QWE' as COLUMN_C
FROM
(SELECT
CAST(G.ID AS VARCHAR(900)) AS COLUMN_A,
MAX(Y.B) AS COLUMN_B
FROM
C
JOIN
G ON (G.ID = C.G_UID)
JOIN
Y ON Y.B COLLATE Danish_Norwegian_BIN = C.UID COLLATE Danish_Norwegian_BIN
GROUP BY
CAST(G.ID AS VARCHAR(900))) AS a1
WHERE
a1.COLUMN_A NOT IN (SELECT COLUMN_A FROM X))
BUT! It's faster if I split the query up (in code) into 2 sets:
Set 1:
SELECT
a1.COLUMN_A, a1.COLUMN_B, 'QWE' as COLUMN_C
FROM
(SELECT
CAST(G.ID AS VARCHAR(900)) AS COLUMN_A, MAX(Y.B) AS COLUMN_B
FROM
C
JOIN
G ON (G.ID = C.G_UID)
JOIN
Y ON Y.B COLLATE Danish_Norwegian_BIN = C.UID COLLATE Danish_Norwegian_BIN
GROUP BY
CAST(G.ID AS VARCHAR(900))) AS a1
Set 2:
SELECT COLUMN_A FROM X
And then manually subtract the sets from each other, and inserting the result into the X-table.
Why is that? Can my slow query be written better?