Since you've mentioned this is for preview purposes, so I'm assuming you just want data out of the query and you want it to run FAST
regardless of the data it returns, and seeing that you mentioned that the query takes 14 minutes to execute, a quick 'hack-fix' would be to use something like below:
SELECT
certs.CertId
, COUNT(cluster.BGTJobId)
FROM
(SELECT TOP 100
certs.CertId
FROM [Eagle].[raw].[certs] certs) certs
INNER JOIN [CentralDB_US_33].[dbo].[JobSkillClusterIndex] cluster
ON certs.BGTJobId = cluster.BGTJobId
GROUP BY cluster.skillClusterId, certs.CertId
Aggregating data (in your case COUNT
) is a very expensive operation and should be done only at the last part of the query on as little data as possible. That is why, for "preview" purposes I have selected onyl the first 100 certificates and made the COUNT
on that data.
However, because you mentioned that the query takes 14 minutes to run, the problems are elsewhere and usually this is due to design (query design, index design or even table design).
You should ask yourself if you really want to go over all of the data in the tables and get all of the matching rows from both tables, and aren't you possibly missing a WHERE
clause?
If you do decide that there is a WHERE
clause needed, are there any indexes to help filtering the data based on the conditions of your WHERE
clause (and even the join columns - certs.BGTJobId
and cluster.BGTJobId
?