1

I have a query that takes 20 minutes to run, even though I have an index for every column in the where clauses, and every column being joined:

SELECT DISTINCT skt.VCDRAWING_REG_NO, skb.NDRAWING_ORG_NO, skb.NDRAWING_ORG_REV_NO, skb.CAPPLY_START_DATE, skb.CAPPLY_END_DATE, skto.*
FROM SPM_ABS_TRANBASE skt
JOIN SPM_ABS_BASE skb
ON skt.NDRAWING_ORG_REV_NO = skb.NDRAWING_ORG_REV_NO
AND skt.NDRAWING_ORG_NO = skb.NDRAWING_ORG_NO
JOIN SPM_ABS_MODEL skm
ON skb.NDRAWING_ORG_REV_NO = skm.NDRAWING_ORG_REV_NO
AND skb.NDRAWING_ORG_NO = skm.NDRAWING_ORG_NO
JOIN SPM_ABS_TRANOPT skto
ON skt.NDRAWING_SYSTEM_NO = skto.NDRAWING_SYSTEM_NO
JOIN ModelImport mi
ON skm.CMODEL = mi.ModelCode
WHERE (skb.CAPPLY_START_DATE <= DATEADD(day, 2, GETDATE()) OR skb.CAPPLY_START_DATE IS NULL)
AND (skb.CAPPLY_END_DATE >= DATEADD(day, -2, GETDATE()) OR skb.CAPPLY_END_DATE IS NULL)

Here is my query plan.

One thing that puzzles me is this: If I add the following WHERE clause, the query returns in about 0.5 seconds:

AND mi.ModelCode = '3FBK5'

Now you're saying, well, duh, of course it gets much faster with that - the thing is, the ModelImport table contains only 351 records. Which means if I were to split up the query above into 351 queries, each with its own where clause for a distinct ModelCode - then I can get 100% of my query results in about 175 seconds, or 2.9 minutes. This is dramatically faster. Which tells me that something in the wide-open query is grossly inefficient, and the query plan is bad.

Here is my query plan with AND mi.ModelCode = '3FBK5' added.

After viewing my query plan, any ideas how I can speed this up?

HerrimanCoder
  • 6,835
  • 24
  • 78
  • 158
  • Are all those indexes on single columns, or are they composite ones on, say, all the columns in a join on? – Shawn Aug 16 '19 at 17:16
  • They are composites in cases where multiple columns are being joined for any given join - and in which a single where clause includes multiple columns. Otherwise they are singles. – HerrimanCoder Aug 16 '19 at 17:21
  • We need CREATE TABLE statements see [how do I generate a CREATE TABLE statement for a given table?](https://stackoverflow.com/questions/21547/in-sql-server-how-do-i-generate-a-create-table-statement-for-a-given-table) .. It's still geussing without those.. – Raymond Nijland Aug 16 '19 at 17:30
  • Also this might be faster when the SQL is rewritten into using `EXISTS(..)` for example as this query is a exploding ("lots" of JOINs adding records) to imploding (DISTINCT removing records) scanning/filtering approach.. See [Why should I provide a Minimal Reproducible Example for a very simple SQL query?](https://meta.stackoverflow.com/questions/333952/why-should-i-provide-a-minimal-reproducible-example-for-a-very-simple-sql-query) – Raymond Nijland Aug 16 '19 at 17:35
  • Where would the EXISTS() be used in the query? Apologies, but there are hundreds of thousands of records, I can't do CREATE statements for all those, and it's by nature of the high volume of the records that contributes to the slowness. – HerrimanCoder Aug 16 '19 at 17:41
  • What is your cost threshold for parallelism and maxdop set to for this server? Can you post the query plan for the `AND mi.ModelCode = '3FBK5'` query? – S3S Aug 16 '19 at 17:58
  • I notice in your query plan that all of the reads are "table scans" rather than index scans. Are you sure the indexes are being used? – Robert Sievers Aug 16 '19 at 19:31
  • scsimon: I edited the original post and posted the query plan including the ModelCode where clause. – HerrimanCoder Aug 17 '19 at 14:47
  • Robert: I cannot explain the table scans. It doesn't make sense. I verified the indexes are there. Why wouldn't they be used? – HerrimanCoder Aug 17 '19 at 14:49

2 Answers2

0

Is it possible that you eliminate some of the joins as you are not selecting anything from those tables or applying any where conditions to those tables, e.g., skm and mi?

Neeraj Agarwal
  • 1,059
  • 6
  • 5
0

Without having the table schema and sizes it's a little hard to give an exact answer, but here are some updates to try.

  • Use group by instead of distinct
  • Don't use * in the select results (particularly with distinct) instead give specific list of columns to return
  • Avoid "or" statements in where clause (maybe use ISNULL instead)

Here is what the query might look like with these updates (though there probably some other columns from skto you would want to add)

SELECT  skt.VCDRAWING_REG_NO, 
        skb.NDRAWING_ORG_NO, 
        skb.NDRAWING_ORG_REV_NO, 
        skb.CAPPLY_START_DATE, 
        skb.CAPPLY_END_DATE, 
        skto.NDRAWING_SYSTEM_NO
FROM    SPM_ABS_TRANBASE skt
        JOIN SPM_ABS_BASE skb ON 
            skt.NDRAWING_ORG_REV_NO = skb.NDRAWING_ORG_REV_NO
            AND skt.NDRAWING_ORG_NO = skb.NDRAWING_ORG_NO
        JOIN SPM_ABS_MODEL skm ON 
            skb.NDRAWING_ORG_REV_NO = skm.NDRAWING_ORG_REV_NO
            AND skb.NDRAWING_ORG_NO = skm.NDRAWING_ORG_NO
        JOIN SPM_ABS_TRANOPT skto ON 
            skt.NDRAWING_SYSTEM_NO = skto.NDRAWING_SYSTEM_NO
        JOIN ModelImport mi ON 
            skm.CMODEL = mi.ModelCode
WHERE   ISNULL(skb.CAPPLY_START_DATE, DATEADD(day, 2, GETDATE())) <= DATEADD(day, 2, GETDATE())
        AND ISNULL(skb.CAPPLY_END_DATE,DATEADD(day, -2, GETDATE())) >= DATEADD(day, -2, GETDATE())
GROUP BY skt.VCDRAWING_REG_NO, 
        skb.NDRAWING_ORG_NO, 
        skb.NDRAWING_ORG_REV_NO, 
        skb.CAPPLY_START_DATE, 
        skb.CAPPLY_END_DATE,
        skto.NDRAWING_SYSTEM_NO
Nate
  • 651
  • 4
  • 7