0

i have a Query optimisation issue.

for the context, this query has always been running instantly

but today it took way more time. (3h+)

so i tried to fix it.

The query is Like -->

Select someCols from  A 
 inner join B left join C   
Where A.date = Today 
And (A.col In ( Select Z.colseekedinA from tab Z)    --A.col is the same column for 
                                          -- than below
       OR
     A.col In ( Select X.colseekedinA from tab X)   

     )


    --    PART 1 ---
  Select someCols from  A 
  inner join B left join C    -- takes 1 second 150 lines
  Where A.date = Today 

   -- Part 2 ---
  Select Z.colseekedinA from tab Z
  OR  -- Union                         -- takes 1 seconds 180 lines
 Select X.colseekedinA from tab X

When i join now the two parts with the In, the query becomes incredibly long.

so i optimized it using union instead or OR and exists instead of in but it still takes 3 minutes

i want to get it done again down to 5 seconds.

do you see some query issue ?

thank you

Badr Erraji
  • 141
  • 1
  • 11
  • 1
    You have 3 queries in your snippet, not one. Your question is vague, as it stands. – Armen Michaeli Apr 30 '19 at 09:55
  • What does `OR -- UNION` mean? Where is the clause for the `OR? `A.date = Today ` isn't valid syntax either. `TODAY` isn't a t-SQL keyword (unless `Today` is a column in `A`, `B`, or `C`, in which case it should aliased). – Thom A Apr 30 '19 at 09:57
  • Post the explain plan. Will help us analyse better – BlackSwan Apr 30 '19 at 09:57
  • Besides the points Larnu made, all of your queries with joins are syntactically incorrect as you're missing the `ON` bit of the joins. – HoneyBadger Apr 30 '19 at 09:58
  • 1
    Also, if the query was fast before, what has changed? – HoneyBadger Apr 30 '19 at 10:00
  • @amn yes ... part 1 and part 2 are the two parts of the query ... taking each 1 second but when i join them with the Col from Part 1 in Col from Part 2 – Badr Erraji Apr 30 '19 at 10:42
  • Please see [paste the plan](https://www.brentozar.com/pastetheplan/instructions/) for a way to include an execution plans in your question. – HABO Apr 30 '19 at 13:05

3 Answers3

1

Using Union and Exists

Select someCols 
from  A 
inner join B on a.col = b.col
left join C  on b.col = c.col
Where A.date = Today 
and exists(
 Select Z.colseekedinA from tab Z where Z.colseekedinA = A.col
 Union                        
 Select X.colseekedinA from tab X where x.colseekedinA = A.col )

Also, if possible change below join to Left join.

inner join B on a.col = b.col 
Rima
  • 1,447
  • 1
  • 6
  • 12
  • thanks for your answer, but as i stated i already : i optimized it using union instead of OR and exists instead of in but it still takes 3 minutes. i'll try the left join ! – Badr Erraji Apr 30 '19 at 10:24
  • My apologies! In the Part 2 of the query both the both the tables refering to same column colseekedinA . Then why two queries are created? – Rima Apr 30 '19 at 10:33
1

The exists approach may give spurious results as you will get rows that do not match either condition just if 1 row does match. This might be avoided by using exists within a correlated subquery but it isn't something I have experimented with enough to recommend.

For speed I'd go for a cross apply and specify the parent table within the cross apply expression (correlated subquery to create a derived table). That way the join condition is specified before the data is returned, if the columns in question have indexes on them (i.e. they are primary keys) then the optimiser can work out an efficient plan for this.

Union all is used within the cross apply expression as this prevents a distinct sort within the derived table which is generally heavier in terms of cost than bringing the data itself back (union has to identify all rows anyway including duplications).

Finally if this is still slow then potentially you might want to add an index to the date column in table a. This overcomes the lack of sargability inherent in a date column and means the optimiser can leverage the index rather than scanning all of the rows in the result set and testing whether or not the date equals today.

Select someCols from  A 
 inner join B left join C   
 cross apply (Select Z.colseekedinA from tab Z where a.col=z.colseekedinA
              union all
              Select X.colseekedinA from tab X where a.col=x.colseekedina) d
Where A.date = Today 
  • The only issue is that your query may return duplicates if both X and Z return the same value. – Alex Apr 30 '19 at 11:27
  • 1
    You can use a distinct clause on the outer query, this will sort the minimum number of rows as the filtering will have already taken place by the time the distinct is called. – Dwight Reynoldson Apr 30 '19 at 11:41
0

You code is confused but for the first part

You could try using a select UNION for the inner subquery ( these with OR ) and avoid the IN clause using a inner JOIN

    Select someCols from  A 
    inner join B 
    left join C   
    INNER JOIN (
        Select Z.colseekedinA from tab Z
        UNION                                   
        Select X.colseekedinA from tab X
    ) t  on A.col = t.colseekedinA 
    Where A.date = Today 
ScaisEdge
  • 131,976
  • 10
  • 91
  • 107