14

I have a problem where I need to find records that either have a measurement that matches a value, or do not have that measurement at all. I solved that problem with three or four different approaches, using JOINs, using NOT IN and using NOT EXISTS. However, the query ended up being extremely slow every time. I then tried splitting the query in two, and they both run very fast (three seconds). But combining the queries using OR takes more than five minutes.

Reading on SO I tried UNION, which is very fast, but very inconvenient for the script I am using.

So two questions:

  1. Why is UNION so much faster? (Or why is OR so slow)?
  2. Is there any way I can force MSSQL to use a different approach for the OR statement that is fast?
CJ Dennis
  • 4,226
  • 2
  • 40
  • 69
Derk Arts
  • 3,432
  • 2
  • 18
  • 36
  • can you show us the sql you are referring to. – Mike C. Mar 12 '13 at 13:01
  • Perhaps show us the statements you tried in full? – Mr47 Mar 12 '13 at 13:01
  • I was hoping we could do without because I would need to rename all the fields and tables. Is this really specific for my query? From what I found on Google this is quite a common problem. – Derk Arts Mar 12 '13 at 13:03
  • @occulus how did I not find that?? – Derk Arts Mar 12 '13 at 13:04
  • you dont need to rename all of the columns, but if you have a specific question you need to create a SQL fiddle. It is of no benefit to the community or yourself without this information. We know nothing about your schema (indexes and such) – Woot4Moo Mar 12 '13 at 13:09
  • This shouldn't be marked as a duplicate because UNION ALL and UNION have different behaviours – samgak Sep 20 '17 at 23:08

2 Answers2

17

The reason is that using OR in a query will often cause the Query Optimizer to abandon use of index seeks and revert to scans. If you look at the execution plans for your two queries, you'll most likely see scans where you are using the OR and seeks where you are using the UNION. Without seeing your query it's not really possible to give you any ideas on how you might be able to restructure the OR condition. But you may find that inserting the rows into a temporary table and joining on to it may yield a positive result.

Also, it is generally best to use UNION ALL rather than UNION if you want all results, as you remove the cost of row-matching.

Matt Whitfield
  • 6,436
  • 3
  • 29
  • 44
  • 1
    And I can't trick it into using index scans using OR? – Derk Arts Mar 12 '13 at 13:24
  • @saratis - You may well be able to, but without seeing your SQL I can't give you a firm answer, really. – Matt Whitfield Mar 12 '13 at 16:19
  • Thanks a million, Matt for your explanation. Also thanks Matt for your question. Had a similar problem with a query on an ORACLE server, and this helps me a lot. – George Aug 08 '13 at 06:37
5

There is currently no way in SQL Server to force a UNION execution plan if no UNION statement was used. If the only difference between the two parts is the WHERE clause, create a view with the complex query. The UNION query then becomes very simple:

SELECT * FROM dbo.MyView WHERE <cond1>
UNION ALL
SELECT * FROM dbo.MyView WHERE <cond2>

It is important to use UNION ALL in this context when ever possible. If you just use UNION SQL Server has to filter out duplicate rows, which requires an expensive sort operation in most cases.

Sebastian Meine
  • 11,260
  • 29
  • 41