Two queries return the same result.
How can I determine which query performs better:
a. before execution
b. after execution?
Example 1: the following queries return the difference between two heaps
( SELECT * FROM TABLE1 EXCEPT
SELECT * FROM TABLE2
)
UNION ALL
( SELECT * FROM TABLE2 EXCEPT
SELECT * FROM TABLE1
)
;
2nd approach (and 3rd, with UNION ALL instead of UNION):
( SELECT * FROM TABLE1 UNION --ALL --?!
SELECT * FROM TABLE2
)
EXCEPT
( SELECT * FROM TABLE2 INTERSECT
SELECT * FROM TABLE1
)
;
Which approach is better?
Or does it depend (advantages/disadvantages)?
Example 2: almost the same, but additionally returns a column naming the source table
SELECT 'TABLE1-ONLY' AS SRC, T1.*
FROM (
SELECT * from TABLE1
EXCEPT
SELECT * from TABLE2
) AS T1
UNION ALL
SELECT 'TABLE2-ONLY' AS SRC, T2.*
FROM (
SELECT * from TABLE2
EXCEPT
SELECT * from TABLE1
) AS T2
;
2nd approach (and 3rd, with UNION ALL instead of UNION):
( SELECT SRC='TABLE1-ONLY', * FROM TABLE1 UNION --ALL --?!
SELECT SRC='TABLE2-ONLY', * FROM TABLE2
)
EXCEPT
( SELECT * FROM (select SRC='TABLE1-ONLY' UNION ALL
select SRC='TABLE2-ONLY'
) s
,(SELECT * FROM TABLE2 INTERSECT
SELECT * FROM TABLE1
) i
);
Which approach is better?
Or does it depend (advantages/disadvantages)?
NB! The following questions are outside the scope of this topic:
- How to define a primary key for the table
- How to identify duplicates in the table
P.S.
- The PRIMARY question is about performance comparison.
- Solution improvement is secondary.
- The results with set statistics io on; are almost the same.
- set statistics time on; could provide different results after each run.
- I don't see a big difference in the execution plans (but I'm not a professional in this field).
- I have a feeling that the queries differ in performance, but no knowledge to back it up. If someone has that knowledge, please share. ;)
- Could you please specify scenarios when one solution is definitely better than the other one and vice versa?
- Regarding the 2nd vs the 3rd approach: where does duplicate removal work faster, within UNION or within EXCEPT/INTERSECT? And if duplicates are removed during UNION, will EXCEPT/INTERSECT search for them again?
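For reference, the before/after comparison asked about at the top could be set up roughly as follows. This is only a sketch under assumptions: it assumes SQL Server with a client that understands the GO batch separator (SSMS or sqlcmd), and it reuses the first query from Example 1 as the statement under test. It shows how to collect the numbers, not which approach wins.

```sql
-- Before execution: ask for the estimated plan only.
-- While SHOWPLAN_XML is ON, statements are compiled but NOT executed.
-- SET SHOWPLAN_XML must be the only statement in its batch, hence the GO.
SET SHOWPLAN_XML ON;
GO
( SELECT * FROM TABLE1 EXCEPT
  SELECT * FROM TABLE2
)
UNION ALL
( SELECT * FROM TABLE2 EXCEPT
  SELECT * FROM TABLE1
);
GO
SET SHOWPLAN_XML OFF;
GO

-- After execution: run the query for real and collect I/O and timing figures.
-- Logical reads from STATISTICS IO are usually stable between runs;
-- elapsed time from STATISTICS TIME varies with caching and server load.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO
( SELECT * FROM TABLE1 EXCEPT
  SELECT * FROM TABLE2
)
UNION ALL
( SELECT * FROM TABLE2 EXCEPT
  SELECT * FROM TABLE1
);
GO
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

Repeating the second half for each candidate query, several times each, gives figures that can be compared side by side; the timing numbers are only meaningful as a trend over multiple runs.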