JOIN or Correlated subquery with exists clause, which one is better

Question

select * 
from ContactInformation c 
where exists (select * from Department d where d.Id = c.DepartmentId )

select * 
from ContactInformation c 
inner join Department d on c.DepartmentId = d.Id

Both the queries give out the same output, which is good in performance wise join or correlated sub query with exists clause, which one is better.

Edit :-is there alternet way for joins , so as to increase performance:- In the above 2 queries i want info from dept as well as contactinformation tables

score 5 · Accepted Answer · answered Jul 22 '10 at 05:08

Generally, the EXISTS clause because you may need DISTINCT for a JOIN for it to give the expected output. For example, if you have multiple Department rows for a ContactInformation row.

In your example above, the SELECT *:

means different output too so they are not actually equivalent
less chance of a index being used because you are pulling all columns out

Saying that, even with a limited column list, they will give the same plan: until you need DISTINCT... which is why I say "EXISTS"

Edit :-is there alternet way for joins , so as to increase performance:- In the above 2 queries i want info from dept as well as contactinformation tables — hrishi, Jul 24 '10 at 06:02

score 4 · Answer 2 · edited Aug 03 '19 at 11:06

You need to measure and compare - there's no golden rule which one will be better - it depends on too many variables and things in your system.

In SQL Server Management Studio, you could put both queries in a window, choose Include actual execution plan from the Query menu, and then run them together.

You should get a comparison of both their execution plans and a percentage of how much of the time was spent on one or the other query. Most likely, both will be close to 50% in this case. If not - then you know which of the two queries performs better.

You can learn more about SQL Server execution plans (and even download a free e-book) from Simple-Talk - highly recommended.

score 2 · Answer 3 · answered Jul 22 '10 at 13:33

I assume that either you meant to add the DISTINCT keyword to the SELECT clause in your second query (or, less likely, a Department has only one Contact).

First, always start with 'logical' considerations. The EXISTS construct is arguably more intuitive so, all things 'physical' being equal, I'd go with that.

Second, there will be one day when you will need to ports this code, not necessarily to a different SQL product but, say, the same product but with a different optimizer. A decent optimizer should recognise that both are equivalent and come up with the same ideal plan. Consider that, in theory, the EXISTS construct has slightly more potential to short circuit.

Third, test it using a reasonably large data set. If performance isn't acceptable, start looking at the 'physical' considerations (but I suggest you always keep your 'logically-pure' code in comments for the forthcoming day when the perfect optimizer arrives :)

score 1 · Answer 4 · answered Jul 22 '10 at 05:09

Your first query should output Department columns, while the second one should not.

If you're only interested in ContactInformation, these queries are equivalent. You could run them both and examine the query execution plan to see which one runs faster. For example, on MYSQL, where exists is more efficient with nullable columns, while inner join performs better if neither column is nullable.

JOIN or Correlated subquery with exists clause, which one is better

4 Answers4

Linked