46

Assuming that I have the following T-SQL code:

SELECT * FROM Foo f
INNER JOIN Bar b ON b.BarId = f.BarId;
WHERE b.IsApproved = 1;

The following one also returns the same set of rows:

SELECT * FROM Foo f
INNER JOIN Bar b ON (b.IsApproved = 1) AND (b.BarId = f.BarId);

This might not be the best case sample here but is there any performance difference between these two?

aF.
  • 64,980
  • 43
  • 135
  • 198
tugberk
  • 57,477
  • 67
  • 243
  • 335
  • 2
    Here's a similar question: http://stackoverflow.com/questions/2509987/which-sql-query-is-faster-filter-on-join-criteria-or-where-clause – Mario S Apr 24 '12 at 11:52
  • 11
    The machine will figure it out and optimize it properly. However, for the humans that will need to debug\modify\support your code years from now, keep the filtering conditions in the `WHERE` and join conditions in the `ON`. – KM. Apr 24 '12 at 11:58
  • @KM. I don't always know how to tell the difference between what's a join condition and a what's a filter. For example [in this answer](http://stackoverflow.com/a/9303069/119477) I think its better in the join so is that a "Join condition" then? [Here's another example](http://stackoverflow.com/a/6473403/119477) which I don't even know how to rewrite the equivalent where clause. – Conrad Frix Apr 24 '12 at 17:55
  • 3
    a join condition is: `tableA.column = tableB.column` a filter condition is `tableA.Column=5`. When doing outer joins (LEFT/RIGHT) you must put the filter conditions within the `ON` or code your `WHERE` in this manner `(tableA.Column=5 OR tableA.Column IS NULL)` – KM. Apr 24 '12 at 18:07

5 Answers5

46

Just be careful of the difference with outer joins. A query where a filter of b.IsApproved (on the right table, Bar) is added to the ON condition of the JOIN:

SELECT * 
FROM Foo f 
LEFT OUTER JOIN Bar b ON (b.IsApproved = 1) AND (b.BarId = f.BarId); 

Is NOT the same as placing the filter in the WHERE clause:

SELECT * 
FROM Foo f 
LEFT OUTER JOIN Bar b ON (b.BarId = f.BarId)
WHERE (b.IsApproved = 1); 

Since for 'failed' outer joins to Bar (i.e. where there is no b.BarId for a f.BarId), this will leave b.IsApproved as NULL for all such failed join rows, and these rows will then be filtered out.

Another way of looking at this is that for the first query, LEFT OUTER JOIN Bar b ON (b.IsApproved = 1) AND (b.BarId = f.BarId) will always return the LEFT table rows, since LEFT OUTER JOIN guarantees the LEFT table rows will be returned even if the join fails. However, the effect of adding (b.IsApproved = 1) to the LEFT OUTER JOIN on condition is to NULL out any right table columns when (b.IsApproved = 1) is false, i.e. as per the same rules normally applied to a LEFT JOIN condition on (b.BarId = f.BarId).

Update: To complete the question asked by Conrad, the equivalent LOJ for an OPTIONAL filter would be:

SELECT * 
FROM Foo f 
LEFT OUTER JOIN Bar b ON (b.BarId = f.BarId)
WHERE (b.IsApproved IS NULL OR b.IsApproved = 1);

i.e. The WHERE clause needs to consider both the condition whether the join fails (NULL) and the filter is to be ignored, and where the join succeeds and the filter must be applied. (b.IsApproved or b.BarId could be tested for NULL)

I've put a SqlFiddle together here which demonstrates the differences between the various placements of the b.IsApproved filter relative to the JOIN.

StuartLC
  • 104,537
  • 17
  • 209
  • 285
  • 1
    Very good point. If you put filter criteria testing data from an outer join into the outer join itself, you'll get more rows than you expect because all Foos will be returned regardless of the status or existence of Bar. When filtering is specified separately from joining, the rows from the two tables are first joined, and then the filter removes the entire row from the table where the criteria is not met. – KeithS Apr 24 '12 at 15:11
  • 1
    @nonnb Ok but if you corrected the WHERE Clause on the 2nd query to `WHERE b.IsApproved = 1 or b.BarId is Null` it is the same. Now which one do you do? – Conrad Frix Apr 24 '12 at 16:51
  • 2
    @nonnn um you wouldn't need `OR (b.BarId IS NULL) ` in the left join version only in the WHERE version and you wanted to make it same. – Conrad Frix Apr 24 '12 at 17:20
33

No, the query optimizer is smart enough to choose the same execution plan for both examples.

You can use SHOWPLAN to check the execution plan.


Nevertheless, you should put all join connection on the ON clause and all the restrictions on the WHERE clause.

aF.
  • 64,980
  • 43
  • 135
  • 198
  • 2
    Beat me to it. Although as a matter of preference, I'd go with the JOIN as it's more descriptive. – Ste Apr 24 '12 at 11:52
  • 2
    Thanks! Imagine a situation with 7 or 8 INNER JOINS. Is your answer applicable to those situations as well? – tugberk Apr 24 '12 at 11:52
  • 14
    @Ste IMO, it's actually more confusing to put everything in the `JOIN`. Use `JOIN` to relate to tables in a query. Use `WHERE` to filter results. It's when you mix the two and use **only** one or the other that queries become hard to read. – Yuck Apr 24 '12 at 11:56
  • @Yuck. Fair point and I would agree on the mix becoming unmanageable. – Ste Apr 24 '12 at 13:18
  • 2
    @Ste: I actually generally prefer the mix of JOIN and WHERE provided the purpose of each keyword is enforced in query writing. The JOIN clauses determine how tables are linked into a "wide" result set, and then the WHERE clause determines the filtering of said results. Given that, I feel it's easier to decipher a query with JOINs and a WHERE than it would be to decipher a query with only JOINs, just as it's easier to decipher JOINs than a query with only a WHERE clause defining both join and filter criteria. – KeithS Apr 24 '12 at 15:07
  • @KeithS: I guess I've never really monitored my usage. I shall report back later and start off a chat about it. :) – Ste Apr 24 '12 at 15:10
6
SELECT * FROM Foo f
INNER JOIN Bar b ON b.BarId = f.BarId
WHERE b.IsApproved = 1;

This is the better form to go. It is easy to read and easy to modify. In the business world this is what you would want to go with. As far as performance they are the same though.

Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
Landin Martens
  • 3,283
  • 12
  • 43
  • 61
  • In the my current situation, I favor the WHERE clause but couldn't avoid wondering if there is a perf diff. Thanks! – tugberk Apr 24 '12 at 11:54
0

I've seem some cases where the optimizer was not smart enough even on recent versions of MSSQL - and the performance difference was monster.

But this is a exception, most of time SQL Server optimizer will solve the problem and get the right plan.

So mantain the policy of using filters on WHERE clause and optimize when needed.

Fabricio Araujo
  • 3,810
  • 3
  • 28
  • 43
0

I just ran a test of a query against four tables - one primary table with three INNER JOINs and a total of four paramters, and compared the execution plans of both approaches (using the filter criteria in the ON of the JOIN, and then also in the WHERE clause).

The execution plans are exactly the same. I ran this on SQL Server 2008 R2.

Jason L
  • 75
  • 1
  • 6