11

In SQL (MSSQL, Oracle, etc., whatever), when joining tables, what is the gain from adding a filter to the JOIN statement instead of having it in the WHERE clause?

i.e.

SELECT * FROM X INNER JOIN Y ON X.A = Y.A WHERE X.B = 'SOMETHING'

versus

SELECT * FROM X INNER JOIN Y ON X.A = Y.A AND X.B = 'SOMETHING'

I realize that this does not work in all cases, but I've noticed that in some cases there appears to be a performance gain by putting the filter criteria in the JOIN statement. However, since it's a part of the JOIN statement, it can also cause it to behave a little strangely.

Thoughts?

CodeMonkey1313
  • 15,717
  • 17
  • 76
  • 109

6 Answers6

5

I sometimes do this in queries that have a lot of joins because it localises all the information about the join in one part of the query rather than having some in the join condition and some in the where clause.

Rich
  • 3,095
  • 17
  • 17
5

For INNER JOIN queries, the performance characteristics of these filters will depend on many factors - the size of the tables, indexing, the selectivity of the query, and other factors specific to the RDBMS on which the query is executed.

In LEFT and RIGHT OUTER JOIN, the position of the filter matters much more than INNER JOIN, since affects whether it will be applied before (JOIN clause) or after (WHERE clause) the join is carried out.

Ed Harper
  • 21,127
  • 4
  • 54
  • 80
  • 2
    In other words, including the filter in the ON clause of an OUTER JOIN might (will) produce different results from including it in the WHERE clause. – APC Dec 15 '09 at 13:21
  • 1
    Including an equality filter on the `WHERE` clause of the `OUTER JOIN` effectively makes it an `INNER JOIN`. – Quassnoi Dec 15 '09 at 14:05
3

For an INNER JOIN, I would not expect a performance difference, but rather that the same plan would be used whether the filter was in the JOIN...ON clause or the WHERE clause. I personally prefer to use write the join criteria in the JOIN clause and the filtering in the WHERE clause- a sort of way to stick all the "parameters" to the SQL statement in the same place- this isn't necessarily sensible or well-thought-out. Conversely, some people like to have everything in the JOIN clause to keep everything together.

The situation with outer joins is different- there is a significant difference between "a LEFT OUTER JOIN b ON a.a_id=b.a_id AND b.type = 1" and "a LEFT OUTER JOIN b ON a.a_id=b.a_id WHERE b.type=1"- in fact the latter is implicitly forcing an inner join. This would be another reason to put all such conditions in the JOIN clause, for consistency.

araqnid
  • 127,052
  • 24
  • 157
  • 134
2

These syntaxes are synonymous and are optimized to the same thing by most RDBMS.

I usually prefer this syntax:

SELECT  *
FROM    X
INNER JOIN
        Y
ON      X.A = Y.A
WHERE   X.B = 'SOMETHING'

when B is not a part of the logical link between A and B, and this one:

SELECT  *
FROM    X
INNER JOIN
        Y
ON      X.A = Y.A
        AND X.B = 'SOMETHING'

when it is.

Quassnoi
  • 413,100
  • 91
  • 616
  • 614
0

Nothing, except clarity and meaning. Unless you have outer joins.

gbn
  • 422,506
  • 82
  • 585
  • 676
0

As a human (rather than an optimizer) myself, when maintaining a query I would look for a join condition in the JOIN clause and a search condition in the WHERE clause.

Of course, you need to strike a balance between performance issues and code maintenance issues. However, my first priority is good logical code in the first instance then optimize as necessary.

onedaywhen
  • 55,269
  • 12
  • 100
  • 138