36

In SQL I am trying to filter results based on an ID and wondering if there is any logical difference between

SELECT value 
FROM table1 
JOIN table2 ON table1.id = table2.id 
WHERE table1.id = 1

and

SELECT value 
FROM table1 
JOIN table2 ON table1.id = table2.id AND table1.id = 1

To me, it seems as if the logic is different though you will always get the same set of results but I wondered if there were any conditions under which you would get two different result sets (or would they always return the exact same two result sets)

davejal
  • 6,009
  • 10
  • 39
  • 82
warsong
  • 1,008
  • 2
  • 12
  • 21
  • 4
    Simple **NO** difference – Pரதீப் Dec 24 '15 at 15:26
  • 11
    Try left joins, then you'll start seeing differences. – sstan Dec 24 '15 at 15:27
  • 3
    Have you tried a showplan to see the path taken for both? – scrappedcola Dec 24 '15 at 15:27
  • 5
    Possible duplicate of [Which SQL query is faster? Filter on Join criteria or Where clause?](http://stackoverflow.com/questions/2509987/which-sql-query-is-faster-filter-on-join-criteria-or-where-clause) – HoneyBadger Dec 24 '15 at 15:31
  • 4
    @HoneyBadger it's not really a duplicate though. that's about performance, this is about the logic. – Drew Kennedy Dec 24 '15 at 15:32
  • @Drew: Though the question is slightly different, the answer is perfectly applicable here. – sstan Dec 24 '15 at 15:33
  • 1
    Also, that duplicate is just the first I found, there are many, many more... – HoneyBadger Dec 24 '15 at 15:34
  • 2
    On an inner join the results will be the same. Using outer joins the results will vary. When additional critiera is placed on the join, it is applied BEFORE The join occurs. Thus in the case of `FROM A LEFT JOIN B On A.ID = B.ID and B.Value = 'Blue'` the left join is kept as the null values that would normally exist will still exist; but move B.Value='Blue to the where clause and now your LEFT join has basically become an inner join, as the NULL values will not equal 'blue' so they will be eliminated. – xQbert Dec 24 '15 at 15:36
  • 1
    @user3219946 is your question about this specific example or all joins in general? For this result sets will always be the same, but that is not true for outer joins. – xQbert Dec 24 '15 at 15:43
  • 1
    If you find it more *logical* to think of the join as including the extra condition it might make sense to move it from the `where`. Probably not going to be much consensus on that stylistically though. – shawnt00 Dec 24 '15 at 15:49
  • 1
    Inner joins: No difference. Other joins: There can be a difference – Mitch Wheat Apr 29 '18 at 10:53
  • `x inner join y on c` is defined in the SQL standard to be `x, y where c` – philipxy Apr 13 '20 at 09:08
  • Does this answer your question? [INNER JOIN ON vs WHERE clause](https://stackoverflow.com/questions/1018822/inner-join-on-vs-where-clause) – philipxy Apr 13 '20 at 19:54
  • My 1st comment is for other people who land here (sorry to disturb you). (Observe that the accepted answer doesn't justify "no difference" or explain what its terms mean.) The 2nd was automatically generated when I voted this as a duplicate (which is also for others but will get moved to a system-generated preface on closure as a duplicate). – philipxy Apr 13 '20 at 21:37

3 Answers3

52

The answer is NO difference, but:

I will always prefer to do the following.

  • Always keep the Join Conditions in ON clause
  • Always put the filter's in where clause

This makes the query more readable.

So I will use this query:

SELECT value
FROM table1
INNER JOIN table2
        ON table1.id = table2.id
WHERE table1.id = 1

However when you are using OUTER JOIN'S there is a big difference in keeping the filter in the ON condition and Where condition.

Logical Query Processing

The following list contains a general form of a query, along with step numbers assigned according to the order in which the different clauses are logically processed.

(5) SELECT (5-2) DISTINCT (5-3) TOP(<top_specification>) (5-1) <select_list>
(1) FROM (1-J) <left_table> <join_type> JOIN <right_table> ON <on_predicate>
| (1-A) <left_table> <apply_type> APPLY <right_table_expression> AS <alias>
| (1-P) <left_table> PIVOT(<pivot_specification>) AS <alias>
| (1-U) <left_table> UNPIVOT(<unpivot_specification>) AS <alias>
(2) WHERE <where_predicate>
(3) GROUP BY <group_by_specification>
(4) HAVING <having_predicate>
(6) ORDER BY <order_by_list>;

Flow diagram logical query processing

Enter image description here

  • (1) FROM: The FROM phase identifies the query’s source tables and processes table operators. Each table operator applies a series of sub phases. For example, the phases involved in a join are (1-J1) Cartesian product, (1-J2) ON Filter, (1-J3) Add Outer Rows. The FROM phase generates virtual table VT1.

  • (1-J1) Cartesian Product: This phase performs a Cartesian product (cross join) between the two tables involved in the table operator, generating VT1-J1.

  • (1-J2) ON Filter: This phase filters the rows from VT1-J1 based on the predicate that appears in the ON clause (<on_predicate>). Only rows for which the predicate evaluates to TRUE are inserted into VT1-J2.
  • (1-J3) Add Outer Rows: If OUTER JOIN is specified (as opposed to CROSS JOIN or INNER JOIN), rows from the preserved table or tables for which a match was not found are added to the rows from VT1-J2 as outer rows, generating VT1-J3.
  • (2) WHERE: This phase filters the rows from VT1 based on the predicate that appears in the WHERE clause (). Only rows for which the predicate evaluates to TRUE are inserted into VT2.
  • (3) GROUP BY: This phase arranges the rows from VT2 in groups based on the column list specified in the GROUP BY clause, generating VT3. Ultimately, there will be one result row per group.
  • (4) HAVING: This phase filters the groups from VT3 based on the predicate that appears in the HAVING clause (<having_predicate>). Only groups for which the predicate evaluates to TRUE are inserted into VT4.
  • (5) SELECT: This phase processes the elements in the SELECT clause, generating VT5.
  • (5-1) Evaluate Expressions: This phase evaluates the expressions in the SELECT list, generating VT5-1.
  • (5-2) DISTINCT: This phase removes duplicate rows from VT5-1, generating VT5-2.
  • (5-3) TOP: This phase filters the specified top number or percentage of rows from VT5-2 based on the logical ordering defined by the ORDER BY clause, generating the table VT5-3.
  • (6) ORDER BY: This phase sorts the rows from VT5-3 according to the column list specified in the ORDER BY clause, generating the cursor VC6.

it is referred from book "T-SQL Querying (Developer Reference)"

Pரதீப்
  • 91,748
  • 19
  • 131
  • 172
  • 2
    Same and agreed! Also, be _specific_ with your joins. Ie. use `INNER` or `LEFT-or-RIGHT OUTER` when defining a join, IMO. – Pure.Krome Dec 24 '15 at 15:32
  • 2
    While I agree with this SPECIFIC question. OUTER joins have a vast difference in where limiting criteria is placed when the criteria is on the limited matching record set. Filter can go in the on clause and must in some cases when using outer joins to achieve the desired results. or you end up with where clauses (unlike santa clauses) like where `(val='blue' or val is null)` which can also produce incorrect results at times especially if NULL has specific meaning beyond that generated in the join. – xQbert Dec 24 '15 at 15:39
  • 4
    While on the topic of **readability**: I'd also recommend to be **consistent** with your keywords - either put them all in UPPERCASE, or all in lowercase, or all in a mixed case - I don't care which - but just be **CONSISTENT** - don't use one style (`SELECT`, `FROM`) mixed in with a second style ( `on`, `where`) in the same query...... makes it **really** hard to read that T-SQL code.... Pick one style - whichever you like - but then **STICK** to it and be consistent – marc_s Dec 24 '15 at 15:46
  • Interestingly the pre-ANSI-92 SQL syntax **for inner join** provides no space for the above discussions... – Marmite Bomber Dec 24 '15 at 15:57
  • @StanShaw - Seems like someone found it is not useful :) – Pரதீப் Feb 16 '16 at 16:29
  • @VR46 unless it's Bill Gates I call BS – Stan Shaw Feb 16 '16 at 17:11
  • Note that the above diagram and the step descriptions come from Itzik Ben-Gan's **excellent** T-SQL Querying book [link] (https://www.amazon.com/T-SQL-Querying-Developer-Reference-Ben-Gan-ebook/dp/B00TPRWVHY/ref=sr_1_3?crid=2XI02HK3OH3RA&dchild=1&keywords=itzik+ben-gan&qid=1586558608&sprefix=itzik+be%2Caps%2C207&sr=8-3)- verbatim. Thanga referred to a blog for this but the poster of that blog never credited the source. That blog doesn't even have author info. Shady if you ask me. – Concerned_Citizen Apr 10 '20 at 22:48
  • @Concerned_Citizen - Apologies, I got the link from google search while I started learning Sql Server. Never know that, it is copied from somewhere else. I'll update the link. – Pரதீப் Apr 13 '20 at 05:43
  • Please: Put everthing that you are quoting in a block quote, give credit to the author in text not just al link to elsewhere & cut out the old unnecessary content, don't strike out. – philipxy Apr 13 '20 at 09:14
22

While there is no difference when using INNER JOINS, as VR46 pointed out, there is a significant difference when using OUTER JOINS and evaluating a value in the second table (for left joins - first table for right joins). Consider the following setup:

DECLARE @Table1 TABLE ([ID] int)
DECLARE @Table2 TABLE ([Table1ID] int, [Value] varchar(50))

INSERT INTO @Table1
VALUES
(1),
(2),
(3)

INSERT INTO @Table2
VALUES
(1, 'test'),
(1, 'hello'),
(2, 'goodbye')

If we select from it using a left outer join and put a condition in the WHERE clause:

SELECT * FROM @Table1 T1
LEFT OUTER JOIN @Table2 T2
    ON T1.ID = T2.Table1ID
WHERE T2.Table1ID = 1

We get the following results:

ID          Table1ID    Value
----------- ----------- --------------------------------------------------
1           1           test
1           1           hello

This is because the where clause limits the result set, so we are only including records from Table1 that have an ID of 1. However, if we move the condition to the ON clause:

SELECT * FROM @Table1 T1
LEFT OUTER JOIN @Table2 T2
    ON T1.ID = T2.Table1ID
    AND T2.Table1ID = 1

We get the following results:

ID          Table1ID    Value
----------- ----------- --------------------------------------------------
1           1           test
1           1           hello
2           NULL        NULL
3           NULL        NULL

This is because we are no longer filtering the result-set by the Table1's ID of 1 - rather we are filtering the JOIN. So, even though Table1's ID of 2 DOES have a match in the second table, it's excluded from the join - but NOT the result-set (hence the null values).

So, for inner joins it doesn't matter, but you should keep it in the where clause for readability and consistency. However, for outer joins, you need to be aware that it DOES matter where you put the condition as it will impact your result-set.

Audwin Oyong
  • 2,247
  • 3
  • 15
  • 32
Stan Shaw
  • 3,014
  • 1
  • 12
  • 27
3

I think answer marked as "right" is not right. Why? I try explain:

We have opinion

"Always keep the Join Conditions in ON clause Always put the filter's in where clause"

And this is wrong. If you are in inner join, every time put filter params in ON clause, not in where. You ask why? Try imagine complex query with total of 10 tables(f.e. every table has 10k recs) join, with complex WHERE clause(for example, functions or calculations used). If you put filtering criteria in ON clause, JOINS between these 10 tables not occurs, WHERE clause will not executed at all. At this case you are not performing 10000^10 calculations in WHERE clause. This make sense, not putting filtering params in WHERE clause only.

Juozas
  • 916
  • 10
  • 17
  • 2
    Databases that optimize queries could move the WHERE conditions into ON if performance is better there. In MySQL, "EXPLAIN " gives me the same output whether the condition is in WHERE or ON. – damjan Jun 08 '18 at 12:03