27

I'm experiencing a very confusing situation that makes me question all my understanding of joins in SQL Server.

SELECT t1.f2 
FROM   t1 
LEFT JOIN t2 
ON t1.f1 = t2.f1 AND cond2 AND t2.f3 > something 

Does not give the same results as :

SELECT t1.f2 
FROM   t1 
LEFT JOIN t2 
ON t1.f1 = t2.f1 AND cond2 
WHERE  t2.f3 > something 

Can please someone help by telling if this two queries are supposed to be equivalent or not?

Thx

John Woo
  • 258,903
  • 69
  • 498
  • 492
Kamel Keb
  • 373
  • 1
  • 4
  • 9
  • 3
    The `ON` reduces the set of rows that are joined, whereas the `WHERE` reduces the set of rows that are added to the result set... – Lucas Mar 29 '13 at 15:21
  • 3
    They are not equivalent. Your first query filters the results in the join; the second filters *after*. – Brad M Mar 29 '13 at 15:21
  • I'm not sure questions are equivalent because here, the condition is expressed upon the second table.. – Kamel Keb Mar 29 '13 at 15:32

5 Answers5

40

The on clause is used when the join is looking for matching rows. The where clause is used to filter rows after all the joining is done.

An example with Disney toons voting for president:

declare @candidates table (name varchar(50));
insert @candidates values 
    ('Obama'), 
    ('Romney');
declare @votes table (voter varchar(50), voted_for varchar(50));
insert @votes values 
    ('Mickey Mouse', 'Romney'),
    ('Donald Duck', 'Obama');

select  *
from    @candidates c
left join    
        @votes v
on      c.name = v.voted_for
        and v.voter = 'Donald Duck'

This still returns Romney even though Donald didn't vote for him. If you move the condition from the on to the where clause:

select  *
from    @candidates c
left join    
        @votes v
on      c.name = v.voted_for
where   v.voter = 'Donald Duck'

Romney will no longer be in the result set.

Andomar
  • 232,371
  • 49
  • 380
  • 404
  • 3
    Note that this example is using an OUTER join, which is the case in which these 2 code snippets are not functionally equivalent. – Brian Leeming Oct 11 '16 at 13:39
  • I was looking for something completely different, but somehow google lead me here and your fist two sentences saved my day! Thanks +1! – ksno Dec 01 '16 at 11:30
22

Both are literally different.

The first query does the filtering of table t2 before the joining of tables take place. So the results will then be join on table t1 resulting all the records of t1 will be shown on the list.

The second one filters from the total result after the joining the tables is done.


Here's an example

Table1

ID   Name
1    Stack
2    Over 
3    Flow

Table2

T1_ID   Score
1       10
2       20
3       30

In your first query, it looks like this,

SELECT  a.*, b.Score
FROM    Table1 a
        LEFT JOIN Table2 b
           ON a.ID = b.T1_ID AND
              b.Score >= 20

What it does is before joining the tables, the records of table2 are filtered first by the score. So the only records that will be joined on table1 are

T1_ID   Score
2       20
3       30

because the Score of T1_ID is only 10. The result of the query is

ID   Name    Score
1    Stack   NULL
2    Over    20
3    Flow    30

While the second query is different.

SELECT  a.*, b.Score
FROM    Table1 a
        LEFT JOIN Table2 b
           ON a.ID = b.T1_ID
WHERE   b.Score >= 20

It joins the records first whether it has a matching record on the other table or not. So the result will be

ID   Name    Score
1    Stack   10
2    Over    20
3    Flow    30

and the filtering takes place b.Score >= 20. So the final result will be

ID   Name    Score
2    Over    20
3    Flow    30
John Woo
  • 258,903
  • 69
  • 498
  • 492
1
CREATE TABLE Company
(
CompanyId TinyInt Identity Primary Key,
CompanyName Nvarchar(50) NULL
)
GO

INSERT Company VALUES('DELL')
INSERT Company VALUES('HP')
INSERT Company VALUES('IBM')
INSERT Company VALUES('Microsoft')
GO

CREATE TABLE Candidate
(
CandidateId tinyint identity primary key,
FullName nvarchar(50) NULL,
CompanyId tinyint REFERENCES Company(CompanyId)
)
GO

INSERT Candidate VALUES('Ron',1)
INSERT Candidate VALUES('Pete',2)
INSERT Candidate VALUES('Steve',3)
INSERT Candidate VALUES('Steve',NULL)
INSERT Candidate VALUES('Ravi',1)
INSERT Candidate VALUES('Raj',3)
INSERT Candidate VALUES('Kiran',NULL)
GO

SELECT * from Company c
SELECT * from Candidate c

-- A simple left outer Join
SELECT * FROM Company c LEFT OUTER JOIN Candidate c2
ON c.CompanyId = c2.CompanyId

--Left Outer Join ON and AND condition fetches 5 rows wtih NULL value from right side table 
SELECT * FROM Company c LEFT OUTER JOIN Candidate c2
ON c.CompanyId = c2.CompanyId
AND c.CompanyName = 'DELL' 

--Left Outer Join ON and where clause fetches only required rows
SELECT * FROM Company c LEFT OUTER JOIN Candidate c2
ON c.CompanyId = c2.CompanyId
AND c.CompanyName = 'DELL' 
WHERE c.CompanyName='IBM'
Vishwanath Dalvi
  • 35,388
  • 41
  • 123
  • 155
0

In the first case, results in t2 is filtered as part of the join.

In the second case, there could be more rows available from t2.

Essentially, the set of records joined in the two queries will not be the same.

Ryan
  • 26,884
  • 9
  • 56
  • 83
0

It does make a difference because in second case you are applying the where AFTER it does the left join

Brian
  • 1,337
  • 5
  • 17
  • 34