1

As a follow up to In SQL / MySQL, what is the difference between "ON" and "WHERE" in a join statement? and SQL join: where clause vs. on clause - it does matter if a condition is placed in the on-clause vs. the where-clause in an outer join.

However, does it matter which on-clause the condition is placed in when there are multiple outer joins?

For example, could these produce different results?

select * from t1 left join t2 on t1.fid=t2.id and t2.col=val
                 left join t3 on t2.fid=t3.id;

vs:

select * from t1 left join t2 on t1.fid=t2.id
                 left join t3 on t2.fid=t3.id and t2.col=val;
Community
  • 1
  • 1
Arnon Weinberg
  • 871
  • 8
  • 20

3 Answers3

1

Absolutely they are different.

The fisrt query will only have t2 rows that satisfy t2.col=val

The second query will include all t2 rows and only list t3 when t2.col=val

Jasen
  • 11,837
  • 2
  • 30
  • 48
  • Excellent, thanks @Jasen. The way I'm thinking of it is that if you want to filter the left-most table (and on) - use "where", the next table - use the left-most "on", then the next "on", etc. So the order of filtering is: where, on #1, on #2 ... – Arnon Weinberg Dec 29 '14 at 18:04
  • yes, usually `where` only mentions the starting table, if you use columns from other tables in `where` it often implies a `is not null` on the column from that table turning the `left join` into an `inner join`. – Jasen Dec 29 '14 at 19:05
1

The queries are not equivalent. It is easy to construct a counter example:

create table t1 (id int not null, val int not null);
create table t2 (id int not null, val int not null);
create table t3 (id int not null, val int not null);
insert into t1 (id, val) values (1,1);
insert into t2 (id, val) values (1,1);
insert into t3 (id, val) values (1,1);

select * from t1 
left join t2 
    on t1.id = t2.id 
    and t2.val = 2 
left join t3 
    on t2.id = t3.id;
+----+-----+------+------+------+------+
| id | val | id   | val  | id   | val  |
+----+-----+------+------+------+------+
|  1 |   1 | NULL | NULL | NULL | NULL |
+----+-----+------+------+------+------+

select * from t1 
left join t2 
    on t1.id = t2.id 
left join t3 
    on t2.id = t3.id 
    and t2.val = 2;
+----+-----+------+------+------+------+
| id | val | id   | val  | id   | val  |
+----+-----+------+------+------+------+
|  1 |   1 |    1 |    1 | NULL | NULL |
+----+-----+------+------+------+------+
Lennart - Slava Ukraini
  • 6,936
  • 1
  • 20
  • 32
0

This is one of the queries:

select *
from t1 left join
     t2
     on t1.fid = t2.id left join
     t3
     on t2.fid = t3.id and t2.col = val;

Yes, the results are different. If you were using inner join they would be the same, but the left join changes things -- because the join clause does not do any filtering of rows.

I think the simplest explanation is that the join between t1 and t2 will include all rows from t1 as well as all matching rows from t2 -- even those where t2.col <> val. These remain in the result set, because the next left join does not filter them out.

In fact, the condition t2.col = val in the second on clause does not affect which rows are in the result set. If there is a match, then the row from t3 stays based on the first condition. If there is no match, then the row from t3 is still in the result set -- but the t3 columns will be NULL.

In this version:

select *
from t1 left join
     t2
     on t1.fid = t2.id  and t2.col = val left join
     t3
     on t2.fid = t3.id;

The first join gets all rows from t1 and only matching rows from t2 where t2.col = val. The third join can then add more rows.

Note: there are definitely situations where the two queries would return the same results. But, the following data would generate different results (assume val = 0):

t1

fid
1

t2

fid   col
1     0
1     1

t3

id 
1

The query with the condition in the second on clause will return:

1    1    0    1
1    1    1    NULL

The query with the condition in the first on clause will return:

1    1    0    1
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786