0

I have a table that contains the following columns:

  1. ID
  2. Master ID

The Master ID can be shared between different rows with different IDs.

E.g.:

ID | Master ID
1  |  1
2  |  1
3  |  1
4  |  2

Knowing the ID I want to retrieve all the rows that share the same master ID

I managed to do it using this query:

Select * 
FROM table t
LEFT JOIN table t2
ON t.MASTER_ID = t2.MASTER_ID 
Where t.ID = '1'

Then I also tried using:

Select * 
FROM table t
LEFT JOIN table t2
ON t.MASTER_ID = t2.MASTER_ID and t.ID = '1'

In that case, it was much slower. Can anyone explain why?

jarlh
  • 42,561
  • 8
  • 45
  • 63
Antoine Lefebvre
  • 346
  • 3
  • 10

4 Answers4

3

The queries are doing different things, the first you are saying:

1. give me all rows from `table` where `id = 1`
2. Also give me rows from t2 with a matching master ID

In the second you are saying

1. Give me all rows from `table`
2. Return rows from `t2` with a matching master ID and where `t1.ID = 1`

In a simple example you might have

ID  Master_ID
------------------------
1   1
2   1
3   1
4   2

So your first query will return:

t1.ID   t1.Master_ID    t2.ID   t2.Master_ID
--------------------------------------------
1           1           1       1
1           1           2       1
1           1           3       1

Your second query will return

t1.ID   t1.Master_ID    t2.ID   t2.Master_ID
--------------------------------------------
1           1           1       1
1           1           2       1
1           1           3       1
2           1           NULL    NULL
3           1           NULL    NULL
4           2           NULL    NULL

So basically in the first query you are returning a limited number of rows from your table, whereas in the second you return all rows, but only join to some of them.

GarethD
  • 68,045
  • 10
  • 83
  • 123
  • Good catch :-). I assume this difference between `ON` and `WHERE` clause is only pronounced when there is an `OUTER` join. For `INNER` joins the predicates could be either `ON` or `WHERE`. Is that right? A sqlfiddle would be nice. Thanks. – toddlermenot Sep 14 '15 at 14:46
  • That is correct, with an inner join both queries are equivalent. http://sqlfiddle.com/#!4/9bf52/1 – GarethD Sep 14 '15 at 14:52
  • Thanks for the explanation! So basically the Where clause applies to the first table selected and not to the result of the join, right? I guess I missed this detail... – Antoine Lefebvre Sep 14 '15 at 15:03
  • explanation on this page: http://stackoverflow.com/questions/1018822/inner-join-on-vs-where-clause made it clearer! – Antoine Lefebvre Sep 14 '15 at 15:34
2

If the t.ID = '1' condition is in the WHERE clause the t.ID='1' condition only has to be evaluated for the number of rows in t. If the t.ID='1' condition is put into the ON clause for the join it must be evaluated for all rows in t2. If there are a lot of rows in t2 this can significantly increase the run time of the query.

  • Absolutely, but putting the condition in join on ... means the condition will be evaluated after joins matching operation. (or) before? – Rahul Sep 14 '15 at 14:44
  • Having the condition in the ON clause for the join means the condition will be evaluated for each row in t2. It's evaluated to determine if the t2 row should be joined - so it's evaluated before the row is actually joined. Hope this helps. – Bob Jarvis - Слава Україні Sep 14 '15 at 15:16
1

You shouldn't include t.ID = '1' in JOIN ON condition since it's not the joined table. condition on the table in FROM part should stay in WHERE clause; whereas condition belongs to joined table should be moved to join on clause so to get a proper outer join effect rather a inner join effect.

Rahul
  • 76,197
  • 13
  • 71
  • 125
  • But doesn't the Oracle optimizer detect that? – jarlh Sep 14 '15 at 14:37
  • @jarlh, should be but what optimizer will choose to perform not sure. even if you try to do this same in SQL Server probably same feeling caan be seen. At least I have seen ... not sure whether optimizer will neutralize it or not. – Rahul Sep 14 '15 at 14:40
  • 1
    good observation, but the ON clause is a valid syntax to 1) perform an outer join for ID=1 and 2) preserve all other IDs filled with NULLs for T2. The real reason NOT to use it is that the resulting performance is much worst than for two queries performing 1 and 2 separately. – Marmite Bomber Sep 14 '15 at 18:13
0

You don't need an OUTER JOIN. The reason is simple, your are joining the same table on the same column - there can't be a non-match!

The query to use is therefore

 Select * 
 FROM T
 INNER  JOIN T t2    
 ON T.MasterID = t2.MasterID 
 Where t.ID = 1 
Marmite Bomber
  • 19,886
  • 4
  • 26
  • 53