3

If I'm selecting equal fields from two tables, does it matter which table to SELECT from and which to JOIN from?

I guess I'm really interested in how best to join tables. Are there guidelines for which to SELECT FROM and which to JOIN, or when it doesn't matter?

I'm working on a query that joins about 5 tables and not sure if it matters which I pull from and which I join. Currently I'm using FROM the table my WHERE clause pulls from. I assume this is the correct way, but any explanations or guidelines would be great!

Same applies for LEFT JOIN, as the feedback I see is that the default INNER JOIN does not matter.

For example

SELECT
  a.field1,
  a.field2,
  a.field3,
  b.field1,
  b.field2

choice 1:

FROM 
  a
JOIN
  b ON a.field1 = b.field1
WHERE
  a.field1 = 'abc'

Choice 2:

FROM 
  b
JOIN
  a ON b.field1 = a.field1
WHERE
  a.field1 = 'abc'
d-_-b
  • 21,536
  • 40
  • 150
  • 256
  • 1
    For an INNER JOIN it should not matter and both forms are semantically equivalent .. but query planners can be surprising. I normally put the "primary/parent" table in the FROM for consistency/clarity. – user2864740 Jul 22 '14 at 23:50
  • possible duplicate of [Does the join order matters in SQL?](http://stackoverflow.com/questions/9614922/does-the-join-order-matters-in-sql) – Mike B Jul 22 '14 at 23:50
  • @MikeB My question is focused around the `WHERE` table vs `JOIN` tables. I don't think that questions is the same. – d-_-b Jul 22 '14 at 23:52
  • @d-_-b Still doesn't - semantically - matter for an INNER JOIN. – user2864740 Jul 22 '14 at 23:52
  • thanks @user2864740, that makes sense. what about for left joins? I'll add that to the question – d-_-b Jul 22 '14 at 23:53
  • I think the other question/answers apply. The WHERE can be trivially moved to before the JOIN in the relational algebra in this case (because it only depends upon one table). – user2864740 Jul 22 '14 at 23:55
  • Using `WHERE` in your query that refers to the joined table turns it implicitly into an `INNER JOIN` – Ryan J Jul 22 '14 at 23:56
  • @RyanJ Not necessarily.. – user2864740 Jul 22 '14 at 23:56
  • The OP's question seems to be more related to the usage of the `WHERE` clause, and which table you put there, not necessarily the order of the joins – Ryan J Jul 22 '14 at 23:59
  • 1
    I suspect you might benefit from looking at this post as well. http://stackoverflow.com/questions/354070/sql-join-where-clause-vs-on-clause – Ryan J Jul 23 '14 at 00:01

1 Answers1

3

There is a no functional difference for inner joins, but there is a functional difference for outer joins. (Does the join order matter in SQL?)

^^ A good way of picturing why is this: a left outer join between A and B is the same thing as a right outer join between B and A. Direction is very important to outer joins. In fact that is why you don't have to say the keyword 'outer' in 'left join' or 'right join', the database knows you want an outer join because otherwise there would be no point in specifying direction.

If your question is about performance, yes, there can potentially be an advantage to specifying smaller tables first, and subsequently joining into larger tables (rather than the other way around). However it's not always possible to optimize that way depending on how the tables are related.

Oracle for instance has the LEADING hint which can be used to instruct the database to start with a specified table first even though it's not the first listed in the FROM clause (http://docs.oracle.com/cd/B13789_01/server.101/b10752/hintsref.htm#30459).

Community
  • 1
  • 1
Brian DeMilia
  • 13,103
  • 1
  • 23
  • 33
  • "smaller tables *first*" --- query optimizer is free to reorder them – zerkms Jul 22 '14 at 23:56
  • I have run into situations where performance was significantly improved either by changing the order of the tables in the from clause or by using the leading hint (in an Oracle database). Statistics are not always kept up to date in every database, for one thing – Brian DeMilia Jul 22 '14 at 23:59
  • For a database where statistics aren't kept up-to-date, I'd be less concerned about performance. It could be reasonable to infer that performance is not a priority since statistics may affect queries all through any application. – user2338816 Jul 23 '14 at 00:54