1

What is the difference between following two SQL queries

select a.id, a.name, a.country 
from table a 
left join table b on a.id = b.id
where a.name is not null

and

select a.id, a.name, a.country 
from table a 
left join table b on a.id = b.id and a.name is not null
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Developer
  • 817
  • 2
  • 16
  • 28

3 Answers3

1

There is no difference other than the syntax.

daveblake
  • 194
  • 8
  • 3
    While only a short sentence, this is still a valid answer. – sloth Mar 10 '14 at 15:20
  • And a very precise, accurate and relevant answer at that. – Erwin Smout Mar 10 '14 at 15:34
  • its just filtering on the ON condition instead of adding a WHERE filter. no difference of what is being done its just slower to not have a where.. his answer is completely sufficient – John Ruddell Mar 10 '14 at 16:57
  • think of it this way the rows in the table is getting compared to the other tables rows on two conditions (goes through it twice) when joining... instead of going through it once and then looking for non NULL results after the join. thats why its faster – John Ruddell Mar 10 '14 at 17:00
  • 2
    This is NOT a correct answer (though it is common misconception). Typically there will be no noticeable difference between these two commands because the Query Optimizer will rewrite the query to use the better syntax of the two. HOWEVER, there is a fundamental difference to the order in which these two queries will be run. The FROM clause is evaluated before the WHERE and will give radically different execution plans in more complicated queries. To verify, use HINTS to disregard the Optimizer changes and use the EXPLAIN PLAN to see the difference in execution path. –  Mar 10 '14 at 18:33
  • This is interesting - thanks for pointing that out. Correct me if I'm wrong, just been looking back at the query again and I think that even the results would be completely different with the above two queries. Query 1 would not return records where a.name is null. Query 2 would return records where a.name is null because table a is on the left side of the join. If we were selecting a column from table b we would see the results show NULL even if we matched on id. – daveblake Mar 11 '14 at 08:54
1

Base on the following two test result

select a.id, a.name,a.country from table a left join table b
on a.id = b.id
where a.name is not null 

is faster (237 Vs 460). As far as I know, it is a standard.

enter image description here

enter image description here

Win
  • 61,100
  • 13
  • 102
  • 181
-2
  1. Performs the join, then the filter
  2. Performs the filter before the join (better performance)

EDIT: Above was my original answer, below supports it.

4 Tables, 4 left joins, about 500,000 results and the second query runs in half the time. You should choose a larger working set when trying to test out query efficiency ... or factor in CPU load, RAM usage, connection time a few other things. If you are going to have less than 1,000 records or limited traffic optimization will be hard to see or justify. It's simple to test over your working set on a production environment and use what performs better (not just in theory).

http://www.beaudurrant.com/images/sof/22302649.jpg

You can see the differences in both efficiency and results over my datasets. This example is using a mySQL database.

  1. 452,734 records in 420.016 seconds
  2. 452,747 records in 214.334 seconds

Note: Queries were made to run slow on purpose.

Beau
  • 73
  • 5
  • I can post examples of this with 10,000,000 records on 4+ table (INNER) joins... In my case it was faster to add the AND to each join then wait until then end with one WHERE. 30+ seconds faster ... – Beau Mar 10 '14 at 15:06
  • The syntactic appearance of the expression has nothing to do with the strategy that is actually used to compute the result. Never heard of the term "query optimizer" ? – Erwin Smout Mar 10 '14 at 15:32
  • Did either of you care to look at my results? Do you work with large data sets? – Beau Mar 10 '14 at 16:35
  • Erwin - you are incorrect.. the syntactic appearance has EVERYTHING to do with the strategy of the query. The Optimizer simply knows how to write a better query than you do (based on complicated analysis of data and metadata) and so changes the execution plan to write what you meant rather than what you wrote. This is the reason why an execution plan will change over time as the data inside the tables changes. But the order in which you write your query DOES factor in to the optimizers compilations. –  Mar 10 '14 at 18:41
  • Erwin - Is that MS? I do work with mySQL (mainly), Berkeley, SQLite and MSSQL. This example is mySQL although I can show the same result on MSSQL (different tables and data). I enjoy doing things rather than letting MS make decisions for me =). – Beau Mar 10 '14 at 19:08