SELECT with JOIN where joined row is NULL

Question

I am trying to select rows from a table which don't have a correspondence in the other table.

For this purpose, I'm currently using LEFT JOIN and WHERE joined_table.any_column IS NULL, but I don't think that's the fastest way.

SELECT * FROM main_table mt LEFT JOIN joined_table jt ON mt.foreign_id=jt.id WHERE jt.id IS NULL

This query works, but as I said, I'm looking for a faster alternative.

That is the way to go. Try `explain select ...` to see where you might need to add indexes. — juergen d, Jan 23 '14 at 12:43
possible duplicate of [Can I get better performance using a JOIN or using EXISTS?](http://stackoverflow.com/questions/227037/can-i-get-better-performance-using-a-join-or-using-exists) — Alma Do, Jan 23 '14 at 12:45
@juergend the problem is that my actual query is much longer and much more complex, so it's so slow that the MySQL server times out after 10 minutes. So I'm trying to find optimizations on pieces of the query. — Eduard Luca, Jan 23 '14 at 12:45

score 1 · Accepted Answer · answered Jan 23 '14 at 12:45

Your query is a standard query for this:

SELECT *
FROM main_table mt LEFT JOIN
     joined_table jt
     ON mt.foreign_id=jt.id
WHERE jt.id IS NULL;

You can try this as well:

SELECT mt.*
FROM main_table mt
WHERE not exists (select 1 from joined_table jt where mt.foreign_id = jt.id);

In some versions of MySQL, it might produce a better execution plan.

score 1 · Answer 2 · answered Jan 27 '14 at 20:41

In my experience with MSSQL the syntax used (usually) produces the exact same query plan as the WHERE NOT EXISTS() syntax, however this is mysql, so I can't be sure about performance!!

That said, I'm a much bigger fan of using the WHERE NOT EXISTS() syntax for the following reasons :

it's easier to read. If you speak a bit of English anyone can deduce the meaning of the query
it's more foolproof, I've seen people test for NULL on a NULL-able field
it can't have side effects like 'doubled-records' due to the JOIN. If the referenced field is unique there is no problem, but again I've seen situations where people chose 'insufficient keys' causing the main-table to get multiple hits against the joined table... and off course they solved it again using DISTINCT (aarrgg!!! =)

As for performance, make sure to have a (unique) index on the referenced field(s) and if possible put a FK-relationship between both tables. Query-wise I doubt you can squeeze much more out of it.

My 2 cents.

Thanks. Would it be a huge deal in terms of performance if I have an index on both columns, but no FK relationship? — Eduard Luca, Feb 26 '14 at 13:31
As with a lot of db stuff : the only way to find out is by trying it. — deroby, Feb 26 '14 at 15:41

score 0 · Answer 3 · answered Jan 23 '14 at 19:04

The query that you are running is usually the fastest option, just make sure that you have an index forh both mt.foreign_id and jt.id.

You mentioned that this query is more complex, so it might be possible that the problem is in another part of the query. You should check the execution plan to see what is wrong and fix it.

SELECT with JOIN where joined row is NULL

3 Answers3