Most efficient way to return entries with no relationship to another table in MySQL?

Question

I'm developing a table reservation system, and would like to list all Restaurants that have not received any Reservation yet.

I've thought of the following approaches:

Using NOT IN():

SELECT * FROM Restaurant
WHERE id NOT IN(SELECT restaurantId FROM Reservation);

Using LEFT JOIN:

SELECT rst.* FROM Restaurant rst
LEFT JOIN Reservation rvn ON rvn.restaurantId = rst.id
WHERE rvn.id IS NULL;

I would be favouring the second approach for efficiency.

Is there a more efficient approach than these two?

Run `explain select ...` to see which is faster. Having many restaurants could make the `in` clause slow, but you have to see for yourself. You can also use `distinct`: `... id NOT IN(SELECT DISTINCT restaurantId FROM Reservation)` — juergen d, Jan 07 '14 at 16:42
Here is a nice explanation http://stackoverflow.com/questions/2577174/join-vs-sub-query — Abhik Chakraborty, Jan 07 '14 at 16:51
'explain select ...' DOES NOT TELL YOU WHICH IS FASTER it tells you what the opimizer thought was the best way to resolve a query. — symcbean, Jan 07 '14 at 16:56

score 2 · Answer 1 · answered Jan 07 '14 at 16:48

joins in MySQL are generally a good choice for performance. It is possible that a not exists would be the fastest method in this case:

SELECT *
FROM Restaurant r
WHERE not exists (select 1 from Reservation rv where rv.restaurantId = r.id);

For optimal performance, you need an index on Reservation(restaurantId).

score 1 · Accepted Answer · edited May 23 '17 at 10:25

For simple queries like the one in your example, the query optimizer will be able to resolve either query with the same execution plan:

If you use LEFT JOIN to find rows that do not exist in some table and you have the following test: col_name IS NULL in the WHERE part, where col_name is a column that is declared as NOT NULL, MySQL stops searching for more rows (for a particular key combination) after it has found one row that matches the LEFT JOIN condition.

In most cases, the optimizer is even able to use the smart trick that Gordon suggested previously:

Consider the following subquery comparison:

outer_expr IN (SELECT inner_expr FROM ... WHERE subquery_where)

[Provided that] an appropriate equality [is pushed down] into the subquery's WHERE clause (...) the comparison is converted to this:

EXISTS (SELECT 1 FROM ... WHERE subquery_where AND outer_expr=inner_expr)

My point is, I would recommend using the for you are most used to, and only try other options if you ever encountered an actual performance issue here.

Note: I am not advising *not* to try to dig deeper into the question :) Just because it is *unnecessary* does not make the question less interesting. — RandomSeed, Jan 07 '14 at 17:33

Most efficient way to return entries with no relationship to another table in MySQL?

2 Answers2