mysql - "INNER JOIN" or "IN". Which is faster?

Question

could you please tell me which one of these two sentences is faster?

1st sentence:

SELECT DISTINCT(t1.user_id) FROM table_users as t1 WHERE t1.user_town IN (SELECT t2.town FROM table_towns as t2)

2nd sentence:

SELECT DISTINCT(t1.user_id) FROM table_users as t1 
  INNER JOIN (SELECT t2.town FROM table_towns as t2) as t3 ON t1.user_town = t3.town

The Selects I use are usually a bit more complex. And when possible I try to use IN (...) because I can pass a value or a table select.

I've changed the result from SELECT t1.* to SELECT DISTINCT(t1.user_id) to avoid different results. Since it is usually what I really care — Seeker, Oct 25 '11 at 17:29

score 2 · Accepted Answer · answered Oct 25 '11 at 16:38

They both look fairly similar to my eyes. I'm pretty sure that a single SELECT will provide better performance, given that MySQL is not particularly good with subqueries:

SELECT t1.*
FROM table_users t1 
INNER JOIN table_towns t2 ON t1.user_town = t2.town

In any case, you should see what the EXPLAIN plan for the query has to say and perform some benchmarking on live data.

score 0 · Answer 2 · answered Oct 25 '11 at 16:36

0

You should measure it, and issue an explain plan to be sure. But I would use neither of those:

select u.* from table_users as u
inner join table_towns as t on u.user_town = t.town

answered Oct 25 '11 at 16:36

JB Nizet

678,734
91
1,224
1,255

score 0 · Answer 3 · answered Oct 25 '11 at 16:43

0

IN is killing for MySQL. Rather use EXISTS.

INNER JOIN can be pretty fast, but the results are not necessarily the same. You may need grouping or distinct to get the same (or similar) results, though in this case I think you're safe.

If using an inner join, don't make it a subselect, because MySQL doesn't handle those well either. Just join on the table_towns table itself.

answered Oct 25 '11 at 16:43

GolezTrol

114,394
18
182
210

That's no longer true for the lastest MySQL versions. `IN (select ....)` has been optimized. – Johan Oct 25 '11 at 18:24
Maybe not killing anymore, but still much slower in comparison. – GolezTrol Oct 26 '11 at 05:57
@Johan - What version is this effective from? Would be interested to hear (**from actual testing not just what you believe to be true!**) whether the huge disparity between `=` and `IN` that [is apparent here](http://stackoverflow.com/questions/3416076/this-select-query-takes-180-seconds-to-finish/3417098#3417098) is fixed. Mark Byer's answer I link to in the comments there says this isn't scheduled to be fixed until version 6.0 – Martin Smith Oct 26 '11 at 17:51

score 0 · Answer 4 · answered Oct 25 '11 at 16:44

There's no definite answer to your question. It really depends on what are the indexes on the tables and whether the DBMS you are using can use them or not. I've had instances where an IN reduces the execution time compared to INNER JOIN and instances where INNER JOIN is faster.

Anecdotal example (using SQL Server, though):

I was joining a temp table to a regular table using a common field. The execution plan showed a HASH JOIN between the two tables when I looked at the execution plan. Execution time was 2 secs aprox. I changed the INNER JOIN to IN with a subselect ,the HASH JOIN disappeared and the execution time was cut to 1 sec aprox. It was now using the index on the main table.

score 0 · Answer 5 · answered Oct 25 '11 at 16:48

0

The 2 queries you posted may produce different results (second with JOIN can return more rows than the first), so I think it's not correct to compare performance.

answered Oct 25 '11 at 16:48

a1ex07

36,826
12
90
103

you're right. I've just put distinct of only 1 field to avoid different results. – Seeker Oct 25 '11 at 17:33
In this case I believe the second(with JOIN) may be a bit worse (if optimizer is not smart enough to move ,`DISTINCT` to the subquery – a1ex07 Oct 25 '11 at 18:00

mysql - "INNER JOIN" or "IN". Which is faster?

5 Answers5