Your two queries do not do the same thing. For the second approach to be equivalent, you need select distinct
:
select distinct id_user
from table1 t1 join
tmp_table tmp
on t1.foo = tmp.foo;
Because of this additional operation, I might expect the in
to perform better. But, in general, when you have a specific performance question, you should test it on your data on your system.
As for your queries in the question, there are many different ways to approach this. Here are a few:
- Subquery with
in
- Two tables with a
join
and distinct
- Two tables with
in
- Two tables with exists
- Subquery with
exists
- Two tables with
exists
- CTE with a
join
- CTE with
in
- CTE with
exists
In an ideal world, SQL compilers would simply study the query and get the best execution plan, regardless of how the query is expressed. That world is not the world we live in, alas.
One reason a temporary table is sometimes useful (and I much prefer single query solutions) is for optimization purposes:
- The statistics about the temporary table are known, so the optimizer can choose a better plan.
- You can build indexes on temporary tables to improve performance.
Your subquery is not very complicated, so these probably are not issues.
Under different circumstances, different approaches might work better. As a default, I would build an index on tmp_Table(dt_signin, foo)
and use exists
.