Two selects or one select + one join in SQL?

Question

The following code snippets should do the same work.

SELECT t1.* FROM table1 t1
INNER JOIN table2 t2
ON t1.ID = t2.IDService
WHERE t2.Code = @code

and

SELECT * FROM table1 t1
WHERE t1.ID IN (SELECT IDService FROM table2 WHERE Code = @code)

Which one is the best solution, in general? And computationally, is better have two nested select or is better use inner join?

EDIT: Consider that the PK of table1 is ID and the PK of table2 id the couple (IDService,Code). So, fixing the Code (using WHERE clause) and applying the clause ON to IDService, I can assume that the result of each select are the same.

http://stackoverflow.com/questions/2577174/join-vs-sub-query — Gopesh Sharma, Nov 07 '13 at 12:08

Gopesh Sharma · Answer 1 · 2013-11-07T12:46:30.007

0

As a very general rule a JOIN almost always performs better than a SUB-QUERY, there are exceptions.

If you are to use a subquery, an EXIST clause, usually performs better than IN for the majority of use cases.

For Test Cases, you can look into this site

You Should Go with the following rule...

If you need data that is from more than one table, then you can always go with join.
You can use subqueries if you require more than one query and each subquery provides a subset of the table involved in the query.
If the query requires a NOT EXISTS condition, then you must use a subquery because NOT EXISTS operates only in a subquery; the same principle holds true for the EXISTS condition.
PROCEDURE SQL query optimizer changes some subqueries to joins, a join is generally more efficient to process.

edited Nov 07 '13 at 12:46

answered Nov 07 '13 at 12:11

Gopesh Sharma

6,730
4
25
35

*a JOIN almost always performs better than a SUB-QUERY*. This is true for MySQL because MySQL treats a subquery like a temporary table and cannot push the predicates to the outer query where it feels fit, however SQL Server can do this, so there is no blanket rules it really depends. – GarethD Nov 07 '13 at 12:59
@GarethD I totally agree with you, but I am not talking about query which will give less data , lets just think about a subquery giving me 1,00,000 values then join will definitely perform better – Gopesh Sharma Nov 07 '13 at 13:04

GarethD · Accepted Answer · 2013-11-07T12:57:19.900

Your notion that they should do the same work is not true. Imagine this test set of data:

T1

ID
----
1
2
3
4
5

T2

ID
---
1
1
1
2
2
3

DDL

CREATE TABLE dbo.T1 (ID INT NOT NULL);
INSERT dbo.T1 (ID) VALUES (1), (2), (3), (4), (5);

CREATE TABLE dbo.T2 (ID INT NOT NULL);
INSERT dbo.T2 (ID) VALUES (1), (1), (1), (2), (2), (3);

SELECT  *
FROM    dbo.T1
WHERE   T1.ID IN (SELECT T2.ID FROM dbo.T2);

SELECT  T1.*
FROM    dbo.T1
        INNER JOIN dbo.T2
            ON T1.ID = T2.ID;

Results

ID
---
1
2
3

ID
---
1
1
1
2
2
3

Your results are only the same if the column you are searching in is unique.

CREATE TABLE dbo.T1 (ID INT NOT NULL);
INSERT dbo.T1 (ID) VALUES (1), (2), (3), (4), (5);

CREATE TABLE dbo.T2 (ID INT NOT NULL);
INSERT dbo.T2 (ID) VALUES (1), (2), (3);

SELECT  *
FROM    dbo.T1
WHERE   T1.ID IN (SELECT T2.ID FROM dbo.T2);

SELECT  T1.*
FROM    dbo.T1
        INNER JOIN dbo.T2
            ON T1.ID = T2.ID;

enter image description here

Even though the results are the same, the execution plan isn't. The first query using IN is able to use an anti-semi join, meaning it knows that the data in t2 is not needed, so as soon as it finds a single match it can stop scanning for further matches.

If you constrain your second table to have only unique values, then you will see the same plan:

CREATE TABLE dbo.T1 (ID INT NOT NULL PRIMARY KEY);
INSERT dbo.T1 (ID) VALUES (1), (2), (3), (4), (5);

CREATE TABLE dbo.T2 (ID INT NOT NULL PRIMARY KEY);
INSERT dbo.T2 (ID) VALUES (1), (2), (3);

SELECT  *
FROM    dbo.T1
WHERE   T1.ID IN (SELECT T2.ID FROM dbo.T2);

SELECT  T1.*
FROM    dbo.T1
        INNER JOIN dbo.T2
            ON T1.ID = T2.ID;

enter image description here

In summary, the two queries will not always yield the same results, and they will not always have the same plan. It really depends on your Indexes and the width of your data/query.

Two selects or one select + one join in SQL?

2 Answers2

Linked