2

Query

SELECT ID, Name, Phone 
FROM Table1 
LEFT JOIN Table2 ON Table1.ID = Table2.ID
WHERE Table2.ID IS NULL

Problem

  • Finding it hard to understand why someone would left join on an ID and then set it to NULL in the where clause?
  • Am I missing something here? Is there any significance to this?
  • Could we just omit the Table2 altogether? As in not join at all?

Any help would be much appreciated.

Rahul Neekhra
  • 780
  • 1
  • 9
  • 39
AK91
  • 671
  • 2
  • 13
  • 35
  • 3
    Possible duplicate of [How to select rows with no matching entry in another table?](https://stackoverflow.com/questions/4076098/how-to-select-rows-with-no-matching-entry-in-another-table) – Igor Nov 19 '18 at 11:27
  • 3
    It is different way to find NOT EXISTS records. Sometimes this way works more efficient. – Zeki Gumus Nov 19 '18 at 11:28
  • 2
    Is is also a good way to find orphaned records (if the FK constraint is not enforced or not present) – Igor Nov 19 '18 at 11:28
  • What do you mean, "set it to null"? LEFT JOIN returns INNER JOIN rows plus unmatched left table rows extended by NULLs. Always know what INNER JOIN you want as part of a LEFT JOIN. A WHERE or INNER ON that requires a right table column to be NULL after a LEFT JOIN ON returns only rows extended by NULLs, ie leaves only unmatched rows. – philipxy Nov 19 '18 at 11:34
  • 1
    Often referred to as a left anti-join – Andrew Nov 19 '18 at 11:56
  • 1
    I'll agree with you OP - I don't understand why people do this (and a lot of other things) either. Zohar explains this well IMO. – SMor Nov 19 '18 at 13:17
  • 1
    Just an unrelated side-note. IRL, a foreign key that references another table often has something in the name that makes it clear to what table it points. For example : `SELECT * FROM Orders o JOIN Customers c ON c.ID = o.custID`. So it would be rare to see a query that joins ID to ID. – LukStorms Nov 19 '18 at 13:34
  • Thank you everyone - Appreciate all the answers – AK91 Nov 19 '18 at 15:49

4 Answers4

5

The query you have in the question is basically equivalent to the following query:

SELECT ID, Name, Phone 
FROM Table1 
WHERE NOT EXISTS
(
    SELECT 1
    FROM  Table2 
    WHERE Table1.ID = Table2.ID
)

Meaning it selects all the records in Table1 that does not have a correlated record in Table2.

The execution plan for both queries will most likely be the same (Personally, I've never seen a case when they produce a different execution plan, but I don't rule that out), so both queries should be equally efficient, and it's up to you to decide whether the left join or the exists syntax is more readable to you.

Zohar Peled
  • 79,642
  • 10
  • 69
  • 121
2

I think you should have an alias for you table and specify which table each column is coming from.

Assuming Name is from table one and Phone is form table two and ID is common in both, then the Left join mentioned above may help get all users that do not have phone numbers.

Table 1
Id Name
1 John Smith
2 Jane Doe

Table 2
Id Phone
2 071 555 0863

Left Join without the where clause
ID Name Phone
1 John Smith NULL
2 Jane Doe 071 555 0863

Left Join with the where clause
ID Name Phone
1 John Smith NULL

1

This is one of the ways to implement the relational database operation of antijoin, called anti semi join within sql server's terminology. This is essentially "bring rows from one table that are not in another table".

The ways I cant think of doing this are:

select cols from t1 left join t2 on t1.key=t2.key where t2.key is null

select cols from t1 where key not in (select key from t2)

select cols from t1 where not exists (select 1 from t2 where t1.key=t2.key)

and even

select * from t1 where key in (select key from t1 except select key from t2)

There are some differences between these methods (most notably, the danger of null handling in the case of not in), but they generally do the same.


To address your points:

Finding it hard to understand why someone would left join on an ID and then set it to NULL in the where clause?

As mentioned, in order to exclude results from t1 that are present in t2

Could we just omit the Table2 altogether? As in not join at all?

If you don't use the join (or any of its equivelant alternatives), you will get more results, as the rows in table1 that have the same id with any rows in table2 will be returned, too.

George Menoutis
  • 6,894
  • 3
  • 19
  • 43
1

If joining condition column is having null value specifically ID then it is bad database design per my understanding.

As per your query below. Here are the possible scnario why where clause make sense

  1. I am assuming that your name and phone number are coming from table2 and then you are trying to find the name and phone number whose ID is null.

  2. If name and phone number is coming from table1 and table 2 is just having ID join and not selecting anything from table 2 then where clause is total waste.

    SELECT ID, Name, Phone FROM Table1 LEFT JOIN Table2 ON Table1.ID = Table2.ID WHERE Table2.ID IS NULL

Essentially in the above common business scenario, developers put where clause filter criteria in left join when any value is coming from right side is having non relevance data and not required to be the part of dataset then filter it out.

Rahul Neekhra
  • 780
  • 1
  • 9
  • 39