0

In my case, I have two table with same structure: TableA & TableB, and what I was trying to do is to find if there is any records only exists in A but not B.

My script was

SELECT * FROM TableA
WHERE NOT EXISTS (
    SELECT * FROM TableB
)

While there is 2 records which only exists in A but not B, this script returns nothing. Then I changed into following:

SELECT ID FROM TableA
WHERE ID NOT IN (
    SELECT ID FROM TableB
)

This script works successfully and return the 2 records' ID.


My question is: Is this behavior normal? What is the mechanism behind NOT EXISTS and NOT IN?

I have read some other posts comparing NOT EXISTS and NOT IN, and most people suggest using NOT EXISTS in 99.9% scenarios, is this case fall into that 0.1% which NOT EXISTS is not applicable? (I believed it's due to my wrongly usage though, please correct me if that's the case)

shole
  • 4,046
  • 2
  • 29
  • 69
  • 3
    The first query lacks the condition, like `WHERE TableB.id = TableA.id` (inside the nested query). So then the query literally is read as: select everything from TableA where the row from TableB with the same id does not exist. – zerkms Mar 15 '17 at 02:46
  • "is this case fall into that 0.1% which NOT EXISTS is not applicable" --- I would say that **every** query with `IN` might be expressed with `EXISTS`. The opposite is not true though (is it?) – zerkms Mar 15 '17 at 02:48

2 Answers2

3

If you want to look at all the values in the rows, then use EXCEPT:

SELECT *
FROM TableA
EXCEPT
SELECT *
FROM TableB;

If you want to use NOT EXISTS correctly, then you need a correlation clause:

SELECT a.*
FROM TableA a
WHERE NOT EXISTS (SELECT 1 FROM TableB b WHERE b.id = a.id);

I strongly recommend using NOT EXISTS over NOT IN with a subquery. NOT IN will return no rows at all if b.id is ever NULL. That is usually not what is intended. NOT EXISTS matches the expected semantics.

shole
  • 4,046
  • 2
  • 29
  • 69
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thanks, I wonder why `NOT IN` ever existed if `NOT EXISTED` is a super set in usage as @zerkms mentioned... – shole Mar 15 '17 at 02:59
  • @shole . . . First, they have different semantics (subtle differrence, but different). Second, `NOT IN` works with constant lists. Third, SQL does not claim to do things only one way. There are other methods including `<> ALL` and a `left join`. That's just how the language was designed. – Gordon Linoff Mar 15 '17 at 11:39
1

You need to be careful with the NOT IN expression. The A NOT IN(B,C,D) expression basically means (A<>B AND A<>C AND A<>D). If any of the values are NULL the whole expression will become NULL.

So, applicable to your example the correct NOT IN expression should be (unless the ID is not nullable column):

SELECT ID FROM TableA
WHERE ID NOT IN (
    SELECT ID FROM TableB WHERE ID IS NOT NULL
)
cha
  • 10,301
  • 1
  • 18
  • 26
  • Thanks, I just found this post which makes things more clear to me: http://stackoverflow.com/questions/129077/not-in-clause-and-null-values – shole Mar 15 '17 at 03:04