Problem with "NOT IN" in a simple SQL query

Question

I need to find all id's from the OldTable that doesn't exist in the NewTable
Why won't this query find the id?

SELECT old_id FROM OldTable WHERE old_id NOT IN (
SELECT id FROM NewTable)

By them selves they return this

--Returns the id 18571
SELECT old_id FROM OldTable WHERE old_id = 18571

 --Returns nothing  
SELECT id FROM NewTable WHERE id = 18571

Am I missing something obvious here?

Both columns are of type int and primary keys.

SOLVED

The id column had null's in them, I was just being ignorant =/

These works:

SELECT old_id FROM OldTable EXCEPT SELECT id FROM NewTable  

SELECT * FROM old_table ot WHERE NOT EXISTS (
SELECT * FROM new_table nt WHERE nt.id = ot.old_id)

These doesn't work:

SELECT old_id FROM OldTable LEFT JOIN NewTable ON old_id = id WHERE id IS NULL  

SELECT old_id FROM OldTable WHERE old_id NOT IN (
SELECT id FROM NewTable)

`SELECT id FROM NewTable` probably returns a `NULL`. This will mean your query won't return any results. — Martin Smith, Sep 22 '11 at 08:23
are the datatype of both columns exactly the same.. are you missing cast or maybe some small formatting adjustment? — , Sep 22 '11 at 08:29
The sub query doesn't return any nulls. Both tables has NOT NULL set on the id columns — Niklas, Sep 22 '11 at 08:46
Which database system do you use? Do you execute the queries directly on the database (for instance using SQLyog) or do you use some script? — user254875486, Sep 22 '11 at 08:49
@Niklas - Can you try `SELECT old_id FROM OldTable EXCEPT SELECT id FROM NewTable`? — Martin Smith, Sep 22 '11 at 08:55
Hey Martin, that worked just as well as wildplasser's query below. — Niklas, Sep 22 '11 at 08:58
@Niklas - In your `NOT IN` version might you have been accidentally referencing the column belonging to `OldTable` in the inner sub query instead of referencing the `id` column belonging to `NewTable`? — Martin Smith, Sep 22 '11 at 08:59
@Martin - Well there's something going on here but I don't know what =) (Don't think there's anything wrong in the references though) — Niklas, Sep 22 '11 at 09:07
@Martin - Thanks, you were right with your first comment, it had null's in the id column. I checked the wrong column when I looked for the `NOT NULL` setting. — Niklas, Sep 22 '11 at 09:17

score 2 · Accepted Answer · answered Sep 22 '11 at 08:29

2

SELECT * FROM old_table ot
WHERE NOT EXISTS (
   SELECT * FROM new_table nt
   WHERE nt.new_key = ot.old_key
   );

answered Sep 22 '11 at 08:29

wildplasser

43,142
8
66
109

1

The difference in behaviour between `NOT IN (subquery)` and `NOT EXISTS (subquery)` is due to the presence of nulls. – onedaywhen Sep 22 '11 at 09:02

onedaywhen · Answer 2 · 2011-09-22T09:24:22.887

2

The difference can be attributed to the presence of nulls.

Consider these two simplified queries, noting the predicate for both is NULL = 1 which evaluates to UNKNOWN which is handled differently by NOT EXISTS and NOT IN respectively:

SELECT *
  FROM OldTable
 WHERE NULL NOT IN (SELECT 1 FROM OldTable);

 SELECT *
  FROM OldTable
 WHERE NOT EXISTS (SELECT * FROM OldTable WHERE NULL = 1);

The first returns no rows because NOT IN (subquery) evaluated to FALSE.

The first returns all rows because NOT EXISTS (subquery) evaluated to TRUE.

Conclusion: avoid nulls.

edited Sep 22 '11 at 09:24

answered Sep 22 '11 at 09:07

onedaywhen

55,269
12
100
138

1

They say the columns are not nullable though. I suspect they are hitting the same issue as here http://stackoverflow.com/q/4594733/73226 – Martin Smith Sep 22 '11 at 09:11
@Martin Smith: but you are correct that it could have been the same issue as the one you linked to, one I hadn't come across myself (which really should result in a syntax error IMO!) so thanks for that. – onedaywhen Sep 22 '11 at 09:23

score 1 · Answer 3 · answered Sep 22 '11 at 08:24

1

I don't know why your query doesn't give you the desired result, but I do know that using NOT IN is not very efficient. You would be better of using a joins:

SELECT old_id 
FROM OldTable 
LEFT JOIN NewTable
    ON old_id = id
WHERE id IS NULL

answered Sep 22 '11 at 08:24

user254875486

11,190
7
36
65

1

"not very efficient" -- in what way, bearing in mind you don;t even know which SQL product they are using? – onedaywhen Sep 22 '11 at 08:49

score 0 · Answer 4 · answered Sep 22 '11 at 08:22

0

SELECT old_id FROM OLDTable WHERE id not in (SELECT id from NewTable);

You are using old_id in where condition in the first query and you are using id in second query

answered Sep 22 '11 at 08:22

Hari Gillala

11,736
18
70
117

Mistake by me when I entered the query on to SO. – Niklas Sep 22 '11 at 08:25

score 0 · Answer 5 · 2011-09-22T09:14:37.937

0

select oldtable.id as orginal, newtable.id as new 
from oldtable 
left outer join newtable 
on oldtable.id =newtable.id 
where new is null

AFAIK (and i am not an expert - check my rep ;-) left outer join is a good technique for this.. it will may perform a not in and does not require a sub-select

@onedaywhen kindly points out that this is not always the case e.g. in SQL Server EXISTS() can be more efficient.

edited Sep 22 '11 at 09:14

answered Sep 22 '11 at 08:23

1

"it will out perform a not in" -- I note your disclaimer but how can you know this if you don't even know which SQL product they are using? – onedaywhen Sep 22 '11 at 08:50
@onedaywhen very good point.. just hear say, its what I was told, sub select & not in = bad. Does that vary depending on the product/engine? – Sep 22 '11 at 08:54
1

Yes, among other things. `EXISTS()` is good for SQL Server because it *may* short circuit but may not be best every time. – onedaywhen Sep 22 '11 at 09:09
@onedaywhen oh, cool.. I will update the answer for what it is worth. BTW did you get caught out by new *at*username autocomplete? ;-) (or is that another case of me being a newbie?) – Sep 22 '11 at 09:11
1

Yes, I do keep getting caught out but in my defence it is a relatively recent change and I'm an old hand ;) – onedaywhen Sep 22 '11 at 09:27

Problem with "NOT IN" in a simple SQL query

5 Answers5