4

I need to find all id's from the OldTable that doesn't exist in the NewTable
Why won't this query find the id?

SELECT old_id FROM OldTable WHERE old_id NOT IN (
SELECT id FROM NewTable)  

By them selves they return this

--Returns the id 18571
SELECT old_id FROM OldTable WHERE old_id = 18571

 --Returns nothing  
SELECT id FROM NewTable WHERE id = 18571 

Am I missing something obvious here?

Both columns are of type int and primary keys.

SOLVED

The id column had null's in them, I was just being ignorant =/

These works:

SELECT old_id FROM OldTable EXCEPT SELECT id FROM NewTable  

SELECT * FROM old_table ot WHERE NOT EXISTS (
SELECT * FROM new_table nt WHERE nt.id = ot.old_id)  

These doesn't work:

SELECT old_id FROM OldTable LEFT JOIN NewTable ON old_id = id WHERE id IS NULL  

SELECT old_id FROM OldTable WHERE old_id NOT IN (
SELECT id FROM NewTable)
Niklas
  • 13,005
  • 23
  • 79
  • 119
  • 4
    `SELECT id FROM NewTable` probably returns a `NULL`. This will mean your query won't return any results. – Martin Smith Sep 22 '11 at 08:23
  • are the datatype of both columns exactly the same.. are you missing cast or maybe some small formatting adjustment? –  Sep 22 '11 at 08:29
  • The sub query doesn't return any nulls. Both tables has NOT NULL set on the id columns – Niklas Sep 22 '11 at 08:46
  • Which database system do you use? Do you execute the queries directly on the database (for instance using SQLyog) or do you use some script? – user254875486 Sep 22 '11 at 08:49
  • @Niklas - Can you try `SELECT old_id FROM OldTable EXCEPT SELECT id FROM NewTable`? – Martin Smith Sep 22 '11 at 08:55
  • Hey Martin, that worked just as well as wildplasser's query below. – Niklas Sep 22 '11 at 08:58
  • @Niklas - In your `NOT IN` version might you have been accidentally referencing the column belonging to `OldTable` in the inner sub query instead of referencing the `id` column belonging to `NewTable`? – Martin Smith Sep 22 '11 at 08:59
  • @Martin - Well there's something going on here but I don't know what =) (Don't think there's anything wrong in the references though) – Niklas Sep 22 '11 at 09:07
  • @Niklas - Please post your actual queries. – Martin Smith Sep 22 '11 at 09:08
  • 1
    @Martin - Thanks, you were right with your first comment, it had null's in the id column. I checked the wrong column when I looked for the `NOT NULL` setting. – Niklas Sep 22 '11 at 09:17

5 Answers5

2
SELECT * FROM old_table ot
WHERE NOT EXISTS (
   SELECT * FROM new_table nt
   WHERE nt.new_key = ot.old_key
   );
wildplasser
  • 43,142
  • 8
  • 66
  • 109
  • 1
    The difference in behaviour between `NOT IN (subquery)` and `NOT EXISTS (subquery)` is due to the presence of nulls. – onedaywhen Sep 22 '11 at 09:02
2

The difference can be attributed to the presence of nulls.

Consider these two simplified queries, noting the predicate for both is NULL = 1 which evaluates to UNKNOWN which is handled differently by NOT EXISTS and NOT IN respectively:

SELECT *
  FROM OldTable
 WHERE NULL NOT IN (SELECT 1 FROM OldTable);

 SELECT *
  FROM OldTable
 WHERE NOT EXISTS (SELECT * FROM OldTable WHERE NULL = 1);

The first returns no rows because NOT IN (subquery) evaluated to FALSE.

The first returns all rows because NOT EXISTS (subquery) evaluated to TRUE.

Conclusion: avoid nulls.

onedaywhen
  • 55,269
  • 12
  • 100
  • 138
  • 1
    They say the columns are not nullable though. I suspect they are hitting the same issue as here http://stackoverflow.com/q/4594733/73226 – Martin Smith Sep 22 '11 at 09:11
  • @Martin Smith: but you are correct that it could have been the same issue as the one you linked to, one I hadn't come across myself (which really should result in a syntax error IMO!) so thanks for that. – onedaywhen Sep 22 '11 at 09:23
1

I don't know why your query doesn't give you the desired result, but I do know that using NOT IN is not very efficient. You would be better of using a joins:

SELECT old_id 
FROM OldTable 
LEFT JOIN NewTable
    ON old_id = id
WHERE id IS NULL
user254875486
  • 11,190
  • 7
  • 36
  • 65
  • 1
    "not very efficient" -- in what way, bearing in mind you don;t even know which SQL product they are using? – onedaywhen Sep 22 '11 at 08:49
0
SELECT old_id FROM OLDTable WHERE id not in (SELECT id from NewTable);

You are using old_id in where condition in the first query and you are using id in second query

Hari Gillala
  • 11,736
  • 18
  • 70
  • 117
0
select oldtable.id as orginal, newtable.id as new 
from oldtable 
left outer join newtable 
on oldtable.id =newtable.id 
where new is null

AFAIK (and i am not an expert - check my rep ;-) left outer join is a good technique for this.. it will may perform a not in and does not require a sub-select

@onedaywhen kindly points out that this is not always the case e.g. in SQL Server EXISTS() can be more efficient.

  • 1
    "it will out perform a not in" -- I note your disclaimer but how can you know this if you don't even know which SQL product they are using? – onedaywhen Sep 22 '11 at 08:50
  • @onedaywhen very good point.. just hear say, its what I was told, sub select & not in = bad. Does that vary depending on the product/engine? –  Sep 22 '11 at 08:54
  • 1
    Yes, among other things. `EXISTS()` is good for SQL Server because it *may* short circuit but may not be best every time. – onedaywhen Sep 22 '11 at 09:09
  • @onedaywhen oh, cool.. I will update the answer for what it is worth. BTW did you get caught out by new *at*username autocomplete? ;-) (or is that another case of me being a newbie?) –  Sep 22 '11 at 09:11
  • 1
    Yes, I do keep getting caught out but in my defence it is a relatively recent change and I'm an old hand ;) – onedaywhen Sep 22 '11 at 09:27