2

Possible Duplicate:
SQL NOT IN not working

Hey All,

I have the following two queries:

select distinct ID from db2.tran tr
where ID not in(
select distinct id from db2.tran
join (select distinct id from db1.client) as c on c.id = tr.id)


select distinct id from db2.tran tr
where id not in (select distinct id from db1.client)

Now in my mind, these queries should be equivelant, however, the first one returns rows, and the second one does not. Am i going crazy or is thier logic behind why these two are not coming out the same.

Thanks

Community
  • 1
  • 1
Limey
  • 2,642
  • 6
  • 37
  • 62
  • 3
    Does `db1.client` have any `NULL` ids? – Martin Smith Apr 15 '11 at 14:31
  • yes, it does, but I'm not sure why that would cause i difference – Limey Apr 15 '11 at 14:32
  • 4
    `WHERE X NOT IN (1,2,NULL)` etc. always returns zero rows as it is equivalent to `X<>1 AND X<>2 AND X<>NULL` and anything `AND UNKNOWN` is `UNKNOWN` not true. – Martin Smith Apr 15 '11 at 14:33
  • `NOT IN` does not handle `NULL` as you would sometimes expect. – JNK Apr 15 '11 at 14:38
  • 1
    Thanks! that solved the problem. I'm suprised i never ran into this issue before (guess i just kept my data cleaner than others programmers ;) – Limey Apr 15 '11 at 14:39
  • Is this just an issue with certain DBs? I'm currently working in SQL server, but i worked 10 years in Oracle and never ran across this problem. – Limey Apr 15 '11 at 14:48
  • @Limey - DBs handle `NULL` differently. Google `ANSI NULLS` in SQL for config options. – JNK Apr 15 '11 at 14:52
  • @Limey - Slight amend to my previous comment. `False` and `Unknown` is `False` of course! You might find this interesting http://www.simple-talk.com/sql/learn-sql-server/sql-and-the-snare-of-three-valued-logic/ – Martin Smith Apr 15 '11 at 14:55

2 Answers2

2

With your second query

select distinct id
from   db2.tran tr
where  id not in (select distinct id
                  from   db1.client) 

As the sub query brings back NULL values the query as a whole will always return an empty resultset for the reasons described in the comments. (See also SQL and the Snare of Three Valued Logic)

With your first query

select distinct ID
from   db2.tran tr
where  ID not in(select distinct id
                 from   db2.tran
                        join (select distinct id
                              from   db1.client) as c
                          on c.id = tr.id)

One effect of the JOIN condition c.id = tr.id will be to exclude all NULL values from the result of the sub query therefore this issue doesn't arise.

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
0

we have seen very inconsistent results when using IN/NOT IN statements.

this should also be equivalent:

SELECT
distinct ID 
FROM db2.tran tr
where ID NOT IN (

   SELECT distinct a.id from db2.tran a join db1.client b on (a.id = b.id)

)
  • This isn't really an answer. FWIW the issue is `NULL` and `NOT IN` - `NOT IN` fails when it hits `NULL` since you can't say for certain that a value is `NOT IN NULL` as `NULL` is unknown. – JNK Apr 15 '11 at 14:51