4

I have two tables

Table 1

Column1
_______
   1
   2
   3
   4
   5
   6

Table 2

Column 1
________
    4
 NULL    //This NULL value added after answering the question, to show the real problem
    5
    6
    7
    8
    9

This is an example case. When I tried,

SELECT column1 FROM Table1 WHERE column1 IN (SELECT column1 FROM Table2)

I got 4,5,6

WHEN

SELECT column1 FROM Table1 WHERE column1 NOT IN (SELECT column1 FROM Table2)

I didn't get 1,2,3 but NULL.

In real case the column1 of table1 is nvarchar(max) and column1 of table2 is varchar(50). However, I tried casting both into varchar(50).

Subin Jacob
  • 4,692
  • 10
  • 37
  • 69

3 Answers3

8

Check the documentation for IN, specifically:

Any null values returned by subquery or expression that are compared to test_expression using IN or NOT IN return UNKNOWN. Using null values in together with IN or NOT IN can produce unexpected results.

You haven't shown them, but I'm certain that you've got at least one NULL value lurking in your data.

You can exclude the NULL(s) and then the NOT IN will work as you expected:

SELECT column1 FROM Table1
WHERE column1 NOT IN (SELECT t2.column1 FROM Table2 t2
                      WHERE t2.column1 IS NOT NULL)

IN and NOT IN are, within a hand wave, opposites, but you have to keep SQL's three-valued logic in mind. Imagine we'd written the IN using the expression form

a IN (1,2,NULL)

Which is treated the same as:

a = 1 OR a = 2 or a = NULL

For any row where a = 1, we have:

TRUE OR TRUE OR UNKNOWN

which is TRUE. And for any row where a = 3, say, we have:

FALSE OR FALSE OR UNKNOWN

which is UNKNOWN

Now, consider NOT IN in the same way:

a NOT IN (1,2,NULL)

Which is treated the same as:

a != 1 AND a != 2 AND a != NULL

For any row where a = 1, we have:

FALSE AND TRUE AND UNKNOWN

Which is FALSE. And for a = 3, we have:

TRUE AND TRUE AND UNKNOWN

Which is UNKNOWN. The presence of the NULL means that there's no way to ever get this chain of ANDs to produce a TRUE value.

Damien_The_Unbeliever
  • 234,701
  • 27
  • 340
  • 448
6

That may happend if you have null values in your Table2. Use this query instead:

select *
from Table1 as t1
where not exists (select * from Table2 as t2 where t2.column1 = t1.column1);

sql fiddle demo

Test query:

-- Table2 doesn't have null values, works good
SELECT column1 FROM Table1 WHERE column1 IN (SELECT column1 FROM Table2);
SELECT column1 FROM Table1 WHERE column1 NOT IN (SELECT column1 FROM Table2);

insert into Table2
select null;

-- nothing returned by query, because of null values in Table2
SELECT column1 FROM Table1 WHERE column1 NOT IN (SELECT column1 FROM Table2);

-- works good
select *
from Table1 as t1
where not exists (select * from Table2 as t2 where t2.column1 = t1.column1);

This is happens because three-valued logic of SQL, see Damien_The_Unbeliever nice explanation. You can use not null query, like this:

SELECT column1 FROM Table1 WHERE column1 NOT IN (SELECT column1 FROM Table2 where column1 is not null);

But I like exists better, because it's filter out null implicitly (just because using =) condition.

As an addition, don't use query like ones in your question without aliases (actually not aliases, but dot notation, like Table.column or Alias.column), because you can have incorrect results. Always use dot notation for your columns. So your query should be something like:

SELECT t1.column1 FROM Table1 as t1 WHERE t1.column1 NOT IN (SELECT t2.column1 FROM Table2 as t2 where t2.column1 is not null);
Community
  • 1
  • 1
Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
3

This is they main reason why you should avoid NOT IN, it's based on three-way logic, YES/NO/UNKNOWN :-)

col IN (1,2,NULL) 
is the logically equivalent to 
col=1 OR col=2 OR col=NULL 

col NOT IN (1,2,NULL) 
is the logically equivalent to
col<>1 AND col<>2 AND col<>NULL 

Now any comparison to NULL evaluates to UNKNOWN, only "col IS (NOT) NULL" is correct.

If you got ORed conditions an UNKNOWN doesn't matter, but a single UNKNOWN in ANDed conditions results in a final UNKNOWN.

When you add a NULL to the outer table1 and remove the NULL from table2 you'll notice that this row is missing in both IN/NOT IN answer sets.

The best workaround is using EXISTS/NOT EXISTS instead, there's only YES/NO because conditins evaluating to UNKNOWN are kind of ignored and simply treated as FALSE.

dnoeth
  • 59,503
  • 4
  • 39
  • 56
  • 1
    I swear I was editing my question and hadn't seen yours when I added my explanation using the expression form. I guess `(1,2,NULL)` is the quintessential example for explaining it. – Damien_The_Unbeliever Sep 18 '13 at 07:01
  • @Damien_The_Unbeliever: no problemo, great minds think alike :-) And your explanation is more detailed. – dnoeth Sep 18 '13 at 07:24