0

I am running the following queries in MS SQL Server (version 2016 if that matters).

SELECT DISTINCT Job_Number
FROM table_A
WHERE Job_Number IS NOT NULL

returns 2376 rows

SELECT *
FROM table_A
WHERE Job_Number IN (SELECT DISTINCT [House Bill] FROM table_B)

returns 137 rows

However,

SELECT *
FROM table_A
WHERE Job_Number NOT IN (SELECT DISTINCT [House Bill] FROM table_B)`

returns 0 rows. I would expect this would return 2239 rows (i.e. 2376 - 137).

Is there any fundamental concept of NOT IN that I have yet to understand here?

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
cbasah
  • 508
  • 4
  • 10

4 Answers4

2

NOT IN will fail to return results if there are any NULL values returned by your following query: (SELECT DISTINCT [House Bill] FROM table_B)

My suggestion is to modify your query to:

(SELECT DISTINCT [House Bill] FROM table_B WHERE [House Bill] IS NOT NULL)

The reason this happens is because whenever a comparison is made between a row in table_A with NULL value in table_B the logical result is UNKNOWN.

There's more information about this:

1 - here and

2 - here

Radu Gheorghiu
  • 20,049
  • 16
  • 72
  • 107
0

The IN() clause ignores NULL values.

juergen d
  • 201,996
  • 37
  • 293
  • 362
0

The possible reason would be null values in tableb in [House Bill] column :

The correct way would be to use NOT EXISTS OR EXISTS instead of NOT IN AND IN clause :

SELECT a.*
FROM table_A a
WHERE NOT EXISTS (SELECT 1 FROM table_B b WHERE a.Job_Number = b.[House Bill]);
Yogesh Sharma
  • 49,870
  • 5
  • 26
  • 52
0

NOT IN returns no rows at all if the subquery returns NULL -- even once.

For that reason, I strongly recommend using NOT EXISTS. This behaves as you expect:

SELECT a.*
FROM table_A a
WHERE NOT EXISTS (SELECT 1 NFROM table_B b WHERE a.Job_Number = b.[House Bill])
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786