0

There are two tables, both have only one column. delete_DimCSI has 196194 rows and delete_FRO has 195740 rows. I am looking for rows that exist in delete_DimCSI table but not existing in delete_FRO table.

This is the query I usually use and it worked so far:

select PK_CSI from delete_DimCSI
where PK_CSI not in (select FK_CSI from delete_FRO)

It returns 0 rows.

Then I created this one:

select PK_CSI, FK_CSI from delete_DimCSI
LEFT OUTER JOIN delete_FRO FRO on FK_CSI = PK_CSI
where FK_CSI is null

It returns 455 rows.

Any idea why the first query doesn't return any row?

Alberto Solano
  • 7,972
  • 3
  • 38
  • 61
ilija veselica
  • 9,414
  • 39
  • 93
  • 147
  • 1
    I guess because the second query joins both tables on that column and select all where the foreign-key is null or there is no row with that `PK_CSI`. The first query uses `NOT IN` which ignores NULL values. Instead i prefer `NOT EXISTS`. http://sqlperformance.com/2012/12/t-sql-queries/left-anti-semi-join – Tim Schmelter Jan 13 '15 at 13:18
  • possible duplicate of [NOT IN clause and NULL values](http://stackoverflow.com/questions/129077/not-in-clause-and-null-values) – Pரதீப் Jan 13 '15 at 13:23

2 Answers2

3

You probably have NULLs in FK_csi

select PK_CSI from delete_DimCSI
where PK_CSI not in (select FK_CSI from delete_FRO WHERE FK_CSI IS NOT NULL)

Look here for an explanation.

Community
  • 1
  • 1
Mihai
  • 26,325
  • 7
  • 66
  • 81
1

NOT IN behaves strangely when the subquery has NULL values. In that case, it returns either NULL or false -- neither of which is treated as true.

For this reason, I recommend using NOT EXISTS rather than NOT IN. The semantics are more what you expect:

select PK_CSI
from delete_DimCSI d
where not exists (select 1 from delete_FRO f where f.FK_CSI = d.PK_CSI);

Of course, you can also add an explicit wheref.FK_CSI = NULL` as Mihai suggests.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786