0

In my database I have a table which holds all Files

This table has several dependencies (Let's name them: FilesDep1, FilesDep2, FilesDep3, FilesDep4, FilesDep5)

The files table holds over 20000 records and I need to filter out the files that are not used by any of the five dependencies.

So I used union on all FileId's from the FilesDep1

as

select Id from [Files] where Id not in
(
    select FileId from [FilesDep1]
    union 
    select FileId from [FilesDep2]
    union 
    select FileId from [FilesDep3]
    union 
    select FileId from [FilesDep4]
    union 
    select FileId from [FilesDep5]
)

The amount given by all the unions is 1997. So I expect to get 18000+ records from this query, However... it returns 0?

I would like to know what causes this behaviour?

If I change the not in to in, it does show the 1997 records given by the unionquery...

ps. Please do not respond to the naming of tables, or the fact that i'm using union for this query instead of inner joins or something else. This question is about why the union query does not work as expected.

Theun Arbeider
  • 5,259
  • 11
  • 45
  • 68
  • 5
    `NOT IN ... (NULL)` always returns no rows irrespective of what `NOT NULL` values may also be returned. – Martin Smith Jun 12 '13 at 10:25
  • -1 Because of the *"Please do not respond to ... or something else"* comment; you've effectively ruled out any response that answers your question "why is this happening". As Martin observed, this is almost certainly due to your `not in` subquery including a null, which is why this sort of query is best handled using either a `not exists` or `left join...where null` condition. –  Jun 12 '13 at 10:33
  • @MArk: I don't see your point. Since I know a lot of people will start complaining about that this is not the proper way or best practice. I'm aware of that, thus I added to the question that I am aware of that and that i'm only trying to find out what caused this behavior and am not interested in people telling me it's bad practice or bad programming. – Theun Arbeider Jun 12 '13 at 11:09

1 Answers1

1

You probably have NULL values for FieleID somewhere? NOT IN will not work as expected if a single NULL value exists in result of your sub-query.

You can either handle NULLs

select Id from [Files] where Id not in
(
    select FileId from [FilesDep1] where FileID is NOT NULL
    union 
    select FileId from [FilesDep2] where FileID is NOT NULL
    union 
    select FileId from [FilesDep3] where FileID is NOT NULL
    union 
    select FileId from [FilesDep4] where FileID is NOT NULL
    union 
    select FileId from [FilesDep5] where FileID is NOT NULL
)

or replace the NOT IN with NOT EXISTS

select f.Id from [Files] f where NOT EXISTS 
(SELECT * FROM 
(
    select FileId from [FilesDep1]
    union 
    select FileId from [FilesDep2]
    union 
    select FileId from [FilesDep3]
    union 
    select FileId from [FilesDep4]
    union 
    select FileId from [FilesDep5]
) x 
WHERE x.FileID = f.ID)
Nenad Zivkovic
  • 18,221
  • 6
  • 42
  • 55