0

Consider this snippet of SQL:

CREATE TABLE product 
(
    id integer, 
    stock_quantity integer
);

INSERT INTO product (id, stock_quantity) VALUES (1, NULL);
INSERT INTO product (id, stock_quantity) VALUES (2, NULL);

SELECT * 
FROM product 
WHERE (id, stock_quantity) NOT IN ((1, 2), (2, 9));

I can't understand why it doesn't select anything. I'm using Postgres.

I would expect both rows to be returned, because I'd expect (1, NULL) and (2, NULL) to not be in ((1,2), (2, 9)).

If I replace NULL with 0, for example, it does return the two results.

Why was it designed to be this way? What am I missing?

Thanks!

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
OhMad
  • 6,871
  • 20
  • 56
  • 85
  • Removed the conflicting databases. Please add back the database you are actually using – RiggsFolly Nov 03 '22 at 18:25
  • please have a look https://stackoverflow.com/questions/129077/null-values-inside-not-in-clause – Sergey Nov 03 '22 at 18:26
  • NULL <> value results in unknown. WHERE unknown doesn't return the row. – jarlh Nov 03 '22 at 18:43
  • It doesn't find any row because it's not possible to determine if each row "is not (1, 2) or (2,9)". The rows have nulls (that are not values) and that means *missing* data. Those nulls could perfectly end up being `2` or `9`; we just don't know **yet**. Emphasis in "yet". – The Impaler Nov 03 '22 at 18:49
  • Please read about NULL handling in depth. My simple mnemonic rule to sort out nulls is to replace NULL with "I don't know". Then `is (1, "I don't know") in ((1, 2), (2, 9))`? I don't know. Is NOT "I don't know" true or false? I don't know, which coalesces to false. – Stefanov.sm Nov 03 '22 at 18:57

1 Answers1

1

Think of a null as missing data. For example, if we had a column "Date of Birth".

For example, consider a database that has three people born in 1975, 1990, and we didn't know the date of birth of the third one. We know that third person was born for sure, but we don't know it's birth date yet.

Now, what if a query searched for people "not born in 1990"? That would return the first person only.

The second person was born in 1990 so it clearly cannot be selected.

For the third person we don't know the date of birth so we cannot say anything about her, and the query doesn't select her either. Does it make sense?

The Impaler
  • 45,731
  • 9
  • 39
  • 76
  • I do still find it weird, to be honest, to treat NULL as a "placeholder", where anything "could" be, but you're not quite sure. I've not seen any other non-db language do this. – OhMad Nov 03 '22 at 19:39
  • @OhMad Welcome to relational modeling :) Nulls means a totally different thing compared to imperative languages like C, C#, Java, PHP, etc. where null means "nothing". – The Impaler Nov 03 '22 at 20:02