20

I was confused behind the reasoning of the following:

SELECT * FROM table WHERE avalue is null

Returns x number of rows where 'avalue' is null

SELECT * FROM table WHERE avalue <> true

Does not return rows where 'avalue' is null.

My reasoning (which appears to be incorrect) is that as null is a unique value (it isn't even equal to null) means that it should show in the result set as it isn't equal to true either.

I guess you could argue that by saying column <> value you imply that the column has a value therefore ignoring the null values altogether.

What is the reasoning behind this and is this the same in other common SQL DB's?

My reasoning (assumption) is telling me this is counter-intuitive and I wanted to learn why.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Mr Shoubs
  • 14,629
  • 17
  • 68
  • 107
  • 8
    [Three-valued logic](http://en.wikipedia.org/wiki/Three-valued_logic). You're assuming that the only possible answers to `<>` are `true` or `false` - whereas in SQL, there is the option of `unknown`. This is in the standards. (Confusingly, MySQL conflates `UNKNOWN` and `NULL` so there, the possible results are `TRUE`, `FALSE` and `NULL`) – Damien_The_Unbeliever Jul 16 '13 at 14:52
  • 1
    SQL Server does it in the same way. – Fanda Jul 16 '13 at 15:03
  • As Damien is saying... This is commonly called three value logic. – Kuberchaun Jul 16 '13 at 15:21
  • `SELECT * FROM table WHERE avalue is not true` should include which 'avalue' is null – lfx_cool Feb 28 '19 at 07:20

3 Answers3

34

Every halfway decent RDBMS does it the same way, because it's correct. The manual:

Ordinary comparison operators yield null (signifying "unknown"), not true or false, when either input is null. For example, 7 = NULL yields null, as does 7 <> NULL. When this behavior is not suitable, use the IS [ NOT ] DISTINCT FROM constructs:

expression IS DISTINCT FROM expression
expression IS NOT DISTINCT FROM expression

These expressions perform slightly slower than simple expression <> expression comparison.

For boolean values there is also the simpler IS NOT [TRUE | FALSE].
To get what you expected in your second query, write:

SELECT * FROM tbl WHERE avalue IS NOT TRUE;

db<>fiddle here
Old sqlfiddle

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 3
    IMHO, this is flawed logic and not 'correct'. If I compare 7 to NULL, the answer should not be unknown, the answer should be false. This is SQL bending logic. Anyways, thanks for the great answer, the `IS DISTINCT FROM` saved the day. – sandre89 Jun 03 '17 at 19:35
  • 1
    @sandre89 The problem is that you cannot define an operator which combines or compares NULL (or "unknown") to anything else. What should be the result be for `7 > NULL` or `7 < NULL` or `7 + NULL`? Those operations do not make sense. Likewise `7 <> NULL` does not make sense, and the result is therefore unknown (which is represented as `NULL`). – Wizard of Ogz Jun 22 '17 at 21:27
  • 1
    @WizardofOgz I agree that 7 > NULL does not make sense. But 7 <> NULL is true and there's no way around it: it's different. The same way 7 = NULL should be false. – sandre89 Sep 20 '17 at 02:15
  • @sandre89: Except that's not true. `7 <> NULL` and `7 = NULL` are neither false nor true, but NULL. – Erwin Brandstetter Sep 20 '17 at 02:31
  • 1
    "Because" is a very bad answer. – Odalrick May 29 '18 at 11:29
  • if `NULL` means `unknown`, `7 > NULL` should be NULL, same `7 < NULL, 7 <> NULL, 7 = NULL`. if `NULL` means `empty`, `7 = NULL, 7 > NULL, 7 < NULL` should be `false`, `7 <> NULL` should be `true` – lfx_cool Feb 28 '19 at 06:47
6

This link provides a useful insight. Effectively as @Damien_The_Unbeliever points out, it uses Three-valued logic and seems to be (according to the article) the subject of debate.

A couple of other good links can be found here and here.

I think it boils down to null not being a value, but a place holder for a value and a decision had to be made and this was it... so NULL is not equal to any value because it isn't a value and won't even not be equal to any value.... if that makes sense.

Mr Shoubs
  • 14,629
  • 17
  • 68
  • 107
  • The article at simple-talk contains an error with the AND table. It states that "false AND unknown is unknown" when it should be "false AND unknown is false". Similarly with "unknown and false", it should be "false". This is discussed in the comments of the article but the article hasn't been updated. – Turing May 11 '17 at 14:57
2

It is normal. SQL Server does it in the same way. In SQL Server you can use

SELECT * FROM table WHERE ISNULL(avalue, 0) <> 1

For postgresql equivalent watch this: What is the PostgreSQL equivalent for ISNULL()

Consider to use NOT NULL column specification with default value, if it makes sense.

EDIT:

I think it is logic. NULL is not a value, so it is excluded from searching - you have to specify it explicitly. If SQL designers decides to go by second way (include nulls automatically), then you would get more troubles if you need to recognize no values

Community
  • 1
  • 1
Fanda
  • 3,760
  • 5
  • 37
  • 56