22

I was reading this article: Get null == null in SQL

And the consensus is that when trying to test equality between two (nullable) sql columns, the right approach is:

where ((A=B) OR (A IS NULL AND B IS NULL))

When A and B are NULL, (A=B) still returns FALSE, since NULL is not equal to NULL. That is why the extra check is required.

What about when testing inequalities? Following from the above discussion, it made me think that to test inequality I would need to do something like:

WHERE ((A <> B) OR (A IS NOT NULL AND B IS NULL) OR (A IS NULL AND B IS NOT NULL))

However, I noticed that that is not necessary (at least not on informix 11.5), and I can just do:

where (A<>B)

If A and B are NULL, this returns FALSE. If NULL is not equal to NULL, then shouldn't this return TRUE?

EDIT
These are all good answers, but I think my question was a little vague. Allow me to rephrase:

Given that either A or B can be NULL, is it enough to check their inequality with

where (A<>B)

Or do I need to explicitly check it like this:

WHERE ((A <> B) OR (A IS NOT NULL AND B IS NULL) OR (A IS NULL AND B IS NOT NULL))

REFER to this thread for the answer to this question.

Community
  • 1
  • 1
rouble
  • 16,364
  • 16
  • 107
  • 102

7 Answers7

36

Because that behavior follows established ternary logic where NULL is considered an unknown value.

If you think of NULL as unknown, it becomes much more intuitive:

Is unknown a equal to unknown b? There's no way to know, so: unknown.

Ben S
  • 68,394
  • 30
  • 171
  • 212
  • 3
    Great perspective! Another way to look at it is that if "unknown a" equals "unknown b", then they can't really be that unknown, since you know something about them. Kind of quantum mechanic-y. – womp Dec 02 '09 at 16:21
  • @womp: Indeed, you know that you don't know ;) – Adam Robinson Dec 02 '09 at 16:25
  • But don't we know that `NULL` does not equal `a`? Referring to `NULL` as 'unknown' implies `NULL` has a value that could be `a`. But it doesn't have a value and we know that the lack of a value does not equal the presence of a value. – Lukas Sep 20 '21 at 15:09
28

relational expressions involving NULL actually yield NULL again

edit

here, <> stands for arbitrary binary operator, NULL is the SQL placeholder, and value is any value (NULL is not a value):

  • NULL <> value -> NULL
  • NULL <> NULL -> NULL

the logic is: NULL means "no value" or "unknown value", and thus any comparison with any actual value makes no sense.

is X = 42 true, false, or unknown, given that you don't know what value (if any) X holds? SQL says it's unknown. is X = Y true, false, or unknown, given that both are unknown? SQL says the result is unknown. and it says so for any binary relational operation, which is only logical (even if having NULLs in the model is not in the first place).

SQL also provides two unary postfix operators, IS NULL and IS NOT NULL, these return TRUE or FALSE according to their operand.

  • NULL IS NULL -> TRUE
  • NULL IS NOT NULL -> FALSE
just somebody
  • 18,602
  • 6
  • 51
  • 60
  • Hey, after reading your explanation, I get a "compile error in my head". I mean, why is binary comparison even allowed then? If I write `select * from A where A.xxx <> null` I get an empty result set. However, if I write `select * from A where null` this is a syntactical error. So something is not working out for me with the "value <> null => null" explanation...can you help me how to think 'the right' way about this? – dingalapadum Mar 21 '17 at 10:27
  • might it be, that it would make more sense to think about it like `value <> null => false` and `value = null => false` too? This would avoid the type error I'm talking about. Moreover like this one can reason about what the meaning of something like `value <> null AND 1=1` is, whereas `null AND 1=1` again would be syntactically strange... as I said, I'm just trying to wrap my head around this. – dingalapadum Mar 21 '17 at 12:46
  • `foo OP NULL` is allowed because SQL is broken. – just somebody Mar 24 '17 at 16:28
  • the whole answer is about how that makes no sense at all and contradicts both the standard and actual semantics. for another take on this, answer the following question: "the thing in my hand is an apple". true or false? the only answer you can give is *unknown*. now answer this: "i have a thing in my hand". again, *you don't know*. – just somebody Mar 24 '17 at 16:28
  • "that" which makes no sense is of course thinking that both `foo != NULL` and `foo = NULL` are false. they're *unknown*. – just somebody Mar 24 '17 at 16:29
7

All comparisons involving null are undefined, and evaluate to false. This idea, which is what prevents null being evaluated as equivalent to null, also prevents null being evaluated as NOT equivalent to null.

Adam Robinson
  • 182,639
  • 35
  • 285
  • 343
  • 1
    Not sure who downvoted this. Any comparison involving `null`s other than `is` WILL return false on a DB that follows the SQL standard. – Donnie Dec 02 '09 at 16:17
  • 3
    I did. any comparison involving null will return NULL – just somebody Dec 02 '09 at 16:45
  • @just somebody: Yes, the "value" of the comparison is null, but since that "value" can never be truly inspected, its practical valuation is false. – Adam Robinson Dec 02 '09 at 16:48
  • It's so much easier to figure out what a given comparsion is going to do, though, if you know the underlying rule: Anything involving a null operand, apart from the special `IS NULL` and `IS NOT NULL` tests, produces NULL. – Jason Orendorff Dec 02 '09 at 17:00
  • @Jason: I fail to see how the statement that "a comparison involving one or more `null`s always evaluates to null" is more meaningful (let alone helpful) than "a comparison involving one or more `null`s always evaluates to false". You have to make the next step of "because it evaluates to null, it *does not evaluate to true*, making is--essentially--false". Please explain how this is clearer. – Adam Robinson Dec 02 '09 at 20:38
  • @Adam Robinson: You can inspect the value using PL/SQL in Oracle: `v_b:=null=null; if (v_b = false) then /* not reached */ elsif (v_b is null) then /* reached */ end if;` so even though `null = null` is not true, it is still not false either. – Mr. Shiny and New 安宇 Jun 07 '11 at 18:01
  • @Mr. Shiny: That's the point; the *comparison* is always evaluated to false. As you demonstrate, the `elsif` block is what is evaluated. You could reverse the comparison (`if (v_b = true)...`) and the relative code paths would be the same; the `elseif` block is what will be evaluated. – Adam Robinson Jun 07 '11 at 18:30
  • 1
    @Adam Robinson: The comparison evaluates to null. It doesn't evaluate to false. if it did, (v_b = false) would be true, which it isn't. The `if` branch is only followed if the result is `true`, so it _looks_ like the result is false, but it is definitely not false. – Mr. Shiny and New 安宇 Jun 07 '11 at 21:44
  • @Mr. Shiny: I think you might be getting confused by the phraseology; if that's the case, I apologize. As I've already stated, the *value* of the comparison is `null` (see above comment), but the practical *effect* is that the comparison itself evaluates to `false` (in other words, it will never be *true*). You are introducing another layer by using PL/SQL, as this is not SQL (it's a procedural language that is designed to interact with SQL). – Adam Robinson Jun 07 '11 at 22:16
  • 2
    @Adam Robinson: I'm just trying to clear up that while it is never true, it is also never false. The comparison evaluates to null (technically: to 'unknown'). It isn't false because if it were, you could negate it and get true, which you can't. The branch is never taken (or the row never matches) because those rely on values which are true. [Wikipedia explains it better than I do.](https://secure.wikimedia.org/wikipedia/en/wiki/Null_%28SQL%29). – Mr. Shiny and New 安宇 Jun 08 '11 at 12:59
  • @Mr.ShinyandNew安宇 I'm actually glad you pointed this out with pl/sql, it makes the reasoning behind what's going on much clearer here. – Reimius Mar 06 '13 at 22:57
4

The short answer is... NULLs are weird, they don't really behave like you'd expect.

Here's a great paper on how NULLs work in SQL. I think it will help improve your understanding of the topic. I think the sections on handling null values in expressions will be especially useful for you.

http://www.oracle.com/technology/oramag/oracle/05-jul/o45sql.html

Robert Greiner
  • 29,049
  • 9
  • 65
  • 85
3

The default (ANSI) behaviour of nulls within an expression will result in a null (there are enough other answers with the cases of that).

There are however some edge cases and caveats that I would place when dealing with MS Sql Server that are not being listed.

  • Nulls within a statement that is grouping values together will be considered equal and be grouped together.
  • Null values within a statement that is ordering them will be considered equal.
  • Null values selected within a statement that is using distinct will be considered equal when evaluating the distinct aspect of the query

It is possible in SQL Server to override the expression logic regarding the specific Null = Null test, using the SET ANSI_NULLS OFF, which will then give you equality between null values - this is not a recommended move, but does exist.

SET ANSI_NULLS OFF

select result =
    case
        when  null=null then 'eq' 
        else 'ne'
    end

SET ANSI_NULLS ON

select result =
    case
        when  null=null then 'eq' 
        else 'ne'
    end
Andrew
  • 26,629
  • 5
  • 63
  • 86
3

Here is a Quick Fix

ISNULL(A,0)=ISNULL(B,0)

0 can be changed to something that can never happen in your data

Alireza Rinan
  • 480
  • 6
  • 9
0

"Is unknown a equal to unknown b? There's no way to know, so: unknown."

The question was : why does the comparison yield FALSE ?

Given three-valued logic, it would indeed be sensible for the comparison to yield UNKNOWN (not FALSE). But SQL does yield FALSE, and not UNKNOWN.

One of the myriads of perversities in the SQL language.

Furthermore, the following must be taken into account :

If "unkown" is a logical value in ternary logic, then it ought to be the case that an equality comparison between two logical values that both happen to be (the value for) "unknown", then that comparison ought to yield TRUE.

If the logical value is itself unknown, then obviously that cannot be represented by putting the value "unknown" there, because that would imply that the logical value is known (to be "unknown"). That is, a.o., how relational theory proves that implementing 3-valued logic raises the requirement for a 4-valued logic, that a 4 valued logic leads to the need for a 5-valued logic, etc. etc. ad infinitum.

Erwin Smout
  • 18,113
  • 4
  • 33
  • 52