26

I'm not asking if it does. I know that it doesn't.

I'm curious as to the reason. I've read support docs such as as this one on Working With Nulls in MySQL but they don't really give any reason. They only repeat the mantra that you have to use "is null" instead.

This has always bothered me. When doing dynamic SQL (those rare times when it has to be done) it would be so much easier to pass "null" into where clause like this:

@where = "where GroupId = null"

Which would be a simple replacement for a regular variable. Instead we have to use if/else blocks to do stuff like:

if @groupId is null then
     @where = "where GroupId is null"
else
     @where = "where GroupId = @groupId"
end

In larger more-complicated queries, this is a huge pain in the neck. Is there a specific reason that SQL and all the major RDBMS vendors don't allow this? Some kind of keyword conflict or value conflict that it would create?

Edit:

The problem with a lot of the answers (in my opinion) is that everyone is setting up an equivalency between null and "I don't know what the value is". There's a huge difference between those two things. If null meant "there's a value but it's unknown" I would 100% agree that nulls couldn't be equal. But SQL null doesn't mean that. It means that there is no value. Any two SQL results that are null both have no value. No value does not equal unknown value. Two different things. That's an important distinction.

Edit 2:

The other problem I have is that other HLLs allow null=null perfectly fine and resolve it appropriately. In C# for instance, null=null returns true.

sohtimsso1970
  • 3,216
  • 4
  • 28
  • 38
  • Relevant Wikipedia article [Null (SQL)](http://en.wikipedia.org/wiki/Null_(SQL)#Three-valued_logic_.283VL.29) – dee-see Aug 16 '11 at 13:24
  • 4
    Instead of this clause, it's perfectly acceptable to make your where clause something like "where GroupId is null OR GroupId = @groupId"... depending on your RDBMS, certain phrasing will be more optimal than others. – Jeremy Holovacs Aug 16 '11 at 13:26
  • Good point, Jeremy. And I've done that in a lot of queries. I forgot about OR-ing the other state in there. Still, I find it frustrating that I have to. – sohtimsso1970 Aug 16 '11 at 13:35
  • 1
    possible duplicate of [Why does NULL = NULL evaluate to false in SQL server](http://stackoverflow.com/questions/1843451/why-does-null-null-evaluate-to-false-in-sql-server) – JNK Aug 16 '11 at 13:45
  • 5
    NULL is a placeholder for a value that is missing. Why that value is missing is semantics which are provided by your data model and/or your application, not the SQL language. This includes your "No value" interpretation as well as everyone else's. As for the history, [Ted Codd](http://en.wikipedia.org/wiki/Edgar_F._Codd) proposed it and SQL implemented it. Codd went on to propose a second kind of NULL but everyone ignored him that time. – onedaywhen Aug 16 '11 at 14:12

12 Answers12

26

The reason why it's off by default is that null is really not equal to null in a business sense. For example, if you were joining orders and customers:

select * from orders o join customers c on c.name = o.customer_name

It wouldn't make a lot of sense to match orders with an unknown customer with customers with an unknown name.

Most databases allow you to customize this behaviour. For example, in SQL Server:

set ansi_nulls on
if null = null  
    print 'this will not print' 
set ansi_nulls off
if null = null  
    print 'this should print'
Andomar
  • 232,371
  • 49
  • 380
  • 404
10

Equality is something that can be absolutely determined. The trouble with null is that it's inherently unknown. If you follow the truth table for three-value logic, null combined with any other value is null - unknown. Asking SQL "Is my value equal to null?" would be unknown every single time, even if the input is null. I think the implementation of IS NULL makes it clear.

Yuck
  • 49,664
  • 13
  • 105
  • 135
  • See, I don't agree with that - the truth tables I mean. Why can't SQL determine that the comparison value is null and return a true? I don't understand why that must be the case. In JavaScript, NaN doesn't equal other NaN variables because you could have two different values both of which are NaN (such as "abc" and "def"). But here, with SQL null, the unknown is always the same. – sohtimsso1970 Aug 16 '11 at 13:30
  • This has its roots in First-Order Logic and Set Theory. I'd say it was carried into ANSI SQL for consistency. – Yuck Aug 16 '11 at 13:33
  • @HLGEM: Of course it can. As others have pointed out, in SQL Server, you can turn off ANSI_NULL which makes NULL=NULL return true. (So your last sentence is incorrect) – sohtimsso1970 Aug 16 '11 at 13:46
  • Turning off `ANSI_NULL` is a product feature. It has no place in the logic theory that created the concept of `null` in the first place. Furthermore, taking advantage of turning off `ANSI_NULL` can be very, very confusing and lead to SQL scripts that are hard to debug or have unexpected results. – Yuck Aug 16 '11 at 13:55
  • Thinking through this more, my main problem is the equivalency everyone is making between **no** value and **unknown** value. Those are two very different things. Two different SQL results that are both null mean that both of them have **no** value. The value isn't merely unknown, it doesn't exist. – sohtimsso1970 Aug 16 '11 at 13:59
  • 2
    `null` can actually mean "not applicable", "unknown", or "value not present". Those are not equivalent situations. – Yuck Aug 16 '11 at 14:19
  • @HLGEM: My "wants" have nothing to do with it. Stick to the debate at hand. As far as math goes, you're wrong. Null does not mean "unknown", it means "no value". Furthermore, to suggest that null=null is mathematically incorrect is to suggest that every other HLL that accepts that (C#, C++, VB, PHP, etc) is also mathematically incorrect. As one looks across the programming world, one notices that null=null is accepted by the **majority** of HLLs. SQL is in the minority. – sohtimsso1970 Aug 16 '11 at 14:34
  • 3
    @Sohtimsso1970 - you're ascribing a meaning to four letters (NULL) that doesn't have a specific, inherent meaning. It no more *means* **no value** than it means **unknown**. However, in the SQL language, it is defined to be a placeholder for missing or unknown information. What it means in other languages has no relevance when asking a question about SQL. – Damien_The_Unbeliever Aug 16 '11 at 14:40
  • 1
    @Damien_The_Unbeliever: Look at [the Wikipedia page](http://en.wikipedia.org/wiki/Null_(SQL)) for SQL null: "indicates... **does not exist**". Other HLL environments use the same definition. To me, there's a clear difference between a value being unknown and a value not existing. If the value was merely unknown, I'd agree that you couldn't equate the two. The known absence of a value should be comparable though. – sohtimsso1970 Aug 16 '11 at 14:44
  • 5
    @Sohtimsso1970 - "that a data value does not exist *in the database*" - it's not an assertion that the value doesn't necessarily exist for the item being modelled, just that it is "missing information [or] inapplicable information". – Damien_The_Unbeliever Aug 16 '11 at 14:50
6

It's a language semantic.

Null is the lack of a value.

is null makes sense to me. It says, "is lacking a value" or "is unknown". Personally I've never asked somebody if something is, "equal to lacking a value".

Justin Niessner
  • 242,243
  • 40
  • 408
  • 536
  • 1
    so if it's just a language semantic, why not let the language parser interpret `= null` the same way as `is null` – kenwarner Aug 16 '11 at 13:28
  • And in fact, MySQL, according to the part of the documentation linked to by the original poster, just plays with you by replying “NULL” if you ask if anything “is equal to NULL” :-) – Arthur Reutenauer Aug 16 '11 at 13:31
  • 3
    Agreed - "=null" seems to imply that two non-existent things are equal... ie., `lochNessMonster=bigFoot` , just because neither of them exist (at least, I think not ;-) ) – Mike C Aug 16 '11 at 13:31
5

I can't help but feel that you're still not satisfied with the answers that have been given so far, so I thought I'd try another tack. Let's have an example (no, I've no idea why this specific example has come into my head).

We have a table for employees, EMP:

EMP
---
EMPNO           GIVENNAME
E0001           Boris
E0002           Chris
E0003           Dave
E0004           Steve
E0005           Tony

And, for whatever bizarre reason, we're tracking what colour trousers each employee chooses to wear on a particular day (TROUS):

TROUS
-----
EMPNO       DATE        COLOUR
E0001       20110806    Brown
E0002       20110806    Blue
E0003       20110806    Black
E0004       20110806    Brown
E0005       20110806    Black
E0001       20110807    Black
E0003       20110807    Black
E0004       20110807    Grey

I could go on. We write a query, where we want to know the name of every employee, and what colour trousers they had on on the 7th August:

SELECT e.GIVENNAME,t.COLOUR
FROM
    EMP e
        LEFT JOIN
    TROUS t
        ON
             e.EMPNO = t.EMPNO and
             t.DATE = '20110807'

And we get the result set:

GIVENNAME       COLOUR
Chris           NULL
Steve           Grey
Dave            Black
Boris           Black
Tony            NULL

Now, this result set could be in a view, or CTE, or whatever, and we might want to continue asking questions about these results, using SQL. What might some of these questions be?

  1. Were Dave and Boris wearing the same colour trousers on that day? (Yes, Black==Black)

  2. Were Dave and Steve wearing the same colour trousers on that day? (No, Black!=Grey)

  3. Were Boris and Tony wearing the same colour trousers on that day? (Unknown - we're trying to compare with NULL, and we're following the SQL rules)

  4. Were Boris and Tony not wearing the same colour trousers on that day? (Unknown - we're again comparing to NULL, and we're following SQL rules)

  5. Were Chris and Tony wearing the same colour trousers on that day? (Unknown)

Note, that you're already aware of specific mechanisms (e.g. IS NULL) to force the outcomes you want, if you've designed your database to never use NULL as a marker for missing information.

But in SQL, NULL has been given two roles (at least) - to mark inapplicable information (maybe we have complete information in the database, and Chris and Tony didn't turn up for work that day, or did but weren't wearing trousers), and to mark missing information (Chris did turn up that day, we just don't have the information recorded in the database at this time)

If you're using NULL purely as a marker of inapplicable information, I assume you're avoiding such constructs as outer joins.


I find it interesting that you've brought up NaN in comments to other answers, without seeing that NaN and (SQL) NULL have a lot in common. The biggest difference between them is that NULL is intended for use across the system, no matter what data type is involved.

You're biggest issue seems to be that you've decided that NULL has a single meaning across all programming languages, and you seem to feel that SQL has broken that meaning. In fact, null in different languages frequently has subtly different meanings. In some languages, it's a synonym for 0. In others, not, so the comparison 0==null will succeed in some, and fail in others. You mentioned VB, but VB (assuming you're talking .NET versions) does not have null. It has Nothing, which again is subtly different (it's the equivalent in most respects of the C# construct default(T)).

Damien_The_Unbeliever
  • 234,701
  • 27
  • 340
  • 448
  • In what language does null=0? Never seen that before. I keep returning to the definition of null being "does not exist" (which is the same as saying "nothing" in VB) because that's what all the definitions say. I shared the Wikipedia link earlier. There are plenty of other sources. The only places I see where null=unknown is in discussions such as this where commenters insist that it means that. In the same Wikipedia page, there's a section specifically talking about "unknown" comparisons and it says that that definition is in conflict with the original spec. – sohtimsso1970 Aug 17 '11 at 12:32
  • I don't personally "care" or "want" it to be one way or another. I've just noticed, objectively, that a) almost all other programming languages allow null=null and b) that SQL defines null as "does not exist" which is in keeping with the same principal. I guess my beef is more with the assertion that null=unknown. – sohtimsso1970 Aug 17 '11 at 12:34
  • Thanks for the added information here. +1 for that. Hope I'm not coming across needlessly argumentative. I appreciate everyone talking about it. – sohtimsso1970 Aug 17 '11 at 12:36
3

The concept is that NULL is not an equitable value. It denotes the absence of a value.

Therefore, a variable or a column can only be checked if it IS NULL, but not if it IS EQUAL TO NULL.

Once you open up arithmetic comparisions, you may have to contend with IS GREATER THAN NULL, or IS LESS THAN OR EQUAL TO NULL

Raj More
  • 47,048
  • 33
  • 131
  • 198
3

NULL is unknown. It is neither true nor false so when you are comparing anything to unknown, the only answer is "unknown" Much better article on wikipedia http://en.wikipedia.org/wiki/Null_(SQL)

billinkc
  • 59,250
  • 9
  • 102
  • 159
2

Because in ANSI SQL, null means "unknown", which is not a value. As such, it doesn't equal anything; you can just evaluate the value's state (known or unknown).

Jeremy Holovacs
  • 22,480
  • 33
  • 117
  • 254
  • 1
    Why can't "not a value" equal "not a value" when doing a comparison? In JavaScript, NaN doesn't equal other NaN variables because you could have two different variable values both of which are NaN. But here, with SQL null, the unknown is always the same. – sohtimsso1970 Aug 16 '11 at 13:33
  • 2
    It's an ANSI SQL definition. In SQL Server, if you shut off ANSI_NULLS option (NULL = NULL) is true ... otherwise, (NULL = NULL) is NULL. – Jeremy Holovacs Aug 16 '11 at 13:35
  • @sohtimsso1970: NaN means “not a number”. It has no specified value, and the fact that you could interpret it as having some specific value in a particular implementation is irrelevant. The same applies to NULL in SQL. – Arthur Reutenauer Aug 16 '11 at 13:38
  • @Arthur: Not exactly. In SQL, null is null is null is null. NaN is different in that the underlying value can be many different things. There is no difference between one SQL result that is null and another. – sohtimsso1970 Aug 16 '11 at 13:43
  • I made an edit on the OP to try and clarify my position. I think the problem is trying to say that **no** value is the same as **unknown** value. (My point about NaN is that you can have two variables that are different, say "abc" and "def", which are both NaN but are not equal) – sohtimsso1970 Aug 16 '11 at 14:01
  • @sohtimsso1970 And this is exactly the case with SQL `NULL` too. `NULL` can mean "unknown", not only "no value". If two values are unknown, then you don't know if they're equal. That is, their equality is unknown. Or in SQL, `NULL = NULL` is `NULL`. – Marnen Laibow-Koser May 29 '14 at 16:46
2

a. Null is not the "lack of a value"

b. Null is not "empty"

c. Null is not an "unset value"

It's all of the above and none of the above.

By technical rights, NULL is an "unknown value". However, like uninitialized pointers in C/C++, you don't really know what your pointing at. With databases, they allocate the space but do not initialize the value in that space.

So, it is an "empty" space in the sense that it's not initialized. If you set a value to NULL, the original value stays in that storage location. If it was originally an empty string (for example), it will remain that.

It's a "lack of a value" in the fact that it hasn't been set to what the database deems a valid value.

It's an "unset value" in that if the space was just allocated, the value that is there has never been set.

"Unknown" is the closest that we can truly come to knowing what to expect when we examine a NULL.


Because of that, if we try to compare this "unknown" value, we will get a comparison that

a) may or may not be valid

b) may or may not have the result we expect

c) may or may not crash the database.

So, the DBMS systems (long ago) decided that it doesn't even make sense to use equality when it comes to NULL.

Therefore, "= null" makes no sense.

Richard
  • 6,215
  • 4
  • 33
  • 48
1

I agree with the OP that

where column_name = null

should be syntactic sugar for

where column_name is null

However, I do understand why the creators of SQL wanted to make the distinction. In three-valued logic (IMO this is a misnomer), a predicate can return two values (true or false) OR unknown which is technically not a value but just a way to say "we don't know which of the two values this is". Think about the following predicate in terms of three-valued logic:

A == B

This predicate tests whether A is equal to B. Here's what the truth table looks like:

    T U F
    -----
T | T U F
U | U U U
F | F U T

If either A or B is unknown, the predicate itself always returns unknown, regardless of whether the other one is true or false or unknown.

In SQL, null is a synonym for unknown. So, the SQL predicate

column_name = null

tests whether the value of column_name is equal to something whose value is unknown, and returns unknown regardless of whether column_name is true or false or unknown or anything else, just like in three-valued logic above. SQL DML operations are restricted to operating on rows for which the predicate in the where clause returns true, ignoring rows for which the predicate returns false or unknown. That's why "where column_name = null" doesn't operate on any rows.

Niko Bellic
  • 2,370
  • 2
  • 29
  • 25
1

In addition to all that has already been said, I wish to stress that what you write in your first line is wrong. SQL does support the “= NULL” syntax, but it has a different semantic than “IS NULL” – as can be seen in the very piece of documentation you linked to.

Arthur Reutenauer
  • 2,622
  • 1
  • 17
  • 15
0

Why don't you use the isnull function?

@where = "where GroupId = "+ isnull(@groupId,"null")
0

NULL doesn't equal NULL. It can't equal NULL. It doesn't make sense for them to be equal.

A few ways to think about it:

  1. Imagine a contacts database, containing fields like FirstName, LastName, DateOfBirth and HairColor. If I looked for records WHERE DateOfBirth = HairColor, should it ever match anything? What if someone's DateOfBirth was NULL, and their HairColor was too? An unknown hair color isn't equal to an unknown anything else.

  2. Let's join the contacts table with purchases and product tables. Let's say I want to find all the instances where a customer bought a wig that was the same color as their own hair. So I query WHERE contacts.HairColor = product.WigColor. Should I get matches between every customer I don't know the hair color of and products that don't have a WigColor? No, they're a different thing.

  3. Let's consider that NULL is another word for unknown. What's the result of ('Smith' = NULL)? The answer is not false, it's unknown. Unknown is not true, therefore it behaves like false. What's the result of (NULL = NULL)? The answer is also unknown, therefore also effectively false. (This is also why concatenating a string with a NULL value makes the whole string become NULL -- the result really is unknown.)

Simon Wright
  • 31
  • 1
  • 3