76

In .NET there is the null reference, which is used everywhere to denote that an object reference is empty, and then there is the DBNull, which is used by database drivers (and few others) to denote... pretty much the same thing. Naturally, this creates a lot of confusion and conversion routines have to be churned out, etc.

Why did the original .NET authors decide to make this? To me it makes no sense. Their documentation makes no sense either:

The DBNull class represents a nonexistent value. In a database, for example, a column in a row of a table might not contain any data whatsoever. That is, the column is considered to not exist at all instead of merely not having a value. A DBNull object represents the nonexistent column. Additionally, COM interop uses the DBNull class to distinguish between a VT_NULL variant, which indicates a nonexistent value, and a VT_EMPTY variant, which indicates an unspecified value.

What's this about a "column not existing"? A column exists, it just doesn't have a value for the particular row. If it didn't exist, I'd get an exception trying to access the specific cell, not a DBNull! I can understand the need to differentiate between VT_NULL and VT_EMPTY, but then why not make a COMEmpty class instead? That would be a much neater fit in the whole .NET framework.

Am I missing something? Can anyone shed some light why DBNull was invented and what problems it helps to solve?

TylerH
  • 20,799
  • 66
  • 75
  • 101
Vilx-
  • 104,512
  • 87
  • 279
  • 422
  • 4
    Just another data point... the DBI (DataBase Interface) module in the Perl language does not have a concept of DbNull. When a value is NULL in the database, DBI represents it as the Perl "undef" which is the Perl equivalent of "null" in C#. So Perl has taken the position that a special "DbNull" concept is not necessary and I haven't heard of any Perl programmers wishing they had DbNull. – JoelFan Oct 12 '11 at 14:17
  • 1
    I'm with you, @JoelFan - I see no genuine use in it either – Marc Gravell Mar 09 '12 at 12:55
  • Besides the answers below, the comment by below by @thomas-levesque is very important: `DBNull` predates the introduction in the .NET framework of genuine nullable types. Both their behaviour is slightly different, so `DBNull` had to stay (to my regret, but that is a different story). – Jeroen Wiert Pluimers Mar 09 '13 at 13:24
  • 3
    I like the question, but have to say the community is inconsistent. Many questions about "why is it this way?" get closed down as "not a question", especially if the asker permits himself to explain why something appears to make no sense. (I tried asking/complaining about the lack of proper support for abstract types when exposing WCF services as web services, and got shot down immediately!) – The Dag May 02 '13 at 17:01

6 Answers6

187

I'm going to disagree with the trend here. I'll go on record:

I do not agree that DBNull serves any useful purpose; it adds unnecessary confusion, while contributing virtually no value.

The argument is often put forward that null is an invalid reference, and that DBNull is a null object pattern; neither is true. For example:

int? x = null;

this is not an "invalid reference"; it is a null value. Indeed null means whatever you want it to mean, and frankly I have absolutely no problem working with values that may be null (indeed, even in SQL we need to correctly work with null - nothing changes here). Equally, the "null object pattern" only makes sense if you are actually treating it as an object in OOP terms, but if we have a value that can be "our value, or a DBNull" then it must be object, so we can't be doing anything useful with it.

There are so many bad things with DBNull:

  • it forces you to work with object, since only object can hold DBNull or another value
  • there is no real difference between "could be a value or DBNull" vs "could be a value or null"
  • the argument that it stems from 1.1 (pre-nullable-types) is meaningless; we could use null perfectly well in 1.1
  • most APIs have "is it null?" methods, for example DBDataReader.IsDBNull or DataRow.IsNull - neither of which actually require DBNull to exist in terms of the API
  • DBNull fails in terms of null-coalescing; value ?? defaultValue doesn't work if the value is DBNull
  • DBNull.Value can't be used in optional parameters, since it isn't a constant
  • the runtime semantics of DBNull are identical to the semantics of null; in particular, DBNull actually equals DBNull - so it does not do the job of representing the SQL semantic
  • it often forces value-type values to be boxed since it over-uses object
  • if you need to test for DBNull, you might as well have tested for just null
  • it causes huge problems for things like command-parameters, with a very silly behaviour that if a parameter has a null value it isn't sent... well, here's an idea: if you don't want a parameter sent - don't add it to the parameters collection
  • every ORM I can think of works perfectly well without any need or use of DBNull, except as an extra nuisance when talking to the ADO.NET code

The only even remotely compelling argument I've ever seen to justify the existence of such a value is in DataTable, when passing in values to create a new row; a null means "use the default", a DBNull is explicitly a null - frankly this API could have had a specific treatment for this case - an imaginary DataRow.DefaultValue for example would be much better than introducing a DBNull.Value that infects vast swathes of code for no reason.

Equally, the ExecuteScalar scenario is... tenuous at best; if you are executing a scalar method, you expect a result. In the scenario where there are no rows, returning null doesn't seem too terrible. If you absolutely need to disambiguate between "no rows" and "one single null returned", there's the reader API.

This ship has sailed long ago, and it is far far too late to fix it. But! Please do not think that everyone agrees that this is an "obvious" thing. Many developers do not see value in this odd wrinkle on the BCL.

I actually wonder if all of this stems from two things:

  • having to use the word Nothing instead of something involving "null" in VB
  • being able to us the if(value is DBNull) syntax which "looks just like SQL", rather than the oh-so-tricky if(value==null)

Summary:

Having 3 options (null, DBNull, or an actual value) is only useful if there is a genuine example where you need to disambiguate between 3 different cases. I have yet to see an occasion where I need to represent two different "null" states, so DBNull is entirely redundant given that null already exists and has much better language and runtime support.

Marc Gravell
  • 1,026,079
  • 266
  • 2,566
  • 2,900
  • 19
    +1, I've never had a situation where I actually wanted DBNull.Value to be differentiated from null. I've always felt this to be pain point, and seen (+ experienced) a lot of time wasted due to the fact null != DBNull.Value – AdaTheDev Mar 09 '12 at 10:05
  • Is there a typo at "DBNull actually equals DBNull", if not - I don't get it :). Shouldn't it be "DBNull actually equals null"? – Alex Mar 09 '12 at 12:34
  • 6
    @Alex no, no typo; and `DBNull` definitely does not equal null; try `bool x = DBNull.Value.Equals(DBNull.Value); bool y = DBNull.Value.Equals(null);`. The point I was trying to make here is that if you try that in an ANSI compliant SQL database you'll find that `null` does not equal `null` (but at the same time, `null` does not "not equal" `null`) – Marc Gravell Mar 09 '12 at 12:42
  • Thanks for the heads up, I was just wondering if there is a case where DBNull is NOT DBNull (In the context of ADO.NET). Got me slightly confused. – Alex Mar 09 '12 at 12:48
  • (moved from http://stackoverflow.com/questions/4958379/what-is-the-difference-between-null-and-system-dbnull-value, by the way - seemed a better fit) – Marc Gravell Mar 09 '12 at 12:52
  • I think the `Nothing` in VB.Net is close to the mark. Vb.Net's `Nothing` is not a direct equivalent to C#'s `null` (it's a closer match for `default(T)`) and therefore cannot fill for VB.Net every function that you suggest `null` would better accomplish vs DBNull.Value in C#. But, yeah: DBNull.Value sucks. I'd like to see the CLR team implement some kind of "auto conversion", so if you pass null to ado.net api methods where only DBNull.Value is valid (or set it as a parameter value, for example), it does the conversion for you. – Joel Coehoorn Mar 09 '12 at 14:47
  • I find your text really confusing when you keep talking about 'null values' (has null ever been a value? My brains keeps saying no ...) and DBNull 'values': DBNull is a data type. So writting that an object can hold a DBNull value is totally ununderstandable for me! – Philippe Grondier Dec 28 '12 at 15:38
  • @Philippe a variable has a value. The value of a variable can be "null" (for reference-types, including `object`, and for nullable--of-T). Yes, null is a value. In the case of reference-type variables, it is the all-zero reference (although technically many low-value non-zero references will also be interpreted as null in some scenarios, due to an implementation detail). An `object` variable can hold a reference to an instance of `DBNull`; that last, perhaps, I did abbreviate - but I don't think the meaning is lost. – Marc Gravell Dec 28 '12 at 22:20
  • @Marc thanks for your answer. In my idea (and I think it's quite a common one when we talk maths), the concept of value does not exist outside the basic concept of comparison. As nulls cannot be compared, nor they can be ordered, added, etc, they cannot be called 'values'. In fact, Null is a representation of a lack of value. This said, this concept is not rigourously implemented in computer languages, where approximations like (comparison)"if my_x = null" or (assignment)"x = null" are common. So we could say that, when it comes to computer science, null is a value ... but I don't like it! – Philippe Grondier Dec 29 '12 at 10:27
  • @Philippe I'm talking about programming, so the "value of null" is simply the sequence of bits that represents the concept of null, which incidentally in the context of object is simply "no object". The term "null" is overloaded, and this is not the math usage. Although actually the concept if null doesn't actually come up in the majority of maths. – Marc Gravell Dec 29 '12 at 17:57
  • Might it be that DBNull was introduced to have a CLR type with the same *semantics* as database null values? Recall that null is unequal to anything, including null, in relational databases, so that for instance selecting something where Col1 = Col2 does not match where both columns "have no value". Null is indeed meant to represent "no value", not to serve as a "magic value". To talk about which bit pattern is used to represent the concept of "does not have a value" is to completely miss the point. – The Dag May 02 '13 at 16:14
  • 1
    @Dag I commented on that: if that was the intent it fails, as discussed in my answer – Marc Gravell May 02 '13 at 16:48
  • @Marc I see now that you did, on the semantics. But the second point stands (parameters with default values). Here we have a situation where it may be "my value" or DBNull or CLR null, the latter meaning "use the default value". Of course, the latter could be achieved in other ways (e.g. omitting the parameter), but I don't think that would be logical. Not saying DBNull is a good thing; I'm just not convinced it's a bad thing. – The Dag May 02 '13 at 16:57
  • Check for IsDBNull is just something I have come to accept. An outer join can produce an absence of a value. Might be some optimization of not producing a null value. I am probably giving Reader too much credit but it might be more efficient to return a IsDBNull than return a null. – paparazzo Sep 03 '13 at 17:14
  • 1
    @Blam no, it is not more efficient to use a dbnull rather than a null. And yes, some joins etc leave an absence of data - a null: but that doesn't dictate null vs dbnull – Marc Gravell Sep 03 '13 at 18:03
  • It makes no sense to assert that `DBNull` is redundant with `null` because [null is evil](http://code.google.com/p/guava-libraries/wiki/UsingAndAvoidingNullExplained) and a design flaw. Granted, being forced to use `object` is also a design flaw but it's incorrect to say that only `object` could hold `DBNull` or another value - [a sum type would do the trick](http://bugsquash.blogspot.com/2012/01/encoding-algebraic-data-types-in-c.html). – Doval Jan 22 '14 at 19:05
  • It's possible to avoid two of your pain points, 1: being forced to work with `object` and 2: having broken null-coalescing in return for using the `System.Data.SqlTypes` namespace. This is not ideal (I'd rather use the built in types), but it works. For example, `int? example = null;SqlInt32 nocast = example ?? SqlInt32.Null;` From a code-writing perspective, this isn't really much of a win, but at least it's easy to read. – Brian May 06 '14 at 13:25
  • @Brian yes, but the problem is to fix them both **with the same code**. Either of those "fixes" is mutually exclusive with the other. – Marc Gravell May 06 '14 at 13:28
  • @Brian ah, I misread what you were saying. The problem *then* is that you are now tying to `SqlClient` - which I strongly advise against in the general case. There are many, many good reasons to use naked ADO.NET types (not provider-specific types) as far as possible. – Marc Gravell May 06 '14 at 14:05
48

The point is that in certain situations there is a difference between a database value being null and a .NET Null.

For example. If you using ExecuteScalar (which returns the first column of the first row in the result set) and you get a null back that means that the SQL executed did not return any values. If you get DBNull back it means a value was returned by the SQL and it was NULL. You need to be able to tell the difference.

Colin Mackay
  • 18,736
  • 7
  • 61
  • 88
  • 21
    Alright, this is one good example, I suppose. Though not enough reason alone to justify making the whole mess. `ExecuteScalar` could have been rewritten differently to solve this particular problem (`DBEmptyRowset`, anyone?) – Vilx- Dec 20 '10 at 10:32
  • 2
    I'm happy with the way that Microsoft wrote ADO.NET. The use of DBNull keeps everything consistent. Any time a column value is null you get a DBNull. Sure the code to handle that can be a little more cumbersome that a simple null being returned, especially if all you do is convert DBNull to null, but at the end of the day consistency is key. If they had varied ExecuteScalar because you need to tell the difference in that scenario then it would have jarred a lot more. Of course, this is just my opinion. – Colin Mackay Dec 20 '10 at 11:01
  • 9
    Varying one `ExecuteScalar()` method vs varying all the DB providers, DataSets/Tables/Rows/Columns, data-bound controls, and what not else? I'd go for the first one. At the end of the day, that would make all code a LOT more consistent. And simple. Besides - `ExecuteScalar` is one of the least used methods in the data providers. >90% cases are with DataReader. Plus - do you really think that a `bool ExecuteScalar(out object Value)` would be that inconsistent? – Vilx- Dec 20 '10 at 11:06
  • @Blam - OK, I deleted mine too. :) Anyway, I don't think you need to worry. `DBNull` and the old APIs are here to stay - too much code relies on it. :) – Vilx- Sep 04 '13 at 06:26
  • @Vilx- "do you really think that a bool ExecuteScalar(out object Value) would be that inconsistent?" No, but it also jars. Anything with an `out` parameter jars. But then again, C# doesn't handle Tuples well (a bit of an omission to the language I think) – Colin Mackay Dec 12 '13 at 09:16
  • @ColinMackay - Ahh, the old grudges. :) The `bool/out` pattern is consistent with dictionaries, that's why I chose it originally. Alternatively it could just return an `object`, and then create a special value for an empty rowset, as opposed to a special value for an empty value. Or maybe return null, and have an optional `out bool` parameter, because in most cases you don't really care about the empty-rowset case. – Vilx- Dec 12 '13 at 09:45
  • @Vilx- Although it kinda contradicts the name of the method, `ExecuteScalar` could also return an `IEnumerable` with one or zero values. – IS4 Jul 14 '15 at 00:23
  • They could have just returned a `ScalarResult` with a `HasValue` and `Value` property. There is no good reason for the many other methods return `DBNull` instead of `null`. It's just poor design. – Lukazoid Oct 14 '15 at 13:56
  • There are many things they could have done. But what is done is done and Microsoft have to support it to ensure that all existing apps that have it don't break. – Colin Mackay Oct 14 '15 at 16:36
  • 1
    Personally, I would have preferred `ExecuteScalar` throw an exception (`DbException`) if there wasn't a value to return (no rows, or no columns), and used `null` instead of `DBNull` – Robert McKee Jul 11 '16 at 20:55
  • @ColinMackay Good point you told in your answer. but isn't the main difference between DBNULL and .NET null is; DBNULL is HardDisk NULL .NET null is Memory NULL. Since HarDisk Allocates a space for a record even if there is nothing there but memory doesn't allocate any space for null pointer since it doesn't refer any heap adress; – yigitt Dec 21 '16 at 08:48
  • 1
    @UfukSURMEN I don't get what you mean - Both memory and hard disk are storage mediums (albeit one more ephemeral than the other) so in that respect there is no difference. If the row exists then a value for a column in that row may return DBNULL to indicate that the column for that row has no value. For `ExecuteScalar` NULL means that no rows were returned, and DBNULL is that at least one row was returned but the first column has no value. – Colin Mackay Dec 21 '16 at 22:51
15

DbNull represents a box with no contents; null indicates the non-existence of the box.

Rikalous
  • 4,514
  • 1
  • 40
  • 52
  • 8
    Really? Care to elaborate on this? I kinda don't get it. Especially, since a variable ("the box") doesn't disappear when you assign a `null` to it. – Vilx- Dec 20 '10 at 10:40
  • 2
    If you consider the variable to be the address of the box - then when you assign a null to the variable, you're telling the variable that the box no longer exists. That is, the address is not 'visitable'. – Rikalous Dec 20 '10 at 10:42
  • I agree that the documentation is plain wrong about the column not existing. – Rikalous Dec 20 '10 at 10:44
  • To the extent the documentation is wrong, I suspect that it was poorly written rather than misguided. – phoog Jan 10 '12 at 02:09
  • CREATE PROC [Echo] @s varchar(max) = 'hello' AS SELECT @s [Echo]. SqlCommand cmd = new SqlCommand("Echo"); cmd.Parameters.AddWithValue("@s", null); Execute the command and look at the result (it will use the parameter's defualt value). Now change the parameter value to DBNull.Value - the proc will echo the supplied value, NULL. There is a semantical difference between DBNull and null. You could achieve the same by modifying the parameters, but here the same command can be used many times, changing only parameter values - and this is considerably faster. – The Dag May 02 '13 at 16:19
  • 1
    @TheDag, I don't think this functionality is worth the hassle of checking and converting both null and dbnull when using nullable objects. – drigoangelo Nov 14 '13 at 14:24
0

There are some differences between a CLR null and a DBNull. First, null in relational databases has different "equals" semantics: null is not equal to null. CLR null IS equal to null.

But I suspect the main reason is to do with the way parameter default values work in SQL server and the implementation of the provider.

To see the difference, create a procedure with a parameter that has a default value:

CREATE PROC [Echo] @s varchar(MAX) = 'hello'
AS
    SELECT @s [Echo]

Well-structured DAL code should separate command creation from use (to enable using the same command many times, for example to invoke a stored procedure many times efficiently). Write a method that returns a SqlCommand representing the above procedure:

SqlCommand GetEchoProc()
{
    var cmd = new SqlCommand("Echo");
    cmd.Parameters.Add("@s", SqlDbType.VarChar);
    return cmd;
}

If you now invoke the command without setting the @s parameter, or set its value to (CLR) null, it will use the default value 'hello'. If on the other hand you set the parameter value to DBNull.Value, it will use that and echo DbNull.Value.

Since there's two different results using CLR null or database null as parameter value, you can't represent both cases with only one of them. If CLR null was to be the only one, it'd have to work the way DBNull.Value does today. One way to indicate to the provider "I want to use the default value" could then be to not declare the parameter at all (a parameter with a default value of course makes sense to describe as an "optional parameter"), but in a scenario where the command object is cached and reused this does lead to removing and re-adding the parameter.

I'm not sure if I think DBNull was a good idea or not, but a lot of people are unaware of the things I've mentioned here, so I figured it worth mentioning.

The Dag
  • 1,811
  • 16
  • 22
  • 1
    I've now learned something as well: DBNull does not actually have the equals semantics of database nulls. The second point still stands though - DBNull or null really do work differently as parameter values. – The Dag May 02 '13 at 16:59
  • 1
    I think it is common belief that the DBNull and null discrepancy originally existed because .NET and SQL Server were originally built by separate teams. It's arguable if the two nulls exist now as a legacy of this separate of teams, or as something that actually benefits more than annoys. – 1c1cle Nov 17 '13 at 20:07
0

You use DBNull for missing data. Null in the .NET language means that there is no pointer for an object/variable.

DBNull missing data: http://msdn.microsoft.com/en-us/library/system.dbnull.value.aspx

The effects of missing data on statistics:

http://en.wikipedia.org/wiki/Missing_values

jvdbogae
  • 1,241
  • 9
  • 15
  • 1
    What about `Nullable` then? It's a value type, no pointers involved. Maybe it's a mistake then? – Vilx- Dec 20 '10 at 10:37
  • And also - what else is a "no pointer for an object" if not missing data? – Vilx- Dec 20 '10 at 10:38
  • 3
    @Vilx, nullable types didn't exist in .NET 1.0/1.1, so another way of representing null values was necessary – Thomas Levesque Dec 20 '10 at 10:39
  • 1
    @Thomas Levesque - your comment is worthy to be an answer, I'd say. Though it still doesn't explain why a simple "null" couldn't have been used just as well. To create a variable that could hold both a value type and DBNull, you'd have to make it a "System.Object" anyway. – Vilx- Dec 20 '10 at 10:43
  • 3
    @Thomas that logic doesn't work; to store a "value or `DBNull`" we need to use `object` - that remains the case in 2.0+; `object` can store `null` perfectly well - so even when using `object`, the `DBNull.Value` is still redundant – Marc Gravell Mar 09 '12 at 12:57
0

To answer your question, you have to consider Why Does DBNull even exist?

DBNull is necessary in a narrow use case. Otherwise, it is not. Most people never need DBNull. I never allow them to be entered into data stores I design. I ALWAYS have a value, therefore my data is never "<null>", I always choose a type meaningful 'default value', and I never have to do this absurd double check everything twice in code, once for is the object null, and again for is it DBNull before I cast my object to my actual data type (such as Int, etc).

DBNull is necessary in one case you might need... if you use some of the SQL statistics functions (eg: median, average) .. they treat DBNull specially.. go look at those docs.. some functions do not include a DBNull in the total count for the statistic... eg: (87 sum / 127 total) vs. (87 sum / 117 total) .. the difference being that 10 of those column values were DBNull... you can see this would change the statistics result.

I have no need to design my databases with DBNull. If I ever needed statistical results, I would explicitly invent or add a column such as 'UserDidProvideValue' for that one item that needs some sort of special handling because it does not exist (eg my total of 117 would be the sum of the fields marked UserDidProvideValue=true) ... haha lol - in my next life as ruler of the Universe lol - DBNull would have never been allowed to escape the SQL realm... the entire programming world is now burdened to check everything twice... when have you ever had a mobile app or desktop app or website need to have a "null" integer? - Never...

  • Oooh, going for the Necromancer badge? :) I see more than one good use case for the `null` value in DB. In fact I sometimes I wish I could define more than 1 special value. – Vilx- Jun 27 '20 at 16:15
  • Here's one very common use case: optional foreign keys. If there's a value (even a default one), then you need a record in the remote table. Null values however do not require a record in the remote table. You could of course say - "OK, but then _create_ that default record in the remote table with a fixed ID". But then you need to hardcode DB IDs in your application code, and you also need to remember to skip that record wherever you list the values from that remote table. – Vilx- Jun 27 '20 at 16:16
  • I do see how almost any `null` value could be removed by splitting one column into two - one would contain "nullness" (and maybe even other special values, like I wanted above) and the other actual data (which is only used when "nullness" is `false`). But would that be better? I'm not sure... my first instinct says that there would be many, many more `IFs` in the code and `CASEs` in the SQL since I now need to look at two columns instead of one. And I'd need lots of `CHECK` constraints to ensure data integrity. – Vilx- Jun 27 '20 at 16:24