34

Why can values be null in .NET? Is this superior to having a guarantee where everything would have a value and nothing can be null?

Does anyone know what each of these methodologies is called?

Either way, I am not very knowledgeable on this, but wouldn't having a value for everything make things easier in terms of simplicity, i.e. eliminating null checks and being able to write more streamlined algorithms that don't have to branch out for checks?

What are the pros and cons of each style in terms of performance, simplicity, parallelism, future-proofing, etc.?

Joan Venge
  • 315,713
  • 212
  • 479
  • 689
  • Personally, I'd rather have to check for a null value than catch an exception which is likely what you'd have to do otherwise. – Metro Smurf Mar 01 '11 at 00:10
  • 3
    @Metro Smurf not necessarily, some languages treat any calls to a null reference as a black hole, like objective-c. To me, the pros and cons of that approach is a more interesting question. – Matt Greer Mar 01 '11 at 00:18
  • 6
    This is the sort of question I'd love @Eric Lippert to weigh in on... – Dan J Mar 01 '11 at 00:40
  • 1
    @Ravi Gummadi `Nullable` does not simplify anything wrt. null checks, unfortunately (while handy in some cases, it arguably complicates the situation!). It is constrained to only value-types and has no language support for pattern-matching: it must still be guarded with fallible if-conditionals if needed to check the "null"-state (e.g. when just a "get value or default" or "??" doesn't work). –  Mar 01 '11 at 02:07
  • Many of the languages listed that don't have or prefer to do without the notion of null are **functional** languages. Null is an object-oriented concept. The ability of a variable to represent the absence of an object is virtually indispensable in an object-oriented language. – snarf Mar 01 '11 at 02:28
  • 1
    Doesn't this belong on Programmers? – Neal Tibrewala Mar 01 '11 at 04:51
  • 3
    @Snarfblam I reject that statement. `null` exists in "mainstream OO languages" (C++, Java, C#, Python, Obj-C, JavaScript, even Scala, and whatnot) -- yet there is nothing in these languages that *require `null`* -- I suspect most is because of 1) "that's how language X works" or 2) "that's how language X works" (there is no "because it's required" reason). It could be completely replaced with an appropriate Option/Maybe type *at the language level* (but then it would be *an entirely different language* -- but could still be "subtype-polymorphic OO"!). Also, consider `NULL` in "non-OO"-C. –  Mar 01 '11 at 06:33
  • @pst, I never meant that null is positively required. It's simply an extremely useful concept. I also never said that it's impossible for an OO language to do without null or that a non-OO language must not have null. Nulls in C are beside the point. Option/Maybe just boils down to the same thing as null, no? Though I'll agree that it sounds like a great idea to require that variables must be explicitly declared as being able to be null. I'd be really interested to see how many concepts and patterns are approached without null. – snarf Mar 01 '11 at 23:28
  • 1
    @Snarfblam `null` itself is a useful concept only insofar as it has been programmed into memory as such -- needing to know "the lack of something" is useful, indeed. However, supporting the `null` concept idiomatically ... much less useful for a modern *statically typed* language. There has been some attempt in Java to introduce `@notnull` which indicates (imoho) that *"nothing" is a general type problem not well-handled with `null`* –  Mar 06 '11 at 20:55
  • @pst, I hear you. Like I said, I agree that it sounds like a great idea to require that variables must be explicitly declared as being able to be null. – snarf Mar 07 '11 at 23:48
  • There's a distinction between the C# language and the .NET platform. Are you asking why null exists in the .NET platform that runs beneath all .NET languages, or are you asking why the C# language exposes null to the programmer? Different questions. – dthorpe Aug 02 '12 at 17:18
  • possible duplicate of [Why is "null" present in C# and java?](http://stackoverflow.com/questions/178026/why-is-null-present-in-c-sharp-and-java) – nawfal Apr 01 '13 at 20:19

11 Answers

38

We've got Tony Hoare, an early pioneer who worked on Algol, to thank for that. He rather regrets it:

I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.

A billion is a low-ball number, I think.


UPDATE: C# version 8 and .NET Core have a decent solution for this problem: check out non-nullable reference types.
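
A minimal sketch of what that looks like (Greeter, _name and _nickname are made-up names for the example):

#nullable enable

class Greeter
{
    // Non-nullable: the compiler warns if this could end up null.
    private readonly string _name;

    // Nullable: callers are expected to check before dereferencing.
    private readonly string? _nickname;

    public Greeter(string name, string? nickname)
    {
        _name = name;
        _nickname = nickname;
    }

    public string Greet()
    {
        // Dereferencing _nickname without a check would draw a compiler warning.
        return _nickname != null ? $"Hi, {_nickname}!" : $"Hello, {_name}.";
    }
}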

Hans Passant
  • 922,412
  • 146
  • 1,693
  • 2,536
  • Thanks Hans. This is exactly the sort of stuff I am looking for. I remember in one of Anders' video where he was also saying they made a mistake for this or something like that but I don't remember the video, but it was from C9. – Joan Venge Mar 01 '11 at 00:12
  • 4
    Ah, good reference, but how would a world without `null`s look? How do you terminate a List? – H H Mar 01 '11 at 00:12
  • 4
    @Henk: Not that I'd necessarily advocate it as an absolute, but just like we do with `Nullable`: a flag. – Adam Robinson Mar 01 '11 at 00:15
  • @Henk - F# [does it like this](http://msdn.microsoft.com/en-us/library/dd233245.aspx), and it's really nice to work with in the context of that language. – Joel Mueller Mar 01 '11 at 00:15
  • 3
    @Joel and under the hood F# actually uses `null` to implement several of its constructs (Option in particular) :) – JaredPar Mar 01 '11 at 00:17
  • 1
    @Joan, @Hans, from memory: some big-shot from the SQL world once wrote a similar mea-culpa about NULL. – H H Mar 01 '11 at 00:18
  • I wish I could find Anders' video too. I am not 100% sure, but I am more sure than not. – Joan Venge Mar 01 '11 at 00:19
  • @JaredPar - I'm aware of that. The CLR is what it is, under the hood. But behind the wheel, the type system tells me if a given function might or might not return a value, and forces me to deal with the possibilities in a way that makes it nearly impossible for my code to be the origin of a null-reference exception. It would be nice if the framework libraries were as reliable. – Joel Mueller Mar 01 '11 at 00:20
  • 4
    @Joel I 100% agree. I've gone so far as to introduce a `Maybe` type into many of the code bases I work on. I find it produces much better code in the end. – JaredPar Mar 01 '11 at 00:22
  • @Jared: What does Maybe do? – Joan Venge Mar 01 '11 at 00:25
  • 2
    @Joan it provides a declarative way of expressing a reference can be `null` / not present. It's developed in the style of the F# option type. http://blogs.msdn.com/b/jaredpar/archive/2008/10/08/functional-c-providing-an-option-part-2.aspx – JaredPar Mar 01 '11 at 00:27
  • 4
    I don't see how the invention of the null pointer was a mistake. Having a standard way of defining that it should not be possible to dereference a pointer seems like a very good idea. The run-time cost of having a pointer with a single recognizable null value is in many cases less than the cost of allowing for a wider variety of null values. The only design mistake I see with regard to null pointers is the fact that many C compilers allow one to index a null pointer without trapping, yielding an invalid pointer that will no longer be recognized as null. – supercat Mar 08 '11 at 19:35
24

As appealing as a world without null is, it does present a lot of difficulty for many existing patterns and constructs. For example, consider the following constructs, which would need major changes if null did not exist (each is illustrated in the sketch after the list):

  1. Creating an array of reference types, e.g. new object[42]. In the existing CLR world the array slots would be filled with null, which would be illegal here. Array semantics would need to change quite a bit
  2. It makes default(T) useful only when T is a value type. Using it on reference types or unconstrained generics wouldn't be allowed
  3. Fields in a struct which are a reference type would need to be disallowed. A value type can be 0-initialized today in the CLR, which conveniently fills fields of reference types with null. That wouldn't be possible in a non-null world, hence fields whose type is a reference type would need to be disallowed in structs
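
A rough sketch of those three constructs in today's C#, with comments noting what would break in a null-free world (Widget, Holder and the method names are made-up placeholders):

using System;

class Widget { }

struct Holder
{
    public Widget Item;    // (3) today the CLR zero-fills this to null
}

static class NullFreeProblems
{
    static void Demonstrate()
    {
        // (1) Today every slot starts out null; without null, 42 valid
        //     Widget instances would have to come from somewhere.
        Widget[] widgets = new Widget[42];

        // (2) default(T) is null for reference types today; without null it
        //     would only make sense for value types.
        Widget w = default(Widget);

        // (3) new Holder() zero-initializes the struct, so Item is null today.
        Holder h = new Holder();

        Console.WriteLine(widgets[0] == null && w == null && h.Item == null);  // True
    }
}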

None of the above problems are unsolvable, but they do result in changes that really challenge how developers tend to think about coding. Personally I wish C# and .NET had been designed without null, but unfortunately they weren't, and I imagine problems like the above had a bit to do with it.

JaredPar
  • 733,204
  • 149
  • 1,241
  • 1,454
  • Thanks Jared. This is what I am wondering. Why didn't they design NET without null? Also if they did, would this not be superior in terms of performance and code simplicity? Also can you please explain #3, like why fields in structs aren't allowed? – Joan Venge Mar 01 '11 at 00:17
  • 1
    @Joan, suppose I had `struct S1 { object field1; }`. What would the value of `field1` be after the expression `new S1()` executed? Today it would be `null` but in a world without `null` this would be illegal hence either default initialization of structs or reference type fields in structs would need to be illegal – JaredPar Mar 01 '11 at 00:20
  • 1
    Thanks Jared, that makes a lot of sense. So if smart people like you working at MS could see the usefulness in a world without null, then who is making the decisions against it before they started designing .NET. I assume this is something that was debated and decided against for some reason. – Joan Venge Mar 01 '11 at 00:24
  • 2
    @Joan unfortunately I wasn't around back then and don't know the details of the debate (if there actually was one). Java compatibility was a factor in several design decisions and my **guess** is that it also played into the decision around `null`. But once again I don't have any direct knowledge of this. – JaredPar Mar 01 '11 at 00:26
  • Thanks Jared, appreciate your take on this. – Joan Venge Mar 01 '11 at 00:27
  • 3
    A big design backing with .NET comes from being compatible with existing libraries. It's why we have pointers and value types with structs that can be cast directly to native types in C among other things. This was a slight departure from the ideology java has (banning unsigned types and function pointers there). That means bringing along null. – Zac Bowling Mar 01 '11 at 03:51
  • @JaredPar Rather than designing without `null`, what would help is language support for non-nullable reference types, similar to how we have support for nullable value types. – Sam Harwell Jan 18 '14 at 03:00
  • A Maybe type like in Haskell would help solve these problems. The advantage of this over nulls is being able to control where a nothing value is allowed and where it isn't. In most code you don't need or want to allow nulls. – Martin Capodici Jun 29 '16 at 20:47
9

This reminds me of an episode of James Burke's "Connections" series where monks were transcribing Arabic to Latin and first encountered a zero digit. Roman arithmetic did not have a representation for zero, but Arabic/Aramaic arithmetic did. "Why do we have to write a letter to indicate nothing?" argued the Catholic monks. "If it is nothing, we should write nothing!"

Fortunately for modern society, they lost the argument and learned to write zero digits in their maths. ;>

Null simply represents the absence of an object. There are programming languages which do not have "null" per se, but most of them do still have something to represent the absence of a legitimate object. If you throw away "null" and replace it with something called "EmptyObject" or "NullNode", it's still a null, just with a different name.

If you remove the ability of a programming language to represent a variable or field that does not reference a legitimate object (that is, if you require that every variable and field always contain a true and valid object instance), then you make some very useful and efficient data structures awkward and inefficient, such as a linked list. Instead of using null to indicate the end of the linked list, the programmer is forced to invent "fake" object instances to serve as list terminators that do nothing but indicate "there's nothing here".
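
A hedged sketch of the two styles (Node and TerminatedNode are made-up names): today's null-terminated list next to the kind of terminator object a null-free language would force on you.

// Today: null marks the end of the chain.
class Node
{
    public int Value;
    public Node Next;            // null means "there is no next node"
}

// In a null-free world, a do-nothing terminator instance has to play that role.
class TerminatedNode
{
    // Shared "end of list" marker; by convention its fields are never read.
    public static readonly TerminatedNode End = new TerminatedNode(0);

    public int Value;
    public TerminatedNode Next;

    public TerminatedNode(int value)
    {
        Value = value;
        Next = End;              // every new node points at the terminator by default
                                 // (while End itself is being constructed this still sees null --
                                 // exactly the kind of awkwardness described above)
    }
}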

Delving into existentialism here, but: If you can represent the presence of something, then isn't there a fundamental need to be able to represent the absence of it as well?

dthorpe
  • 35,318
  • 5
  • 75
  • 119
6

I speculate that null exists in .NET because it (C#) followed in the C++/Java footsteps (and has only started to branch out in more recent versions), and because VB/J++ (which became VB.NET/J#) already had the notion of "nothing" values -- that is, .NET has null because of what was, and not because of what it could have been.

In some languages there is no notion of null -- null can be completely replaced with a type like Maybe -- there is Something (the object) or Nothing (but this is not null! There is no way to get the "Nothing" out of a Maybe!)

In Scala with Option:

val opt = Some("foo") // or perhaps, None
opt match {
   case Some(x) => x.toString() // x not null here, but only by code-contract, e.g. Some(null) would allow it.
   case _ => "nothing :(" // opt contained "Nothing"
}

This is done by language design in Haskell (null is not possible ... at all!) and by library support and careful usage such as in Scala, as shown above. (Scala supports null -- arguably for Java/C# interop -- but it is possible to write Scala code without using that fact, unless null is allowed to "leak" in.)
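
For comparison, a rough sketch of what such an Option type could look like in C# (this is not a framework type; Option, Some, None and Match are made-up names for the sketch):

using System;

// A minimal Option/Maybe sketch: a value is either Some(value) or None, never null.
public abstract class Option<T>
{
    public static Option<T> Some(T value) => new SomeOption(value);
    public static readonly Option<T> None = new NoneOption();

    // The only way to "get the value out" is to say what to do in both cases.
    public abstract TResult Match<TResult>(Func<T, TResult> ifSome, Func<TResult> ifNone);

    private sealed class SomeOption : Option<T>
    {
        private readonly T _value;
        public SomeOption(T value) { _value = value; }
        public override TResult Match<TResult>(Func<T, TResult> ifSome, Func<TResult> ifNone)
            => ifSome(_value);
    }

    private sealed class NoneOption : Option<T>
    {
        public override TResult Match<TResult>(Func<T, TResult> ifSome, Func<TResult> ifNone)
            => ifNone();
    }
}

Given some Option<string> opt, pattern matching then mirrors the Scala example above: opt.Match(x => x.ToString(), () => "nothing :(").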

Edit: See Scala: Option Pattern, Scala: Option Cheat Sheet and SO: Use Maybe Type in Haskell. Most of the time, talking about Maybe in Haskell brings up the topic of Monads. I won't claim to understand them, but here is a link along with usage of Maybe.

Happy coding.

Community
  • 1
  • 1
  • Thanks, posting the relevant comment here: "Btw pst, Haskell is the only language that has no null? But Maybe? And that Maybe never returns null/nothing?" – Joan Venge Mar 01 '11 at 00:32
  • 1
    @Joan Venge I suspect there are other languages (esp. dependent-typed languages) with no `null`. There is just *no way* to get "Nothing" out of Maybe (one can store a `null` in an Option in Scala, but generally this is not done as it mitigates a huge advantage of Option to start with -- Some(not_null_value)/None would be the idiomatic encoding :-) –  Mar 01 '11 at 00:34
  • Thanks pst, I will have to read more about this Maybe stuff because I am not familiar with it at all. Just don't get it currently :O – Joan Venge Mar 01 '11 at 00:36
  • 1
    @Joan Venge I added some links to my post. Note the None/Nothing branches have no access to any new values! Happy coding. –  Mar 01 '11 at 00:42
  • Thanks pst. I will be sure to read your stuff. Are you a Haskell expert? – Joan Venge Mar 01 '11 at 00:46
  • 1
    @Joan Venge Haha. No. Not by any means -- "Hello world!" is about my limit. I do fairly well in the [A "Gentle" Introduction to Haskell"](http://www.haskell.org/tutorial/) up until Monads (chapter 6'ish?) -- never was able to make it past that. Scala, warts and all, is much more my cup of tea. –  Mar 01 '11 at 00:49
  • Thanks pst. I always hear about Haskell so it intrigues me. But is it super hard to learn? I was hoping it would open my horizons without being extremely complex. – Joan Venge Mar 01 '11 at 00:51
  • 1
    @Joan Venge Just approach Haskell with an open-mind. I introduced Haskell to my (formerly imperative-C#-only friend) after I introduced him to Scala (during our CS program we strived to use different languages for each assignment ;-). He uses Scala for some projects now (IMOHO: Scala > C# [in many cases] >>> Java ;-), but he *absolutely loves* Haskell and uses it whenever he has a chance. (There is only much to learn, and unless time is never wasted, nothing to lose ;-) –  Mar 01 '11 at 01:01
  • Thanks pst, appreciate your advice. I am also a C# only guy, I have used some languages like Python but not really a fan of it. But you think Scala is better than Haskell? Are they in the same category? What about Lisp? I hear that a lot too but not as much as Haskell. Also have interest in functional languages like F# so would be cool to have some experience in those. – Joan Venge Mar 01 '11 at 01:14
  • 1
    @Joan Venge I like Scala (I'm biased): Scala is "statically-typed OO + many functional constructs + good Java integration". It is not a perfect language: it doesn't try to be. Its strength (and flaw) is that is designed to be able to *replace* Java. Haskell is a "truly functional" language. It was designed by people who love to talk about lambda calculus and other academic topics and, while still not the "ultimate language", is a "pure" (haha, pun!) language ;-) F#/Ocaml are "functional + OO features"; an island between Scala and Haskell. Python is dynamically typed and thus not comparable. –  Mar 01 '11 at 01:50
5

Ok, now warp to the magical world of C#-without-null:

class View
{
    Model model;        

    public View(Model model)
    {
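        // Note: this.model is read on the next line before it is assigned --
        // that is the whole point of the example.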
        Console.WriteLine("my model : {0}, thing : {1}", this.model, this.model.thing);
        this.model = model;
    }
}

What is printed on the console?

  • Nothing; an exception about accessing an un-initialized object is thrown: Ok, call this a NullReferenceException, and that's the current world.
  • It doesn't build; the user needed to specify a value when declaring model. See the last bullet point, as it creates the same result.
  • Some default for the model and some default for the thing: Ok, before, with null, we at least had a way to know whether an instance was correct; now the compiler is generating strange look-alikes that don't contain anything but are still invalid as model objects...
  • Something defined by the type of the objects: Better, as we could define a specific invalid state per object, but now each object that could be invalid needs to implement this independently, along with a way for the caller to identify this state...

So basically, to me it doesn't seem to solve anything to remove the null state, as the possibly-invalid state still needs to be managed anyway...

Oh, by the way, what would be the default value of an interface? Or an abstract class: what would happen when a method defined in the abstract class is called on a default value, but that method calls another method that is abstract? ... Why oh why complicate the model for nothing? It's the multiple-inheritance questions all over again!

One solution would be to change the syntax completely and go for a fully functional one where the null world doesn't exist, only Maybes when you want them to exist... But it's not a C-like language and the multi-paradigm-ness of .NET would be lost.

What might be missing is a null-propagating operator able to return null for model.Thing when model is null, like model.?.Thing
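
For what it's worth, C# 6 later added exactly such an operator, written model?.Thing. A minimal sketch, with Model and Thing as made-up stand-ins for the types in the example above:

class Thing { public string Name; }

class Model { public Thing Thing; }

static class Demo
{
    static string Describe(Model model)
    {
        // Evaluates to null when model (or model.Thing) is null,
        // instead of throwing a NullReferenceException.
        return model?.Thing?.Name ?? "unknown";
    }
}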


Oh, and for good measure, an answer to your question:

  • The current class library evolved after the Microsoft-Java debacle, and C# was built as a "better Java", so changing the type system to remove null references would have been a big change. They already managed to introduce value types and remove manual boxing!
  • As the value-type introduction shows, Microsoft thinks a lot about speed... The fact that the default for all types maps to a zero fill is really important for fast array initialization, for example. Otherwise, initialization of arrays of reference values would have needed special treatment.
  • Without null, interop with C would not have been possible, so at least at the MSIL level and in unsafe blocks nulls need to be allowed to survive.
  • Microsoft wanted to use the framework for VB6++; removing Nothing, as it is called in VB, would have radically changed the language. It already took years for users to switch from VB6 to VB.Net; such a paradigm change might have been fatal for the language.
Julien Roncaglia
  • 17,397
  • 4
  • 57
  • 75
3

Well, values (value-type vars) can only be null since nullable types were introduced in Fx2 (.NET 2.0).

But I suppose you mean:

Why can references be null ?

That is part of the usefulness of references. Consider a Tree or a LinkedList: they would not be possible (unable to end) without null.

You could come up with many more examples, but mainly null exists to model the concept of 'optional' properties/relationships.

H H
  • 263,252
  • 30
  • 330
  • 514
  • 3
    Seems to me those constructs could still exist; you'd just need something like a `NullNode` object or some similar concept. Granted I have a feeling whatever you could come up with to address this would not really be any better than `null` itself. – Dan Tao Mar 01 '11 at 00:14
  • 1
    @Dan: Yes, but then we would ask that 'terminator' object for its value or its Next. "Trust me, I'll bee bahck" ... with an Exception. – H H Mar 01 '11 at 00:22
  • 1
    Indeed. Some languages have no notion of `null` -- `null` can be *completely* replaced with a type like Maybe -- there is Something (the object) or Nothing (but this is not `null`! There is no way to get the "Nothing" out of an Maybe!). This is done *by language design* in Haskell and *by library support* in Scala (Scala supports null -- arguably for Java/C# interop, but it is possible to write Scala code without using this fact). –  Mar 01 '11 at 00:23
  • LMAO, great comment! Btw pst, Haskell is the only language that has no null? But Maybe? And that Maybe never returns null/nothing? – Joan Venge Mar 01 '11 at 00:29
  • Yeah, that's kind of what I was thinking too. Either accessing `Next` would throw an exception, or it would return the same terminator object. Actually, I think that in a null-free world, what would probably happen is that instead of checking `Next` for `null` you'd have something like a `bool TryGetNext(out Node next)` method; the terminating node would simply return `false` (and `next` would be assigned to itself). – Dan Tao Mar 01 '11 at 00:31
3

Hysterical raisins.

It's a hangover from C-level languages where you live on explicit pointer manipulation. Modern declarative languages (Mercury, Haskell, OCaml, etc.) get by quite happily without nulls. There, every value has to be explicitly constructed. The 'null' idea is handled through 'option' types, which have two values, 'no' (corresponding to null) and 'yes(x)' (corresponding to non-null with value x). You have to unpack each option value to decide what to do, hence: no null pointer reference errors.

Not having nulls in a language saves you so much grief, it really is a shame the idea still persists in high-level languages.

Rafe
  • 5,237
  • 3
  • 23
  • 26
  • 1
    Null is a concept, not a keyword. It refers to absence, lack of presence, non-existence. The idea is present if you have an option type. An option type then is analogous to a nullable pointer, sans the memory-management facilities. – Sion Sheevok Mar 01 '11 at 01:30
  • @Sion - sure, but the key is that `null` or `None` or whatever you want to call it is only a valid value for a limited set of types. There are certainly cases where you want to use a sentinel value, but by the same token there are other cases where null values wouldn't make any sense (and we'd like the compiler to verify that a non-null value is being used). In C# there are no non-nullable reference types, so `null` is a "valid" value even when you don't want it to be. – kvb Mar 01 '11 at 18:32
  • Same with C++. You can't make a null int, null char, null bool, or null ThisIsAnInstanceOfAClass. The only nullable type is a pointer. – Sion Sheevok Mar 01 '11 at 19:06
  • @Sion Sheevok - the difference is that an option type will never lead to a null reference exception. Moreover, you can use algebraic data types to distinguish between "not initialised", "empty", and "anything else" in an unambiguous fashion. Mercury will even go so far as to check whether your "switch" statements consider the same constructor more than once or not at all. Having had to move to C# for a living, things feel much more primitive. – Rafe Mar 02 '11 at 12:50
2

I'm not familiar with the alternatives either, but I don't see a difference between Object.Empty and null, other than that null lets you know that something is wrong when your code tries to access the object, whereas Object.Empty allows processing to continue. Sometimes you want one behavior, and sometimes you want the other. Distinguishing null from Empty is a useful tool for that.
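
As a hedged sketch of that distinction (ILogger and NullLogger here are made-up stand-ins, not framework types): a null reference fails fast, while an "Empty" object lets processing quietly continue.

interface ILogger
{
    void Log(string message);
}

// The "Empty" flavour: a do-nothing implementation, so callers never crash.
sealed class NullLogger : ILogger
{
    public static readonly NullLogger Instance = new NullLogger();
    public void Log(string message) { /* intentionally does nothing */ }
}

static class Demo
{
    static void Run(ILogger logger)
    {
        // If logger is null, this throws (fail fast: "something is wrong").
        // If logger is NullLogger.Instance, processing quietly continues.
        logger.Log("starting work");
    }
}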

Ritch Melton
  • 11,498
  • 4
  • 41
  • 54
1

Many people probably can't wrap their head around coding without nulls, and if C# didn't have nulls, I doubt it would have caught on to the extent that it has.

That being said, a nice alternative would be, if you want to allow a nullable reference, then the reference would have to be explicitly nullable, like with value types.

For example,

Person? person = SearchForPersonByFirstName("Steve");
if (person.HasValue)
{
    Console.WriteLine("Hi, " + person.Value.FullName);
}

Unfortunately, when C# 1.0 came out, there was no concept of Nullable; that was added in C# 2.0. Forcing references to have values would have broken the older programs.

jimmyfever
  • 387
  • 3
  • 11
1

null is just the name of the default value for a reference type. If null was not allowed, then the concept of "doesn't have a value" wouldn't go away, you would just represent it in a different way. In addition to having a special name, this default value also has special semantics in the event that it is misused - i.e. if you treat it like there is a value, when in fact there is not.

If there were no null:

  1. You would need to define sentinel default values for your types to use when "there is no value". As a trivial example, consider the last node of a forward singly-linked list. (Points 1 and 2 are sketched in the code after this list.)
  2. You would need to define semantics for operations performed on all of these sentinel values.
  3. The runtime would have additional overhead for handling the sentinel values. In modern virtual machines such as the ones used by .NET and Java, there is no overhead for null-checking prior to most function calls and dereferences because the CPU provides special ways to handle this case, and the CPU is optimized for cases where the references are non-null (e.g. branch prediction).
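
A hedged sketch of what points 1 and 2 could look like (Customer and None are made-up names):

// A made-up illustration: a sentinel instance standing in for "no value",
// plus hand-written semantics for every operation performed on it.
sealed class Customer
{
    // (1) The sentinel default value for this type.
    public static readonly Customer None = new Customer("<none>");

    public string Name { get; }
    public Customer(string name) { Name = name; }

    // (2) Every operation has to decide what the sentinel means for it.
    public string Greeting()
        => ReferenceEquals(this, None) ? "Hello, whoever you are." : $"Hello, {Name}!";
}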

In summary:

  • The name would change, but the concept would not.
  • The burden of defining and communicating the default value and semantics for cases where you have "no value" is placed on developers and developer documentation.
  • Code executing on modern virtual machines would face code bloat and substantial performance degradation for many algorithms.

The problems with null described by Tony Hoare are typically due to the fact that, prior to modern virtual machines, runtime systems did not have nearly as clean handling of misused null values as you have today. Misusing pointers/references does remain a problem, but tracking down the problem when working with .NET or Java tends to be much easier than it used to be in, e.g., C.

Sam Harwell
  • 97,721
  • 20
  • 209
  • 280
  • IMHO, the big problem with nulls has historically been that while C compilers could often be configured to insert null checks to ensure that directly-referenced pointers weren't null, they would not do any such checks when performing pointer arithmetic. Given `int *p`, a typical 16-bit compiler's generated code for `p[5]=3;` might add 10 to `p`, then check if the result was non-null, and store a 3 there if so. – supercat Apr 27 '15 at 20:35
1

To denote the nothingness concept, since 0 is not the right fit.

Now you can give any value type a Null value by defining a nullable type.
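
For instance (score and Demo are made-up names for the sketch):

using System;

class Demo
{
    static void Main()
    {
        int? score = null;           // Nullable<int>: a value type that can hold "no value"

        if (score.HasValue)
            Console.WriteLine(score.Value);
        else
            Console.WriteLine("no score yet");

        int effective = score ?? 0;  // ?? supplies a default when there is no value
        Console.WriteLine(effective);
    }
}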

I think we can't always have a value for a variable, because we would first have to default it to some value, and then the question becomes why that specific default should take precedence over any other value.

Ken D
  • 5,880
  • 2
  • 36
  • 58