
Null pointers have been described by Tony Hoare as the "billion-dollar mistake". Some languages have reference types that can't be assigned the null value.

I wonder whether, in designing a new object-oriented language, the default behavior should be for references to prevent assignment of null. A special version of the type declaration could then be used to override this behavior. For example:

MyClass notNullable = new MyClass();
notNullable = null; // Error!
// a la C#, where "T?" means "Nullable<T>"
MyClass? nullable = new MyClass();
nullable = null; // Allowed

So my question is, is there any reason not to do this in a new programming language?
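As an aside, to make the proposal concrete: this is roughly the default TypeScript later adopted under its strictNullChecks option. A minimal sketch (TypeScript is used purely as an illustration; MyClass and describe are hypothetical names):

```typescript
class MyClass {
  greet(): string { return "hello"; }
}

let notNullable: MyClass = new MyClass();
// notNullable = null;  // compile error: 'null' is not assignable to 'MyClass'

let nullable: MyClass | null = new MyClass();
nullable = null;        // allowed: the type explicitly opts in to null

// the compiler forces a null check before any member access
function describe(x: MyClass | null): string {
  return x === null ? "was null" : x.greet();
}

console.log(describe(nullable)); // "was null"
```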

EDIT:

I wanted to add that a recent comment on my blog pointed out that non-nullable types pose a particular problem when used in arrays. I also want to thank everyone for their useful insights. They are very helpful; sorry I could only choose one answer.

cdiggins
  • switch to a functional programming language ;-) – jldupont Dec 02 '09 at 03:25
  • @jldupont: Please post that as an answer. – S.Lott Dec 02 '09 at 03:29
  • I added a link. Of course nulls are not as big of a problem in Java, C#, or Python, as in C or C++. However, there is still a lot of boiler-plate code in these languages dedicated to null checking, and lots of developer time spent chasing null pointers. Kind of silly when you think that it could be avoided at compile-time, by adding an extra character. – cdiggins Dec 02 '09 at 03:30
  • A null _pointer_ is dangerous. A null object reference is not; if you didn't use null to indicate an uninitialized object reference, you'd have to invent another sentinel value to serve the same purpose. – Steven A. Lowe Dec 02 '09 at 03:39
  • @Steven A Lowe - The point is to get rid of the possibility of the 'uninitialized state'/sentinel when it is unnecessary (the common case). You can always opt back in with a nullable type when needed. – Brian Dec 02 '09 at 03:42
  • BTW, the recorded talk is here: http://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare – Brian Dec 06 '09 at 21:02
  • possible duplicate of [non-nullable-reference-types](http://stackoverflow.com/questions/693325/non-nullable-reference-types?lq=1), [what-is-the-purpose-of-null](http://stackoverflow.com/questions/584507/what-is-the-purpose-of-null) – nawfal Jul 08 '14 at 10:31

7 Answers


The main obstacle I see to non-nullable reference types by default is that some portion of the programming community prefers the create-set-use pattern:

x = new Foo();
x.Prop = someInitValue;
x.DoSomething();

to overloaded constructors:

x = new Foo(someInitValue);
x.DoSomething();

and this leaves the API designer in a bind with regards to the initial value of instance variables that might otherwise be null.

Of course, like 'null' itself, the create-set-use pattern creates lots of meaningless object states and prevents useful invariants, so being rid of it is really a blessing rather than a curse. However, it does affect API design in ways that many people will be unfamiliar with, so it's not something to do lightly.
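To make the bind concrete, here is a sketch in TypeScript with strictNullChecks (chosen only for illustration; Foo, prop, and doSomething are hypothetical names): with non-nullable fields, create-set-use forces the field's type to admit null, while the constructor version keeps the type honest:

```typescript
// create-set-use: the field must admit null (or a dummy default),
// so every use has to re-check the "not yet set" state
class FooCreateSetUse {
  prop: string | null = null;
  doSomething(): string {
    return this.prop === null ? "unset" : this.prop.toUpperCase();
  }
}

// overloaded constructor: the field is non-nullable and is
// guaranteed initialized before any method can run
class Foo {
  constructor(public prop: string) {}
  doSomething(): string {
    return this.prop.toUpperCase(); // no null check needed
  }
}

const a = new FooCreateSetUse();
a.prop = "init";
const b = new Foo("init");
console.log(a.doSomething()); // "INIT"
console.log(b.doSomething()); // "INIT"
```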

But overall, yes, if there is a great cataclysm that destroys all existing languages and compilers, one can only hope that when we rebuild we will not repeat this particular mistake. Nullability is the exception, not the rule!

Brian
  • A non-nullable x.Prop should be set to a meaningful default or empty value. The consumer of the API can then override it as before, but know from the type that they can't set it to null. – ligos Dec 02 '09 at 04:29
  • I think a non-nullable x.Prop should be guaranteed to be initialized before it is used. If you need an "uninitialized" state, then that's what null is for, and perhaps x.Prop should actually be nullable. – jyoungdev Jun 05 '10 at 04:21

I like the OCaml way of dealing with the 'maybe null' issue. Whenever a value of type 'a might be unknown/undefined/uninitialized, it is wrapped in an 'a option type, which can be either None or Some x, where x is the actual non-nullable value. When accessing the x you need to use the pattern-matching mechanism for unwrapping. Here is a function that increments a nullable integer and returns 0 on None:

>>> let f = function  Some x -> x+1 | None->0 ;;
val f : int option -> int = <fun>

How it works:

>>> f (Some 5) ;;
- : int = 6
>>> f None ;;
- : int = 0

The matching mechanism sort of forces you to consider the None case. Here's what happens when you forget it:

 >>> let f = function  Some x -> x+1 ;;
 Characters 8-31:
 let f = function  Some x -> x+1 ;;
         ^^^^^^^^^^^^^^^^^^^^^^^
 Warning P: this pattern-matching is not exhaustive.
 Here is an example of a value that is not matched:
 None
 val f : int option -> int = <fun>

(This is just a warning, not an error. Now if you pass None to the function you'll get a matching exception.)

Variant types plus pattern matching form a generic mechanism; it also works for things like matching a list against head :: tail only (forgetting the empty-list case).
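For readers coming from other languages, the same mechanism can be sketched as a discriminated union in TypeScript (purely illustrative; Option and f are stand-in names), where an exhaustive switch plays the role of OCaml's match:

```typescript
// 'a option rendered as a discriminated union; with the annotated
// return type, dropping a case fails to compile, much like the
// non-exhaustive-match warning above
type Option<T> = { kind: "some"; value: T } | { kind: "none" };

function f(o: Option<number>): number {
  switch (o.kind) {
    case "some": return o.value + 1;
    case "none": return 0;
  }
}

console.log(f({ kind: "some", value: 5 })); // 6
console.log(f({ kind: "none" }));           // 0
```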

Rafał Dowgird

Even better, disable null references. In rare cases when "nothing" is a valid value, there could be an object state that corresponds to it, but a reference would still point to that object, not have a zero value.
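This is essentially the Null Object pattern. A sketch in TypeScript (illustrative only; Logger, NoopLogger, and makeLogger are hypothetical names):

```typescript
interface Logger {
  log(msg: string): void;
}

class ConsoleLogger implements Logger {
  log(msg: string): void { console.log(msg); }
}

// A real object whose state corresponds to "nothing": it does nothing,
// but a reference to it is never null
class NoopLogger implements Logger {
  log(_msg: string): void { /* intentionally empty */ }
}

function makeLogger(verbose: boolean): Logger {
  return verbose ? new ConsoleLogger() : new NoopLogger();
}

// callers never check for null; the reference always points at a valid object
makeLogger(false).log("silently dropped");
makeLogger(true).log("printed");
```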

Nemanja Trifunovic

As I understand it, Martin Odersky's rationale for including null in Scala is easy use of Java libraries (i.e., so all your APIs don't appear to have, e.g., "Object?" all over the place):

http://www.artima.com/scalazine/articles/goals_of_scala.html

Ideally, I think null should be included in the language as a feature, but non-nullable should be the default for all types. It would save lots of time and prevent errors.

jyoungdev

The biggest "null-related mistake" in language design is the lack of a trap when indexing null pointers. Many compilers that will trap when trying to dereference a null pointer will not trap if one adds an offset to a null pointer and tries to dereference that. In the C standard, even adding the offset is Undefined Behavior, and the performance cost of checking the pointer there would be no worse than checking at the dereference (especially if the compiler could realize that, having checked that the pointer was non-null before adding the offset, it need not re-check afterward).

As for language support for non-nullable variables, it may be useful to have a means of requesting that certain variables or fields whose declarations include an initial value should automatically test any writes to ensure that an immediate exception will occur if an attempt is made to write null to them. Arrays could include a similar feature, if there were an efficient idiomatic way of constructing an array by constructing all the elements and not making the array object itself available before construction was complete. Note that there should probably also be a way of specifying a cleanup function to be called on all previously-constructed elements if an exception occurs before all elements have been constructed.

Finally, it would be helpful if one could specify that certain instance members should be invoked with non-virtual calls, and should be invokable even on null items. Something like String.IsNullOrEmpty(someStringVariable) is hideous compared with someStringVariable.IsNullOrEmpty.

supercat

Null is only a problem because developers don't check that something is valid before using it; if people start to misuse the new nullable construct, it will not have solved any real problem.

What is important is that every variable that can be null is checked before it is used. If this means you need annotations to allow bypassing the check, that may make sense; otherwise the compiler could refuse to compile until you check.

We put more and more logic into compilers to protect developers from themselves, which is scary and very sad, as we know what we should do, and yet sometimes skip steps.

So, your solution will also be subject to abuse, and we will be back to where we started, unfortunately.

UPDATE:

Based on some comments, there was a theme in my answers. I guess I should have been more explicit in my original answer:

Basically, if the goal is to limit the impact of null variables, then have the compiler throw an error whenever a variable is not checked for null, and if you want to assume that it will never be null, then require an annotation to skip the check. This way you give people the ability to assume, but you also make it easy to find all the places in the code that have the assumption, and in a code review it can be evaluated if the assumption is valid.

This will help to protect while not limiting the developer, but making it easy to know where it is assumed not to be null.

I believe we need flexibility, and I would rather have the compilation take longer than have something negatively impact the runtime, and I think my solution would do what is desired.
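For what it's worth, this check-or-annotate scheme is close to what TypeScript's control-flow narrowing and `!` non-null assertion later provided; a sketch (TypeScript chosen only for illustration, with hypothetical names):

```typescript
function lengthOf(s: string | null): number {
  // return s.length;        // compile error: 's' is possibly 'null'
  if (s === null) return 0;  // after this check, s is narrowed to string
  return s.length;
}

// the "annotation to skip the check": `!` asserts non-null,
// and is easy to grep for during a code review
function trustedLength(s: string | null): number {
  return s!.length;
}

console.log(lengthOf(null));       // 0
console.log(lengthOf("abc"));      // 3
console.log(trustedLength("abc")); // 3
```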

James Black
  • Your third paragraph boggles my mind. You want to avoid automated solutions to common problems because humans need the extra work to promote the ethic of not 'skipping steps'? – Brian Dec 02 '09 at 03:38
  • Yeah, I have to agree with Brian here. More work is put into compilers to make developers more productive. Pragmatism not idealism. – jason Dec 02 '09 at 03:45
  • @Brian - I suggested that we could force the developer to state that some variable may be null, and won't be checked, as we assume it won't be null, but that is as far as I would go, as that is hard to abuse, and if it is abused, it is easy to write a tool that checks for all the non-checks and decide if they are valid, in a code review, for example. – James Black Dec 02 '09 at 03:46
  • @Jason - That is fine, if we put something in the compiler, we should make it easy to know what the intent was, so it can be checked. But, there are so many ways to put in security holes, developers need to be more cautious, otherwise we may as well ban dynamic sql queries and force all queries to use prepared statements. – James Black Dec 02 '09 at 03:48
  • @James Black: The compiler is an implementation detail. – jason Dec 02 '09 at 03:53
  • @barkmadley - I find dynamic SQL to be useful, but that is when I am using nothing from the user in the query, esp if there is no parameters to pass in. The Haskell solution was interesting, btw. – James Black Dec 02 '09 at 03:57
  • http://blog.moertel.com/articles/2006/10/18/a-type-based-solution-to-the-strings-problem dynamic SQL bad...deleted my other comment – barkmadley Dec 02 '09 at 03:58
  • @Jason - Just don't put something into the language that can be abused as the null pointer is currently. That, to me, is the most pragmatic solution, as then someone can look to see where the developer is assuming that it can't be null, and verify that the assumption is valid. – James Black Dec 02 '09 at 04:05
  • @James Black: It's nearly impossible to prevent language abuse. There is no construct nor tool that can save bad programmers from themselves (short of hacking off their fingers). Code contracts, however, are a really beautiful idea. – jason Dec 02 '09 at 04:12
  • @James Black: If the compiler keeps a value from being null, that is essentially using a non-nullable type, except devs can still change it to null later (assuming the reference is settable). I _want_ the compiler to make my life easier by guaranteeing against this so that the value _is not null_, not just "assumed not to be null". – jyoungdev Jun 05 '10 at 04:14
  • @apollodude217 - If you use a string, for example, what is the value when it is not set? It is in an unknown state, as you have never explicitly set it, so, null is as good a value as any, otherwise you are assuming what it is. That is a problem found in some languages where pointers in debug mode may be set to zero, but in release are not set at all, so may point anywhere. So, if you don't want it to be in an unknown state, just set it when you declare it. – James Black Jun 05 '10 at 19:14
  • @James Black: You already proposed a compiler feature for ensuring that values are initialized to something before they are accessed, thus removing the need for null in these cases. These are the cases where nullability is a nuisance (imao). If the variable may represent nothing or be uninitialized, then it should be of a nullable type. – jyoungdev Jun 07 '10 at 16:11
  • @apollodude217 - Any solution will have trade-offs between protecting the programmer from stupid mistakes and annoying the programmer. I tend to opt for letting the programmer create problems if he chooses, but try to ensure that in a non-intrusive way you try to protect the programmer. – James Black Jun 07 '10 at 16:42

No.

The state of uninitialized will exist in some fashion due to logical necessity; currently the denotation is null.

Perhaps a "valid but uninitialized" concept of an object can be designed in, but how is that significantly different? The semantics of "accessing an uninitialized object" still will exist.

A better route is to have static checking that you do not access an object that has not been assigned to (I can't think of anything off the top of my head that would prevent that, besides string evals).
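Such a static check exists in some languages as definite-assignment analysis; a sketch in TypeScript (illustrative only):

```typescript
function demo(flag: boolean): string {
  let x: string;         // declared but not yet initialized
  // console.log(x);     // compile error: used before being assigned
  if (flag) {
    x = "yes";
  } else {
    x = "no";
  }
  return x;              // fine: assigned on every code path
}

console.log(demo(true));  // "yes"
console.log(demo(false)); // "no"
```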

Paul Nathan
  • The point I was trying to make was that the need of an "uninitialized" or "sentry value" for a variable or argument is the exception rather than the rule. When it is needed, I see no reason not to use "null", and the nullable form (e.g. "MyType?"). – cdiggins Dec 04 '09 at 15:16
  • "static-time checking that you do not access an object that is not assigned to" gurantees initialization, but does not guarantee against null (i.e. it can still be initialized or later changed to null). These are separate problems. What if you have a method parameter that should never be null? A static check to make sure that value is not null _is_ using a non-nullable type. Checking for null manually and dealing with NullReferenceException are both painful. – jyoungdev Jun 05 '10 at 04:07
  • No, types *can* be used. They don't have to be... – Paul Nathan Jun 05 '10 at 04:31