3

I'm not really into type safety as a concept, but many consider it important for well-written code and think that for code to scale well, be reusable, be robust and so on, it needs to be type safe.

Languages like C# take type safety seriously: they are statically typed, and generics in C# are considerably more type-safe than C++ templates. Even ArrayList, which is considered less type-safe than List<T> because it is effectively a List<object>, is much more type-safe than, for example, a List<dynamic>, which I think should be possible but is certainly obscure.

However, I wonder, again using C# as an example, why there is still null. I like and approve of null, but it's not in keeping with the type safety of the rest of the language.

Perhaps in some respects null improves performance by avoiding the need to construct everything by default, but it also requires extra time-consuming runtime checking so that the computer can throw null exceptions and the like.

    string s = null;
    int n = s.Length; // throws NullReferenceException at runtime: like a type error, even if technically it isn't
configurator
alan2here
  • 2
    See also: [Why does null exist in .NET?](http://stackoverflow.com/questions/5149074/why-does-null-exist-in-net) – wsanville Dec 27 '11 at 14:38
  • 4
    how exactly are generics "more typesafe" than C++ templates? For that matter, how do you define X to be "more typesafe" than Y? – jalf Dec 27 '11 at 14:42
  • 1
    There is no such thing as `NULL` in C#. Do you mean `null`? `NULL` in C is typically defined as `((void *)0)`. Is that what you mean? – John Saunders Dec 27 '11 at 14:52
  • 1
    You might be interested in this talk by Sir Tony: http://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare – Eric Lippert Dec 27 '11 at 15:37

2 Answers

9

I'm not really into type-safety as a concept

Maybe you should be.

for code to scale well, be reusable, be robust etc... it needs to be type safe.

I don't think it needs to be type safe, though it surely does help. Lots of people write code in C++, a language which has relatively weak type safety (because pointers can be converted to and from arbitrary integers). Nevertheless, C++ is also a language which has been designed to encourage code reuse via robust code encapsulated in classes.

Also, don't confuse static type checking with type safety. Some would argue that dynamically-checked languages are every bit as "type safe"; they just do their type system verifications at runtime instead of compile time.

Languages like C# take type-safety seriously

Indeed.

generics in C# are considerably more typesafe than C++ templates.

I don't see any evidence for this assertion. Type checking on generics is different than type checking on templates, but both are type checked. The fundamental difference is that a generic assumes that any type that meets the constraints could be the type argument and therefore requires the program to pass static type checking for any possible type argument. Templates on the other hand only require the type arguments that you actually use to pass static type checking.
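A minimal sketch of the C# side of that difference, assuming a hypothetical helper named `Largest` (it is not a framework method). The body is checked once against the constraint, so the call to `CompareTo` is accepted only because every possible `T` is promised to implement `IComparable<T>`. Remove the `where` clause and the method itself stops compiling, whether or not anything calls it. A C++ template with an equivalent body would instead be checked for each type argument actually used.

```csharp
using System;

public static class GenericsDemo
{
    // Hypothetical helper for illustration only.
    // The constraint is what makes first.CompareTo(...) legal for every T.
    public static T Largest<T>(T first, T second) where T : IComparable<T>
    {
        return first.CompareTo(second) >= 0 ? first : second;
    }

    public static void Main()
    {
        Console.WriteLine(Largest(3, 7));
        Console.WriteLine(Largest("pear", "apple"));
    }
}
```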

Even ArrayList which is considered not as typesafe as List<T> being that it is a List<object> is much more typesafe than for example a List<dynamic>, which I think should be possible but is certainly obscure.

Well, dynamic is an interesting case. Like I said before, dynamic basically means "move the type checking of this thing to runtime".
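As a rough illustration of that deferral, the sketch below compiles even though `NoSuchMember` (a made-up name) exists on no type; the failure shows up as a `RuntimeBinderException` only when the call is executed. `FailsAtRuntime` is likewise a hypothetical helper.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public static class DynamicDemo
{
    // Returns true when the runtime binder rejects the member access.
    public static bool FailsAtRuntime(dynamic value)
    {
        try
        {
            value.NoSuchMember(); // compiles: binding is deferred to runtime
            return false;
        }
        catch (RuntimeBinderException)
        {
            return true;
        }
    }

    public static void Main()
    {
        dynamic text = "hello";
        int length = text.Length; // bound at runtime to string.Length
        Console.WriteLine(length);
        Console.WriteLine(FailsAtRuntime(text));
    }
}
```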

I wonder why, again using C# as an example there is still null. I can see how in some respects it is faster than ensuring everything is constructed by default but isn't it a huge type-safety issue

You are right to raise this concern; nulls do throw a difficult problem at the type checker. One would like a statically-typed language to be "memory safe" -- to ensure that no invalid bit pattern ever makes its way into a variable annotated with a particular type. C# (outside of the unsafe subset) achieves this goal except that the all-zero null reference bit pattern is always legal in a variable annotated with a reference type, even though that bit pattern refers to no valid object.

You can certainly come up with languages that do not have null references and are statically type checked; Haskell is a good example. Why not do the same in C#?

Historical reasons. Null references are very useful, despite their dangers, and C# comes out of a long tradition of programming languages which allow null references.

I personally would have preferred it if we'd baked nullability into the framework from day one, so that you could have nullable or non-nullable value types, and nullable or non-nullable reference types. Remember that the next time you design a type system from scratch.
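C# already draws this distinction for value types, which gives a rough idea of what it looks like in practice: an `int` can never hold null, while an `int?` (`Nullable<int>`) can and says so in its type. The sketch below assumes a hypothetical method named `LengthOrDefault`.

```csharp
using System;

public static class NullableDemo
{
    // Hypothetical helper: the int? parameter makes "no value" part of the
    // declared type, so the caller and the method both have to acknowledge it.
    public static int LengthOrDefault(int? maybeLength)
    {
        return maybeLength.HasValue ? maybeLength.Value : 0;
    }

    public static void Main()
    {
        int? missing = null;
        int? present = 5;
        Console.WriteLine(LengthOrDefault(missing));
        Console.WriteLine(LengthOrDefault(present));
    }
}
```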

also requires extra time-consuming runtime checking for the computer to throw null exceptions and the such

That's actually not that bad. The usual implementation is to mark the virtual memory page that contains address zero as not readable, writable or executable. An attempt to dereference a null pointer then results in an exception raised by the hardware. And even in the cases where you do have to do a null check, comparing a register to zero is pretty quick.

There are worse problems. For example, arrays aren't "type safe" either in the sense that you can't statically type check whether a program is guaranteed to access only valid indices of an array. The CLR does a lot of work to make sure that every array access is valid.
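A small sketch of that runtime check, assuming a hypothetical helper named `TryRead`. The compiler accepts any `int` index; it is the CLR that rejects an out-of-range one when the access executes.

```csharp
using System;

public static class BoundsDemo
{
    // Hypothetical helper: reports whether the index was accepted at runtime.
    public static bool TryRead(int[] items, int index, out int value)
    {
        try
        {
            value = items[index]; // the CLR validates the index on this access
            return true;
        }
        catch (IndexOutOfRangeException)
        {
            value = 0;
            return false;
        }
    }

    public static void Main()
    {
        int[] items = { 1, 2, 3 };
        int value;
        Console.WriteLine(TryRead(items, 1, out value));
        Console.WriteLine(TryRead(items, 3, out value));
    }
}
```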

Unsafe array covariance is also problematic; this is a bona-fide problem with the type system. We have to do type checks every time you write an element of a more-derived type to an array of its base type. That's my candidate for "worst feature" of the CLR.
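The covariance hole can be sketched as follows, with `StoreIsRejected` as a hypothetical helper. The assignment of a `string[]` to an `object[]` variable is statically legal, so the only place left to catch a bad store is a runtime check on the write, which surfaces as `ArrayTypeMismatchException`.

```csharp
using System;

public static class CovarianceDemo
{
    // Hypothetical helper: returns true when the runtime refuses the store.
    public static bool StoreIsRejected(object[] target, object item)
    {
        try
        {
            target[0] = item; // checked against the array's real element type
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    public static void Main()
    {
        object[] strings = new string[1]; // legal: arrays are covariant
        Console.WriteLine(StoreIsRejected(strings, "ok"));
        Console.WriteLine(StoreIsRejected(strings, 42));
    }
}
```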

Eric Lippert
  • Thanks for providing a nicely detailed answer. I see: historical reasons and convenience. Sometimes the varying points of view on these issues can be complicated. Do you consider dynamic to have the same type safety as var, just that one is checked at compile time and the other at run time? :) – alan2here Dec 27 '11 at 15:52
  • 1
    @alan2here: "dynamic" has "the same" type safety in the sense that *every type error that we catch at compile time, we also catch at runtime with dynamic*. When I say that "dynamic" simply delays the type check until runtime, I literally mean it; *we start the semantic analyzer again at runtime and do the type analysis* just like we would at compile time. – Eric Lippert Dec 27 '11 at 15:54
  • 1
    "Remember that the next time you design a type system from scratch." I think we'll leave that to the Eric Lipperts of this world. :) – Ani Dec 28 '11 at 06:45
4

`null` means no instance, nothing. As such, it cannot have a type, and therefore you cannot check types on `null`.

The need for `null` comes with references: at least during creation of objects with references, those will have to point to "nothing". And `null` does just that.

Lucero
  • 1
    I don't think there's a *need*. It may appear more convenient, but it's hardly necessary. Just require that all variables and members of reference types are assigned a value before they are first used. Then they don't need a value (no well-defined value - obviously, *something* will be at that storage location), as no valid program would ever see that value. As for typing null, there are several options. Consider explicit "nullable" types (then, `null` is a regular value instead of a special reference) and [Bottom](http://en.wikipedia.org/wiki/bottom_type). –  Dec 27 '11 at 14:44
  • 1
    Why can't it have a type? Previous versions of the C# specification described a type for null. The ECMAScript specification describes a type for null. I don't recall whether the VB spec does or not, but it seems reasonable that it would. – Eric Lippert Dec 27 '11 at 14:44
  • 1
    @EricLippert, an ECMAScript `null` is not really comparable to a .NET CLR `null`, is it? The ECMAScript `nothing` is already closer to it, and it doesn't have a type either. That said, some notion of an unassigned/uninitialized reference is necessary; some languages abstract that away by using constructs such as "maybe" or "option". Spec# also supports the notion of non-null references, and that's a great thing IMHO, but it doesn't erase the existence of the undefined/unassigned/uninitialized/null reference either. – Lucero Dec 27 '11 at 14:49
  • (Regarding typed nulls, for DB purposes in the .NET Framework, we also have that `DBNull` class which does represent a "null instance" and not a "null reference" - two different concepts sometimes used with the same name.) – Lucero Dec 27 '11 at 14:54