We all run into this exception, one of the most common, sooner or later in our coding lives. My question is NOT about WHY it occurs (I know it is raised when we try to access members through a reference variable that actually points to null) but about HOW the NullReferenceException is generated by the CLR.

Sometimes I wonder about the mechanism by which the CLR identifies a reference to null (perhaps null is a reserved space in memory) and then raises the exception. How does the CLR identify and raise this particular exception? Does the OS play any role in it?

I would like to share one of the most interesting claims about it:

null is actually a permanently reserved memory space known to the CLR, and all kinds of access to it are prohibited. Thus, when a reference to that space is followed, the OS by default generates an access-denied kind of exception, which the CLR interprets as a NullReferenceException.

I didn't find any articles or posts supporting the above statement, so I find it hard to believe. Maybe I haven't dug into the details enough, or there are other reasons. I expect Stack Overflow is one of the most appropriate platforms for getting the best response.

AakashM
Sumeet
  • Read up on the `callvirt` IL instruction. – leppie Jun 28 '12 at 06:52
  • You didn't find *any* articles or posts about that quoted statement? Then *where* did it even come from? – Damien_The_Unbeliever Jun 28 '12 at 06:55
  • While discussing in-house with my colleagues... – Sumeet Jun 28 '12 at 06:56
  • [This article](http://blogs.msdn.com/b/brada/archive/2004/02/23/78341.aspx) may contain some information that points into the right direction. – O. R. Mapper Jun 28 '12 at 07:05
  • I'd guess this is an implementation detail that the CLI spec leaves open. But the article posted by O. R. Mapper suggests that in Microsoft .NET, it is indeed caused by an access violation. – Botz3000 Jun 28 '12 at 07:23
  • Do you have a clue how the CLR implements _anything_? Do you need to? – H H Jun 28 '12 at 07:37
  • If you think you need to know then you're on the wrong track. – H H Jun 28 '12 at 07:47
  • Have you read [CLR via C#](http://www.amazon.co.uk/CLR-via-3rd-Jeffrey-Richter/dp/0735627045)? You might find it interesting. – AakashM Jun 28 '12 at 08:10
  • @leppie `callvirt` is part of the answer, since `callvirt` will always raise it on null, but so will field access whether from the outside or within a method called by `call` (for which the `this` pointer would be null). – Jon Hanna Jun 28 '12 at 08:33
  • @HenkHolterman: Would you point me to the right path? – Sumeet Jun 28 '12 at 08:38
  • @AakashM: I started reading it a few days back and have begun to grasp a few things, which is what sparked this question. – Sumeet Jun 28 '12 at 08:39
  • @HenkHolterman if we restricted what we learnt to things we could identify a need for learning at the time, we wouldn't know most of the things we do benefit from knowing. – Jon Hanna Jun 28 '12 at 09:44
  • Indeed, one practical lesson here is that the null check on every method call and field access is implemented very efficiently, based on something the OS does for all memory access, and one therefore shouldn't write convoluted code to avoid those checks. – Jon Hanna Jun 28 '12 at 10:10

3 Answers

It doesn't have to work this way (there could be explicit checks), but it works by trapping access violations.

A .NET object will be turned into a native object: its fields become a block of memory laid out in a particular manner, its methods are jitted into native machine code, and a v-table or other virtual method dispatch mechanism is created.

  1. Accessing a field, then, means finding the address of the object, adding the offset of the member, and reading or writing the piece of memory referred to.

  2. Calling a virtual method means finding the address of the object, finding its method table (at a set offset within the object), finding the method's address (at a set offset within the table) and calling the method at that address with the address of the object passed as the this pointer.

  3. Calling a non-virtual method means calling the method with the address of the object passed as the this pointer.

Clearly, if there is no actual object at the address in question, cases 1 and 2 will go wrong in some way, while case 3 will work (but could in turn lead to case 1 or 2). There are two main ways this can go wrong:

  1. It could access an arbitrary bit of memory that is not really an object of our type, leading to all sorts of exciting and really hard to trace bugs (.NET code generally won't result in anything that causes this scenario).

  2. It could access an arbitrary bit of memory that is protected, leading to an access violation.

You may know about the second case from C, C++ or ASM coding. If not, you'll probably still have seen a program crash and with its dying breath talk about an access violation at some address. If so, you may have noticed that while the address given could be just about anything, it'll most often be either 0x00000000 or something very low like 0x00000020. Those were caused by code trying to dereference a null pointer whether to access a field or call a virtual method (which is essentially accessing a field and then calling depending on what you get).

Now, since the first 64 KB of memory is always protected, dereferencing a null pointer will always result in the second case (access violation) rather than the first case (arbitrary memory being misused and resulting in bizarre "fandango on core" bugs).

This is all exactly the same with .NET (or rather, with the jitted code produced by it), but if (A) the access violation happened at an address lower than 0x00010000 and (B) such a violation is found to have happened by code that was jitted, then it is turned into a NullReferenceException, otherwise it gets turned into an AccessViolationException.
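The two conditions (A) and (B) can be written down as a tiny decision function. This is only a sketch of the rule as described here, not CLR code: `ManagedFault`, `FaultClassifier` and `NullRegionLimit` are hypothetical names invented for illustration.

```csharp
using System;

// Hypothetical model of the rule described above; none of these names are CLR APIs.
public enum ManagedFault { NullReference, AccessViolation }

public static class FaultClassifier
{
    // Assumption taken from the text: faults below 64 KB count as "null" faults.
    public const ulong NullRegionLimit = 0x00010000UL;

    // (A) the faulting address is in the low region AND (B) the fault happened in
    // jitted code => NullReferenceException; anything else => AccessViolationException.
    public static ManagedFault Classify(ulong faultAddress, bool inJittedCode)
    {
        return (faultAddress < NullRegionLimit && inJittedCode)
            ? ManagedFault.NullReference
            : ManagedFault.AccessViolation;
    }
}

public static class ClassifierDemo
{
    public static void Main()
    {
        // A low address, such as a field at offset 0x20 on a null object.
        Console.WriteLine(FaultClassifier.Classify(0x20UL, true));
        // A protected address far from zero.
        Console.WriteLine(FaultClassifier.Classify(0x7FFF0000UL, true));
    }
}
```

The point of the sketch is that the runtime does not need to know which reference was null; the faulting address alone is enough to make the guess.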

We can simulate this with code that doesn't dereference an object reference, but which does access protected memory (we'll only read, so if we should happen to accidentally hit memory that isn't protected, the result won't be too weird!):

The following code will raise an AccessViolationException:

```csharp
unsafe
{
  int read = *((int*)long.MaxValue - 8);
}
```

The following code will raise a NullReferenceException:

```csharp
unsafe
{
  int read = *((int*)8);
}
```

Neither snippet actually dereferences an object reference. Both cause access violations, but the CLR assumes the latter was probably caused by a null reference (in fairness, by far the most likely scenario) and raises a NullReferenceException.

So, we can see how field access and callvirt can cause this.
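In safe code, those two paths look like the following minimal sketch. `Node` and the helper names are made up for illustration; the only behaviour relied on is that a field read and a virtual call through a null reference both raise NullReferenceException.

```csharp
using System;

// Illustrative type: one field and one virtual method.
public class Node
{
    public int Value;
    public virtual string Describe() => "node";
}

public static class NullAccessDemo
{
    // Runs the action and reports whether it raised NullReferenceException.
    public static bool ThrowsNullReference(Action action)
    {
        try { action(); return false; }
        catch (NullReferenceException) { return true; }
    }

    // Case 1: field access reads memory at (object address + field offset).
    public static int ReadField(Node n) => n.Value;

    // Case 2: a virtual call must first read the method table through the object.
    public static string CallVirtual(Node n) => n.Describe();

    public static void Main()
    {
        Console.WriteLine(ThrowsNullReference(() => ReadField(null)));
        Console.WriteLine(ThrowsNullReference(() => CallVirtual(null)));
    }
}
```

Both lines report that the exception was raised, even though neither helper contains an explicit null check in its source.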

It's worth noting now that because of a decision to not allow C# to call methods on null references even when it would be safe to do so, callvirt is used as the IL for the majority of cases in C#, and the only exceptions would be cases of static methods or where it can be shown at compile time to not be on a null reference. (Edit: There are a few other cases where the compiler can see that a callvirt can be replaced by a call, even when the method actually is virtual [if the compiler can tell which override would be hit], and later compilers will do this slightly more often, though they will still use callvirt more often than you might imagine.)
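A small sketch makes the effect of that decision visible. `Greeter` is a hypothetical type whose method is non-virtual and never touches `this`, so at the machine level the call could run happily with a null this pointer (case 3 above). Because C# calls it through a reference that might be null, the exception is still raised before the body runs.

```csharp
using System;

public class Greeter
{
    // Counts how many times the method body actually executed.
    public static int BodyRuns;

    // Non-virtual, and never reads or writes anything through `this`.
    public void Hello() { BodyRuns++; }
}

public static class CallvirtDemo
{
    // `g` is a parameter, so the compiler cannot prove it is non-null.
    public static void Invoke(Greeter g) => g.Hello();

    public static bool ThrowsOnNull()
    {
        try { Invoke(null); return false; }
        catch (NullReferenceException) { return true; }
    }

    public static void Main()
    {
        Console.WriteLine(ThrowsOnNull());
        Console.WriteLine(Greeter.BodyRuns);
    }
}
```

The counter stays at zero after the failed call, which shows that the null check happens at the call site rather than inside the method.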

An interesting case is where optimisation means that a method called with callvirt could be inlined, but where the reference isn't known at compile time to be non-null. In such a case a field access may be added before the place where the "call" (that isn't really a call) happens, precisely to trigger the NullReferenceException at the start, rather than in the middle, of the method. This means the optimisation does not change the observed behaviour.

Jon Hanna
  • If the fields of a class total more than 64K, would the compiler add any extra code to a field access to ensure the class reference wasn't null? Most classes wouldn't have 64K worth of fields, but using nested generics it's not hard to make large structs; a class with fields of such types could easily get to be over 64K. – supercat Aug 28 '12 at 15:29
  • @supercat Such a class would still have just one 4 or 8 byte pointer. In either case though, a quick test on a class of 16,000 decimal fields threw NullReferenceException. – Jon Hanna Aug 28 '12 at 15:49
The MS implementation, IIRC, does this via an access violation. Null is essentially a zero reference, and basically: they deliberately reserve that address space and leave this page unmapped. The memory access violation is raised at the CPU/OS level automatically (i.e. without needing extra code to do a null check), and the CLI then reports this as a null-reference exception.

Interestingly, because memory is handled in pages, you can actually simulate (if you try hard enough) a null-reference exception on a non-zero but low value, for the same reasons.

Edit: Eric Lippert discusses this on this related question/answer: https://stackoverflow.com/a/8681563

Community
Marc Gravell
  • Does that mean the statement I put in the original question/post is perfect (in general)? – Sumeet Jun 28 '12 at 09:11
  • @Sumeet broadly speaking - I'm not sure I'd use the word "perfect"; use Eric's explanation when in doubt! Note also that this is an implementation detail and can change between versions, platforms, etc – Marc Gravell Jun 28 '12 at 09:44
  • In practice, it'll be much the same in other implementations and platforms. At a lower level .NET and Mono on Windows will be reacting to a STATUS_ACCESS_VIOLATION exception while Mono on Linux will be reacting to a SIGSEGV signal, but it amounts to the same thing. It *could* be all manner of things, but any OS with protected memory will allow an approach much like that in .NET – Jon Hanna Jun 28 '12 at 09:51
  • @Jon actually, I was thinking of CF, MF, etc; indeed, MF is an *interpreter*, so I *imagine* it will be doing a manual null-check. – Marc Gravell Jun 28 '12 at 09:58
  • True. I've not looked at it at all, so damn you for giving me more curiosity that will take up my time. An interpreter can still react to an exception happening in what it tries to do though, so it could be close to the .NET form, but with an extra layer. – Jon Hanna Jun 28 '12 at 10:02
  • @MarcGravell The link you attached to your post explains it best. Eric Lippert rocks! – Sumeet Jun 29 '12 at 07:00
Have you read the CLI Spec - ECMA-335? You will find some answers there.

11 Semantics of classes...When a variable or field that has a class as its type is created (for example, by calling a method that has a local variable of a class type), the value shall initially be null, a special value that := with all class types even though it is not an instance of any particular class.

And the description of the ldnull instruction:

The ldnull pushes a null reference (type O) on the stack. This is used to initialize locations before they become live or when they become dead. [Rationale: It might be thought that ldnull is redundant: why not use ldc.i4.0 or ldc.i8.0 instead? The answer is that ldnull provides a size-agnostic null – analogous to an ldc.i instruction, which does not exist. However, even if CIL were to include an ldc.i instruction it would still benefit verification algorithms to retain the ldnull instruction because it makes type tracking easier. end rationale] Verifiability: The ldnull instruction is always verifiable, and produces a value of the null type (§1.8.1.2) that is assignable-to (§I.8.7.3) any other reference type.
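From the C# side, the two quoted passages amount to the following behaviour. This is only an illustrative sketch with a made-up `Holder` type: fields of class types start out as null, and the same null value is assignable to variables of unrelated reference types.

```csharp
using System;

// Illustrative type with fields of three different class types.
public class Holder
{
    public string Text;
    public object Any;
    public Holder Next;
}

public static class NullDefaultsDemo
{
    // Fields of class types are initially null, as the spec passage requires.
    public static bool AllFieldsStartNull()
    {
        var h = new Holder();
        return h.Text == null && h.Any == null && h.Next == null;
    }

    // The single null value can be stored in any reference-typed location.
    public static bool NullFitsEveryReferenceType()
    {
        string s = null;
        Holder h = null;
        object o = null;
        return s == null && h == null && o == null;
    }

    public static void Main()
    {
        Console.WriteLine(AllFieldsStartNull());
        Console.WriteLine(NullFitsEveryReferenceType());
    }
}
```

As the comments under this answer point out, this describes what null is, not how dereferencing it is detected.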

Ventsyslav Raikov
  • This says exactly nothing of "how" – Marc Gravell Jun 28 '12 at 08:59
  • @MarcGravell - this is "how" the CLR will identify the null reference. How exceptions are raised is a different story. – Ventsyslav Raikov Jun 28 '12 at 09:03
  • no, that is the *requirement for the behaviour*. That is *when* it is required to detect it. It isn't *how*. – Marc Gravell Jun 28 '12 at 09:08
  • yes - I agree, I read your link as well as this one - http://www.codeproject.com/Articles/20481/NET-Type-Internals-From-a-Microsoft-CLR-Perspecti it looks like you're right even though I imagine generating a non 0 pointer in the low 64K might be done only by using unsafe code. I thought the null reference generated by the ldnull instruction was directly compared with 'this' for example(in safe code of course this should be enough). – Ventsyslav Raikov Jun 28 '12 at 09:30
  • It could do that, but it doesn't (and it would be very slow to have explicit checks all over the place). See my answer for the how. – Jon Hanna Jun 28 '12 at 09:33
  • And of course, all of that is implementation specific to CLR and windows. – Ventsyslav Raikov Jun 28 '12 at 09:34
  • All "how" questions are implementation specific. That's what "implementation" means. – Jon Hanna Jun 28 '12 at 09:34