88

I am a complete novice to C, and during my university work I've come across comments in code that often refer to de-referencing a NULL pointer. I do have a background in C#, and I've been getting by on the assumption that this might be similar to a "NullReferenceException" that you get in .NET, but now I am having serious doubts.

Can someone please explain to me in layman's terms exactly what this is and why it is bad?

cigien
Ash
  • 1
    Keep in mind doing so results in undefined behavior. You don't get exceptions or anything, in C or C++. – GManNickG Oct 24 '10 at 07:09
  • 1
    You might want to put down some example code. It seems that people (including me) don't get what you are trying to ask. –  Oct 24 '10 at 07:24
  • 3
    No need for code (there isn't any) - This is a conceptual problem I am having, trying to get my head around the terminology of "dereferencing" and why I should be caring about it. – Ash Oct 26 '10 at 04:22
  • 7
    https://www.youtube.com/watch?v=bLHL75H_VEM – Veer Singh Jan 07 '16 at 02:16

8 Answers

109

A NULL pointer points to memory that doesn't exist. This may be address 0x00000000 or any other implementation-defined value (as long as it can never be a real address). Dereferencing it means trying to access whatever is pointed to by the pointer. The * operator is the dereferencing operator:

int a, b, c; // some integers
int *pi;     // a pointer to an integer

a = 5;
pi = &a; // pi points to a
b = *pi; // b is now 5
pi = NULL;
c = *pi; // this is a NULL pointer dereference

This is conceptually the same mistake that causes a NullReferenceException in C#, except that C does not throw an exception (the behavior is undefined) and pointers in C can point to any data object, even elements inside an array.
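Because C gives no exception to catch, a common defensive habit is to test a pointer before dereferencing it. The following is a minimal sketch of that habit, not something taken from the answer above; the helper name `read_or_default` and its `fallback` parameter are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: returns *p, or fallback when p is a null pointer. */
int read_or_default(const int *p, int fallback)
{
    if (p == NULL) {
        return fallback;   /* never dereference a null pointer */
    }
    return *p;             /* safe: p points to an int */
}
```

Called as `read_or_default(&a, -1)` it yields the value of `a`, while `read_or_default(NULL, -1)` yields `-1` instead of invoking undefined behavior.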

Greg Hewgill
  • 18
    @Ash: A pointer contains a memory address that *references* something. To access that something *referenced* by that memory address, you have to *de-reference* the memory address. – In silico Oct 24 '10 at 05:14
  • 1
    @Ash, what In silico said, but when you de-reference you are getting the value that is stored at the memory address. Give it a try: int p; printf("%p\n", (void *)&p); should print out an address. When you haven't created a pointer (*var), you use &var to get the address. – Matt Oct 24 '10 at 16:21
  • 1
    @Greg How about when you do `char *foo = NULL` and then use &foo ? – Bionix1441 Feb 22 '17 at 09:45
  • 1
    @Bionix1441: In your example `&foo` refers to the address of the variable called `foo`, which is fine given its declaration. – Greg Hewgill Feb 22 '17 at 16:55
  • 1
    I think it's important to also state that it's undefined behavior, assuming that Adam Rosenfield's answer is correct. – axel22 Jan 26 '20 at 17:22
  • 1
    Address 0x0 exist: "In rare circumstances, when NULL is equivalent to the 0x0 memory address and privileged code can access it, then writing or reading memory is possible, which may lead to code execution.", from https://cwe.mitre.org/data/definitions/476.html – baz Jul 21 '21 at 09:40
47

Dereferencing just means accessing the memory value at a given address. So when you have a pointer to something, to dereference the pointer means to read or write the data that the pointer points to.

In C, the unary * operator is the dereferencing operator. If x is a pointer, then *x is what x points to. The unary & operator is the address-of operator. If x is anything, then &x is the address at which x is stored in memory. The * and & operators are inverses of each other: if x is any data, and y is any pointer, then these equations are always true:

*(&x) == x
&(*y) == y
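As a small illustration, the two identities can be checked for one ordinary `int` object. This is only a sketch under the assumption that `y` points to a valid object; the function name `identities_hold` is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical helper: checks both identities for one int object. */
bool identities_hold(int x)
{
    int *y = &x;                   /* y holds the address of x */
    bool first  = (*(&x) == x);    /* dereferencing the address of x yields x */
    bool second = (&(*y) == y);    /* taking the address of *y yields y back */
    return first && second;
}
```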

A null pointer is a pointer that does not point to any valid data (but it is not the only such pointer). The C standard says that it is undefined behavior to dereference a null pointer. This means that absolutely anything could happen: the program could crash, it could continue working silently, or it could erase your hard drive (although that's rather unlikely).

In most implementations, you will get a "segmentation fault" or "access violation" if you try to do so, which will almost always result in your program being terminated by the operating system. Here's one way a null pointer could be dereferenced:

int *x = NULL;  // x is a null pointer
int y = *x;     // CRASH: dereference x, trying to read it
*x = 0;         // CRASH: dereference x, trying to write it

And yes, dereferencing a null pointer is pretty much exactly like a NullReferenceException in C# (or a NullPointerException in Java), except that the language standard is a little more helpful in the C# case. In C#, dereferencing a null reference has well-defined behavior: it always throws a NullReferenceException. There's no way that your program could continue working silently or erase your hard drive like in C (unless there's a bug in the language runtime, but again that's incredibly unlikely as well).

Adam Rosenfield
  • 2
    Actually, a `NULL` pointer does sometimes point to valid data. Many microcontrollers implement flash/peripherals at the address 0. – Sapphire_Brick Jul 19 '20 at 06:10
3

It means

myclass *p = NULL;
*p = ...;  // illegal: dereferencing NULL pointer
... = *p;  // illegal: dereferencing NULL pointer
p->meth(); // illegal: equivalent to (*p).meth(), which is dereferencing NULL pointer

myclass *p = /* some legal, non-NULL pointer */;
*p = ...;  // Ok
... = *p;  // Ok
p->meth(); // Ok, if myclass::meth() exists

Basically, almost anything involving (*p), or implicitly involving (*p) such as p->... (which is shorthand for (*p). ...), dereferences the pointer; the pointer declaration itself does not.
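The snippets above use a class with a method, which is C++. A rough C counterpart, given here only as a sketch, uses a struct: `p->x` and `(*p).x` are the same access, so both need a non-null `p`. The type `struct point` and the function `point_sum` are invented for this example.

```c
#include <stddef.h>

struct point { int x; int y; };   /* hypothetical type for illustration */

/* Returns the sum of the coordinates, or 0 when p is a null pointer. */
int point_sum(const struct point *p)
{
    if (p == NULL) {
        return 0;              /* p->x would dereference a null pointer */
    }
    return p->x + (*p).y;      /* p->y and (*p).y are the same access */
}
```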

Lie Ryan
1

From wiki

A null pointer has a reserved value, often but not necessarily the value zero, indicating that it refers to no object
..

Since a null-valued pointer does not refer to a meaningful object, an attempt to dereference a null pointer usually causes a run-time error.

int val = 1;
int *p = NULL;
*p = val; // Whooosh!!!! 
Prasoon Saurav
1

Quoting from wikipedia:

A pointer references a location in memory, and obtaining the value at the location a pointer refers to is known as dereferencing the pointer.

Dereferencing is done by applying the unary * operator on the pointer.

int x = 5;
int * p;      // pointer declaration
p = &x;       // pointer assignment
*p = 7;       // pointer dereferencing, example 1
int y = *p;   // pointer dereferencing, example 2

"Dereferencing a NULL pointer" means performing *p when p is NULL.

Arun
1

A NULL pointer points to memory that doesn't exist, and dereferencing it will typically raise a segmentation fault. There's an easier way to dereference a NULL pointer; take a look.

int main(int argc, char const *argv[])
{
    *(int *)0 = 0; // Segmentation fault (core dumped)
    return 0;
}

Since address 0 is not mapped into the process on this platform, a fault occurs.

SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL}
vmemmap
  • `(int *)0` is not strictly speaking a null pointer constant, so this code could just as well be application code writing to address zero, not to be confused with null pointers. There is nothing in your example formally guaranteeing a null-pointer access nor a seg fault. In order to create a null pointer, you need to assign a null pointer constant to a pointer. Null pointer constant being either `0` or `(void*)0`. The `NULL` macro can be either of these. – Lundin May 11 '22 at 09:47
  • @Lundin - IIRC, `(void *)0` is a valid null pointer constant, so it's no different from `(int *)0`. The code assigns a zero value to an integer pointer, then dereferences it. – vmemmap Jun 24 '22 at 10:28
  • 1
    `(void *)0` is a valid null pointer constant because the standard explicitly says so. It says nothing about `(int*)0`. – Lundin Jun 27 '22 at 09:06
1

Lots of confusion and confused answers here. First of all, there is strictly speaking nothing called a "NULL pointer". There are null pointers, null pointer constants and the NULL macro.

Start by studying my answer from Codidact: What's the difference between null pointers and NULL? Quoting some parts of it here:

There are three different, related concepts that are easy to mix up:

  • null pointers
  • null pointer constants
  • the NULL macro

Formal definitions

The first two of these terms are formally defined in C17 6.3.2.3/3:

An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.67) If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

In other words, a null pointer is a pointer of any type pointing at a well-defined "nowhere". Any pointer can turn into a null pointer when it is assigned a null pointer constant.

The standard mentions 0 and (void*)0 as two valid null pointer constants, but note that it says "an integer constant expression with the value 0". This means that things like 0u, 0x00 and other variations are also null pointer constants. These are particular special cases that can be assigned to any pointer type, regardless of the various type compatibility rules that would normally apply.

Notably, both object pointers and function pointers can be null pointers. Meaning that we must be able to assign null pointer constants to them, no matter the actual pointer type.
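A short sketch of the points above: each of the three initializers below is a null pointer constant, and each resulting pointer, including the function pointer, compares equal to `NULL`. The function name `all_null` is hypothetical and exists only to make the check testable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when all three pointers compare equal to NULL. */
bool all_null(void)
{
    int *obj = 0;               /* integer constant expression with value 0 */
    char *str = (void *)0;      /* such an expression cast to void * */
    int (*fn)(void) = NULL;     /* function pointers can be null pointers too */
    return obj == NULL && str == NULL && fn == NULL;
}
```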


NULL

The note 67) from above adds (not normative):

67) The macro NULL is defined in <stddef.h> (and other headers) as a null pointer constant; see 7.19.

where 7.19 simply defines NULL as (normative):

NULL which expands to an implementation-defined null pointer constant;

In theory this could perhaps be something other than 0 and (void*)0, but the implementation-defined part is more likely saying that NULL can either be #define NULL 0 or #define NULL (void*)0 or some other integer constant expression with the value zero, depending on the C library used. But all we need to know and care about is that NULL is a null pointer constant.

NULL is also the preferred null pointer constant to use in C code, because it is self-documenting and unambiguous (unlike 0). It should only be used together with pointers and not for any other purpose.


Additionally, do not mix this up with "null termination of strings", which is an entirely separate topic. Null termination of strings is just a value zero, often referred to either as nul (one L) or '\0' (an octal escape sequence), just to separate it from null pointers and NULL.
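To make the distinction concrete, here is a minimal sketch in which both ideas appear side by side: a null pointer check (there is no string at all) and a search for the nul terminator (a zero byte that ends the string). The helper name `safe_length` is made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper: length of s, or 0 when s is a null pointer. */
size_t safe_length(const char *s)
{
    size_t n = 0;
    if (s == NULL) {          /* null pointer: there is no string at all */
        return 0;
    }
    while (s[n] != '\0') {    /* nul terminator: a zero byte ending the string */
        n++;
    }
    return n;
}
```

An empty string `""` is a valid, non-null pointer to a single nul byte, so `safe_length("")` takes the second path and still returns 0.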


Dereferencing

Having cleared that up, we cannot access what a null pointer points at, because it is as mentioned a well-defined "nowhere". The process of accessing what a pointer points at is known as dereferencing, and is done in C (and C++) through the unary * indirection operator. The C standard specifying how this operator works simply states (C17 6.5.3.3):

If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined

Where an informative note adds:

Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime.

And this would be where "segmentation faults" or "null pointer/reference exceptions" might occur. The cause is almost always an application bug, as in these examples:

int* a = NULL; // create a null pointer by initializing with a null pointer constant
*a = 1;        // null pointer is dereferenced, undefined behavior

int* b = 0;    // create a null pointer by initializing with a null pointer constant
               // not to be confused with similar looking dereferencing and assignment:
*b = 0;        // null pointer is dereferenced, undefined behavior
Lundin
  • In some cases, it's not quite clear whether an expression "dereferences" a pointer. For example, an expression like `(uintptr_t)&(structPtr->member)` could be evaluated without performing any access to the pointer in question, and even an implementation that would usefully trap pointer arithmetic involving null pointers in most contexts could recognize that an expression like the above is ultimately forming an integer rather than a pointer. – supercat May 23 '22 at 15:06
-1

Let's look at an example of dereferencing a NULL pointer, and talk about it.

The example comes from the duplicate question "uint32_t *ptr = NULL;":

#include <stddef.h>  // NULL
#include <stdint.h>  // uint32_t

int main (void)
{
    uint32_t *ptr = NULL;

    // `*ptr` dereferences the NULL ptr: undefined behavior
    *ptr = 0;

    return 0;
}

Memory hasn't been allocated for the uint32_t, so evaluating *ptr, which "dereferences" the pointer ptr (in other words, accesses memory at an unallocated address: NULL, usually 0 but implementation-defined), is illegal. It is "undefined behavior", i.e. a bug.

So, you should statically (preferred, where possible), or dynamically allocate space for a uint32_t and then only dereference a pointer which points to valid memory, as follows.

Here is how to statically allocate memory and use it with a pointer. Note that even the memory for the pointer itself is statically allocated in my example:

// allocate enough memory for a 4-byte (32-bit) variable
uint32_t variable;

// allocate enough memory for a pointer, which is **usually** 2 bytes on an
// 8-bit microcontroller such as Arduino, or usually 4 bytes on a 32-bit
// architecture, or usually 8 bytes on a 64-bit Linux computer, for example 
uint32_t* ptr;

// assign the address of `variable` to the pointer; you can now say that
// `ptr` "points to" the variable named `variable`; in literal terms, `ptr` now
// contains the numerical value of the address of the first byte of the
// variable `variable`
ptr = &variable;

// Store a number into the 4-byte variable named `variable`, via a pointer to it
*ptr = 1234;
// OR, same exact thing as just above: store a number into that 4-byte
// variable, but this time via the variable name, `variable`, directly
variable = 1234;

Note, dynamic allocation is fine too, but static memory allocation is safer, deterministic, faster, better for memory-constrained embedded systems, blah blah blah. The point is simply that you cannot legally dereference any pointer (meaning: put an asterisk "dereference operator" in front of it, like *ptr) which does not point to a chunk of allocated memory. I generally allocate memory statically by declaring a variable.
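For completeness, here is a minimal sketch of the dynamic route mentioned above. `malloc` returns a null pointer when the allocation fails, so the result is checked before it is dereferenced. The helper name `store_dynamically` and its `out` parameter are hypothetical, chosen only for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: stores value in heap memory and reads it back.
   Returns 0 on success, -1 if malloc fails. */
int store_dynamically(uint32_t value, uint32_t *out)
{
    uint32_t *ptr = malloc(sizeof *ptr);  /* may return a null pointer */
    if (ptr == NULL) {
        return -1;            /* allocation failed: do not dereference */
    }
    *ptr = value;             /* ptr points to valid, allocated memory */
    *out = *ptr;              /* read the value back through the pointer */
    free(ptr);                /* release the memory when done */
    return 0;
}
```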

Gabriel Staples