-3

When you don't want to (or cannot) initialize a pointer with an address, I often hear people say that you should initialize it with NULL, and that this is good practice.

You can also find people saying this on SO, for example here.

Having worked on many C projects, I don't think it is good practice, or at least not any better than not initializing the pointer at all.

One of my biggest reasons: initializing a pointer with NULL increases the chance of a null pointer dereference, which may crash the whole software, and that's terrible.

So, could you tell me your reasons if you say that it is good practice, or do people just take it for granted (just like "you should always initialize a variable")?

Note that I looked in MISRA C:2004 and did not find any rule or recommendation for this.

Update:

Most of the comments and answers give the same main reason: the pointer could hold a random address before being used, so it's better to have it as NULL so you can find the problem faster.

To make my point clearer: from a practical point of view, I don't think this makes sense in commercial C software nowadays. An uninitialized pointer that gets used will be detected right away by most static analyzers. That's why I prefer to leave it uninitialized: if I initialize it with NULL and a developer then forgets to assign it a "real" address, it passes the static analyzer and causes runtime problems through the null pointer.

Van Tr
  • One of the very first portability bugs I ever fixed was a program that never initialized a certain pointer -- but on the platform it was initially developed on (Sparc, I think), the relevant value just happened to be left on the stack at the right place by the compiler by something else that was running before each invocation. Move to a different architecture, get a different stack layout, *boom*. – Charles Duffy Aug 09 '19 at 21:50
  • Wherever possible I avoid initialising/assigning a pointer to `NULL` by not defining it until needed (e.g. when allocating memory for it to point at) and ensuring it goes out of scope when no longer needed (e.g. when memory it points at is explicitly released). Simple logic: a pointer cannot, accidentally or otherwise, be misused before or after it exists. Too many programmers DEFAULT to allowing pointers to exist beyond their useful life, and then use error-prone workarounds - like setting to and checking for `NULL` before reusing - in the hope of preventing accidental misuse – Peter Aug 10 '19 at 02:27
  • How about that it's undefined behavior twice over if you don't initialize a pointer and then use it instead of once-over if you initialize it to NULL and then use it? – S.S. Anne Aug 10 '19 at 18:28

5 Answers

8

You said

One of my biggest reasons: initializing a pointer with NULL increases the chance of a null pointer dereference, which may crash the whole software, and that's terrible.

I would argue the main reason is exactly this. If you don't initialize pointers to NULL, then when there is a dereferencing error the problem will be a lot harder to find, because the pointer will not be NULL; it will most likely hold a garbage value that looks exactly like a valid pointer.

dangee1705
  • However, if you don't init to NULL, chances are VERY high that it will be NULL anyway. And NOT initializing to NULL increases the chance of getting a warning from the compiler. – klutt Jan 27 '21 at 12:48
  • Whether or not it's automatically set to null is probably implementation dependent so it's probably best just to set it – dangee1705 Jan 27 '21 at 13:08
  • I'm not too sure that it's best. `int *p; *p=42;` will generate a compiler warning while `int *p=NULL; *p=42;` will not. – klutt Jan 27 '21 at 13:16
7

C has very little runtime error checking, but NULL is guaranteed not to refer to a valid address, so a runtime environment (typically an operating system) is able to trap any attempt to de-reference a NULL pointer. The trap will identify the point at which the de-reference occurs rather than the point at which the program may eventually fail, making identification of the bug far easier.

Moreover, when debugging, an uninitialised pointer with random content may not be easily distinguishable from a valid pointer - it may refer to a plausible address, whereas NULL is always an invalid address.

If you de-reference an uninitialised pointer the result is non-deterministic - it may crash, it may not, but it will still be wrong.

If it does crash you cannot tell how or even when, since it may result in corruption of data, or reading of invalid data that may have no effect until that corrupted data is later used. The point of failure will not necessarily be the point of error.

So the purpose is that you will get deterministic failure, whereas without initialising, anything could happen - including nothing, leaving you with a latent undetected bug.

One of my biggest reasons: initializing a pointer with NULL increases the chance of a null pointer dereference, which may crash the whole software, and that's terrible.

Deterministic failure is not "terrible" - it increases your chance of finding the error during development, rather than having your users find it after deployment. What you are effectively suggesting is that it is better to leave the bugs in and hide them. A dereference of NULL will typically be trapped by the platform; de-referencing an uninitialised pointer cannot be relied upon to trap at all.

That said, initialising with NULL should only be done if, at the point of declaration, you cannot directly assign an otherwise valid value. That is to say, for example:

char* x = malloc( y ) ;

is much preferable to:

char* x = NULL ;

...

x = malloc( y ) ;

which is in turn preferable to:

char* x ;

...

x = malloc( y ) ;

Note that I looked in MISRA C:2004 and did not find any rule or recommendation for this.

MISRA C:2004, 9.1 - All automatic variables shall have been assigned a value before being used.

That is to say, there is no guideline to initialise to NULL; the guideline is simply that initialisation is required. As I said, initialisation to NULL is not preferable to initialisation with a valid pointer. Don't blindly follow the "must initialise to NULL" advice, because the rule is simply "must initialise", and only sometimes is the appropriate initial value NULL.

Clifford
  • I do not agree that the last snippet is preferable over the one before last. Consider there are several codepaths assigning `x`, and the programmer has missed one path, so it is never assigned. With the last approach a decent compiler will warn about the use of an uninitialized variable. In the one before last it will not. Default values are not always desired. – Eugene Sh. Aug 09 '19 at 20:27
  • @EugeneSh. : However it is my experience that compilers generally only perform uninitialised variable detection statically when optimisation is applied, because it is then that the necessary _abstract execution_ analysis occurs. Even then only in simple cases - passing the null pointer to a different function in a different translation unit then de-referencing it will not be detected. Moreover since in debug/development you would generally not apply optimisation (at least not if you use a symbolic debugger), then on balance I'd still advocate the second over the third. – Clifford Aug 09 '19 at 20:42
  • *C has very little runtime error checking, but one thing it will trap is de-referencing a null pointer* Strictly speaking this is the OS's doing, not C's. (And while it's pretty universally true today, it wasn't always. Back in the days of V7 Unix, and early BSD also, page 0 was mapped in to your address space, so accessing null pointers was fine -- it was only *writing* there that caused crashes. And there was even code -- utterly misguided, to be sure -- that tested for null pointers by doing `if(strcmp(p, "\7\1") == 0)`, because `0x0107` was known to be the bit pattern at location 0...) – Steve Summit Aug 09 '19 at 20:43
  • I find this answer a bit troubling. It looks like you are implying that dereferencing a null pointer has a determined behavior. It doesn't. The standard specifically says it's UB. – klutt Feb 01 '22 at 11:04
  • @klutt I think it is pretty clear that there is no guarantee - in the first sentence for example - I am not sure I am implying anything is deterministic. In answer to the question, you do it because it provides an _opportunity_ for trapping the error. That is all I am saying. If you do not initialise, anything can happen, if you do initialise, then the platform _may_ provide deterministic behaviour even if the language does not guarantee it. That is no different that seg-fault or divide by zero behaviour - the language makes no promises, the platform might. – Clifford Feb 01 '22 at 11:28
  • @Clifford I think that *"If you de-reference an uninitialised pointer the result is non-deterministic"* is a pretty implicit way of saying *"If you de-reference an initialised pointer the result is deterministic"*. I know it does not mean this logically, but that is how such sentences usually are perceived. I think you should add something that explicitly says that this relies on how modern operating systems handles dereferencing of null pointers, but that the C standard explicitly says that it's UB. – klutt Feb 01 '22 at 11:34
  • @Clifford I understand that it is not your intention, but your explanation sounds like the good old excuse *"Technically, that's not what I said..."* ;) – klutt Feb 01 '22 at 11:36
  • I think Steve Summit's comment was fair and you can see that the line he quoted was changed. Perhaps it is not quite as emphatic as you might like, I accept that point and will consider changes. It does rather suggest that the C implementation and not the runtime environment is performing the suggested trap. On the whole (and you have to read the whole), I stand by this answer. – Clifford Feb 01 '22 at 17:43
  • There's a real-world analogy here which I think is instructive: In electrical systems, all pieces of metal which *don't* carry current are supposed to be well-grounded. This is so that, if a fault causes a live wire to come into contact with an unintended piece of metal, this will (a) create a short circuit which should (b) cause a circuit breaker to trip and therefore (c) *not* leave the piece of metal energized, ready to shock someone. – Steve Summit Mar 05 '23 at 20:17
  • (cont'd) This practice isn't *guaranteed* to work — there are faults it won't "catch" — but it helps, a lot. Similarly, setting pointers to NULL to ensure they can't be accidentally used helps a lot, too. – Steve Summit Mar 05 '23 at 20:17
0

One issue is that given a pointer variable p, there is no way defined by the C language to ask, "does this pointer point to valid memory or not?" The pointer might point to valid memory. It might point to memory that (once upon a time) was allocated by malloc, but that has since been freed (meaning that the pointer is invalid). It might be an uninitialized pointer, meaning that it's not even meaningful to ask where it points (although it is definitely invalid). But, again, there's no way to know.

So if you're bopping along in some far-off corner of a large program, and you want to say

if(p is valid) {
    do something with p;
} else {
    fprintf(stderr, "invalid pointer!\n");
}

you can't do this. Once again, the C language gives you no way of writing if(p is valid).

So that's where the rule to always initialize pointers to NULL comes in. If you adopt this rule and follow it faithfully, initializing every pointer either to NULL or to a pointer to valid memory, and immediately following every call to free(p) with p = NULL;, then you can achieve a decent way of asking "is p valid?", namely:

if(p != NULL) {
    do something with p;
} else {
    fprintf(stderr, "invalid pointer!\n");
}

And of course it's very common to use an abbreviation:

if(p) {
    do something with p;
} else {
    fprintf(stderr, "invalid pointer!\n");
}

Here most people would read if(p) as "if p is valid" or "if p is allocated".


Addendum: This answer has attracted some criticism, and I suppose that's because, to make a point, I wrote some unrealistic code which some people are reading more into than I intended. The idiom I'm advocating here is not so much valid pointers versus invalid pointers, but rather, pointers I have allocated versus pointers I have not allocated (yet). No one writes code that simply detects and prints "invalid pointer!" as if to say "I don't know where this pointer points; it might be uninitialized or stale". A more realistic way of using the idiom is to do something like

/* ensure allocation before proceeding */
if(p == NULL)
    p = malloc(...);

or

if(p == NULL) {
    /* nothing to do */
    return;
}

or

if(p == NULL) {
    fprintf(stderr, "null pointer detected\n");
    exit(EXIT_FAILURE);
}

(And in all three cases the abbreviation if(!p) is popular as well.)

But, of course, if what you're trying to discriminate is pointers I have allocated versus pointers I have not allocated (yet), it is vital that you initialize all your unallocated pointers with the explicit marker you're using to record that they're unallocated, namely NULL.

Steve Summit
  • This answer is dangerously wrong and answers like this have led to large amounts of pain. The claim above is simply not true. The most obvious reason it's not true is because sometimes two pointers both point to the same object. So `free(p); p=NULL;` doesn't set the other pointer to NULL. Therefore, even if you follow all those rules, you cannot tell if a pointer points to something valid by comparing it to NULL. It is madness to suggest this! – David Schwartz Aug 09 '19 at 20:44
  • @DavidSchwartz Well, yes, this rule only works if you've also licked the pointer aliasing problem. But if you *haven't* licked that problem, then you're going to have horrible problems whether you use the `if(p != NULL)` idiom or not! (For example, every time you say `p = realloc(p, newsize)`.) – Steve Summit Aug 09 '19 at 21:01
  • @DavidSchwartz: That smells like a design problem to me, though - if you have multiple pointers to the same object, then one of two things needs to be true: 1) that object is persistent over the lifetime of the all the things pointing to it, or 2) there needs to be a way to keep the pointer values in sync - if one is updated, then they *all* need to be updated (or invalidated). But yeah, Steve makes the case a little too strongly - you can't definitively test that a pointer is valid, but you can definitively test if it's *invalid* if you're consistent about setting unused pointers to `NULL`. – John Bode Aug 09 '19 at 21:01
  • @SteveSummit Right. It's a complex discipline and the answer above suggests that it's simple. Simply put, code has to know whether a pointer points to a valid object or not somehow, and sometimes whether or not it's NULL is the right way, but not even most of the time is that the best way. – David Schwartz Aug 09 '19 at 21:32
0

If you don't initialize the pointer, it can have any value, including possibly NULL. It's hard to imagine a scenario where having an unknown value, possibly NULL, is preferable to definitely having a NULL value. The logic is that it's better to at least know its value than to have its value depend unpredictably on who knows what, possibly resulting in the code behaving differently on different platforms, with different compilers, and so on.

I strongly disagree with any answer or argument based on the idea that you can reliably use a test for NULL to tell if a pointer is valid or not. You can set a pointer to NULL and then test it for NULL within a limited context where that is known to be safe. But there will always be contexts where more than one pointer points to the same thing and you cannot ensure that every possible pointer to an object will be set to NULL at the very place the object is freed. It is simply an essential C programming discipline to understand that a pointer may or may not point to a valid object depending on what is going on in the code.

David Schwartz
  • If the code behaves differently because of this, then there is something seriously wrong with the code that an initialization will not fix. I totally agree with your second paragraph. – klutt Feb 01 '22 at 11:44
-1

One of my biggest reasons: initializing a pointer with NULL increases the chance of a null pointer dereference, which may crash the whole software, and that's terrible.

Which is why you add a check against NULL before using that pointer value:

if ( p ) // p != NULL
{
  // do something with p
}

NULL is a well-defined invalid pointer value, guaranteed to compare unequal to any object or function pointer value. It's a well-defined "nowhere" that's easy to check against.

Compare that to the indeterminate value that the uninitialized pointer1 may have - most likely, it will also be an invalid pointer value that will lead to a crash as soon as you try to use it, but it's almost impossible to determine that beforehand. Is 0xfff78567abcd2220 a valid or invalid pointer value? How would you check that?

Obviously, you should do some analysis to see if an initialization is required. Is there a risk of that pointer being dereferenced before you assign a valid pointer value to it? If not, then you don't need to initialize it beforehand.

Since C99, the proper answer has been to defer instantiating a pointer (or any other type of object, really) until you have a valid value to initialize it with:

void foo( void )
{
  printf( "Gimme a length: " );
  int length;
  scanf( "%d", &length );
  char *buf = malloc( sizeof *buf * length );
  ...
}

ETA

I added a comment to Steve's answer that I think needs to be emphasized:

There's no way to determine if a pointer is valid - if you receive a pointer argument in a function like

void foo( int *ptr )
{
  ...
}

there is no test you can run on ptr to indicate that yes, it definitely points to an object within that object's lifetime and is safe to use.

By contrast, there is an easy, standard test to indicate that a pointer is definitely invalid and unsafe to use, and that's by checking that its value is NULL. So you can at least avoid using pointers that are definitely invalid with the

if ( p )
{
  // do something with p
}

idiom.

Now, just because p isn't NULL doesn't automatically mean it's valid, but if you're consistent and disciplined about setting unused pointers to NULL, then the odds are pretty high that it is.

This is one of those areas where C doesn't protect you, and you have to devote non-trivial amounts of effort to make sure your code is safe and robust. Frankly, it's a pain in the ass more often than not. But being disciplined with using NULL for inactive pointers makes things a little easier.

Again, you have to do some analysis and think about how pointers are being used in your code. If you know you're going to set a pointer to valid value before it's ever read, then it's not critical to initialize it to anything in particular. If you have multiple pointers pointing to the same object, then you need to make sure if that object ever goes away that all those pointers are updated appropriately.


  1. This assumes that the pointer in question has auto storage duration - if the pointer is declared with the static keyword or at file scope, then it is implicitly initialized to NULL.

John Bode