Evaluating a condition containing an uninitialized pointer - UB, but can it crash?

Question

Somewhere on the forums I encountered this:

Any attempt to evaluate an uninitialized pointer variable
invokes undefined behavior. For example:

int *ptr; /* uninitialized */
if (ptr == NULL) ...; /* undefined behavior */

What is meant here? Does it mean that if I ONLY write:

if(ptr==NULL){int t;};

this statement is already UB? Why? I am not dereferencing the pointer, right? (I noticed there may be a terminology issue: by UB in this case I meant, will my code crash JUST because of the if check?)

What do you expect the result of the `if (ptr == NULL)` statement to be? `true` or `false`? Maybe `true`, maybe `false` => undefined – Dabo, Mar 17 '14 at 21:08
But `ptr` is uninitialized, so there is some chance it will be `NULL`, just as there is a chance it won't be. It won't cause the program to crash, but the behavior is undefined – Dabo, Mar 17 '14 at 21:22
It is UB because the standard says that it is UB. You might not be able to think of a reason why it should be UB, but the standard says so, so there's not much point arguing about it. If you don't like it, you are welcome to write in some other programming language whose standard you agree with. – Raymond Chen, Mar 17 '14 at 21:57
There is no way to check if a pointer is valid. You need to keep track yourself of whether a variable contains a valid value. – Raymond Chen, Mar 18 '14 at 15:55
When optimizing compilers detect undefined behaviour like this, they can assume that the code is unreachable and then proceed to eliminate the code paths leading to that UB. – CodesInChaos, Mar 02 '15 at 10:50

LihO · Answer 1 · 2014-03-17T23:06:03.440

3

Using uninitialized variables invokes undefined behavior. It doesn't matter whether the variable is a pointer or not.

int i;
int j = 7 * i;

is undefined as well. Note that "undefined" means that anything can happen, including a possibility that it will work as expected.

In your case:

int *ptr;
if (ptr == NULL) { int i = 0; /* this line does nothing at all */ }

ptr might contain anything: it can be some random trash, but it can be NULL too. This code will most likely not crash, since you are just comparing the value of ptr to NULL. We don't know whether execution enters the condition's body or not, and we can't even be sure that a value will be read successfully; therefore, the behavior is undefined.

edited Mar 17 '14 at 23:06

answered Mar 17 '14 at 21:08

LihO


OK, I think this may be a terminology issue. Asked another way: will the sample code I provided crash, only because of the 'ptr==NULL' check? – Mar 17 '14 at 21:12
Yes, that was my point: I am just comparing ptr to NULL, so I don't expect it to crash (it seems to be a terminology thing; for some reason I treated UB as synonymous with a crash in this case) – Mar 17 '14 at 21:16
*The only problem is that we don't know if the execution enters the condition's body or not* Although this is likely to be true in practice, as far as the standard is concerned, no, this is not the problem - the problem is that an uninitialized variable is being used for something other than writing to it - so yes, theoretically it can crash. As mentioned, *anything* (including working as expected or crashing) can happen. – Filipe Gonçalves Mar 17 '14 at 21:36
@FilipeGonçalves: This specific piece of code just accesses a value stored in memory that belongs to you. And although the value is not known, this cannot cause a crash. – LihO Mar 17 '14 at 21:39
@LihO: I also thought like that .. :( – Mar 17 '14 at 21:48
3

Accessing an uninitialized value can result in a crash on platforms where registers are tagged or are special-purpose. For example, on a segmented architecture, loading an invalid segment register raises an exception *even if you don't use the segment register to access memory*. More generally, somebody might write a "super strict" C compiler that puts special values in uninitialized variables and raises a runtime error if that special value is ever loaded. – Raymond Chen Mar 17 '14 at 21:59
Indeed, *"The only problem is that we don't know if the execution enters the condition's body or not"*, is not the only problem. We don't know if it's legal to compare the value the uninitialized pointer has (unless we specify CPU and memory model where all values are legal for comparison). – hyde Mar 17 '14 at 22:23
@RaymondChen: Fair enough. Although the situation that you describe would be solely implementation dependent. – LihO Mar 17 '14 at 22:32
@hyde: I've changed the wording so that it's not misleading. – LihO Mar 17 '14 at 22:33
All UB is implementation-dependent. That's sort of the point of UB. It tells the implementation "You are allowed to do anything you want when this happens." – Raymond Chen Mar 17 '14 at 23:01
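
The "super strict" idea from the comments above can be imitated by hand in portable C. What follows is only a sketch under stated assumptions: `checked_ptr`, `CHECKED_PTR_UNSET`, `checked_set`, `checked_get` and `checked_is_null` are hypothetical names invented for this illustration, not part of any library, and the explicit flag stands in for whatever a strict implementation might track automatically.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: a pointer plus a flag recording whether it was ever set. */
typedef struct {
    int *value;
    bool is_set;
} checked_ptr;

/* Initializer for a wrapper whose pointer has not been assigned yet. */
#define CHECKED_PTR_UNSET { NULL, false }

/* Store a pointer (which may be NULL) and mark the wrapper as set. */
void checked_set(checked_ptr *cp, int *p)
{
    cp->value = p;
    cp->is_set = true;
}

/* Return the stored pointer; abort if nothing was ever stored. */
int *checked_get(const checked_ptr *cp)
{
    if (!cp->is_set) {
        fprintf(stderr, "read of a pointer that was never set\n");
        abort();
    }
    return cp->value;
}

/* The comparison from the question, routed through the check above. */
bool checked_is_null(const checked_ptr *cp)
{
    return checked_get(cp) == NULL;
}
```

Here the runtime error is something the program produces on purpose, so every path has defined behavior. With a plain uninitialized `int *ptr`, nothing in the language promises either the comparison result or the error.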

Chris Maes · Answer 2 · 2014-03-17T21:15:24.817

2

Your pointer is not initialized. Your statement would be the same as:

int a;
if (a == 3){int t;}

Since a is not initialized, its value can be anything, so you have undefined behavior. It doesn't matter whether you dereference your pointer or not. If you did dereference it, you would most likely get a segfault.

edited Mar 17 '14 at 21:15

answered Mar 17 '14 at 21:08

Chris Maes


Shafik Yaghmour · Answer 3 · 2014-03-17T21:32:28.367

2

The C99 draft standard clearly says it is undefined in Annex J.2 Undefined behavior:

The value of an object with automatic storage duration is used while it is indeterminate (6.2.4, 6.7.8, 6.8).

and the main text has an example that says the same thing in section 6.5.2.5 Compound literals, paragraph 17:

Note that if an iteration statement were used instead of an explicit goto and a labeled statement, the lifetime of the unnamed object would be the body of the loop only, and on entry next time around p would have an indeterminate value, which would result in undefined behavior.

and the draft standard defines undefined behavior as:

behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

and notes that:

Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

edited Mar 17 '14 at 21:32

answered Mar 17 '14 at 21:17

Shafik Yaghmour


It seems I am having a problem with what "undefined behavior" is, but the code will not crash, right? Just because of the if(ptr==NULL) check? – Mar 17 '14 at 21:18
@dmcr_code UB means `unpredictable results`; let me quote the standard's definition. – Shafik Yaghmour Mar 17 '14 at 21:19
@dmcr_code if you read some of John Regehr's work such as [Finding Undefined Behavior Bugs by Finding Dead Code](http://blog.regehr.org/archives/970) you will see it can do much worse things than crash. Compilers can optimize away code that they know has UB, with very bad results in some cases. – Shafik Yaghmour Mar 17 '14 at 21:23
I find it weird that just checking the value of the pointer, without dereferencing it, may already crash a program ... :/ – Mar 17 '14 at 21:25
@dmcr_code it probably won't crash but in many ways that is worse, at least when it crashes you know something is wrong. – Shafik Yaghmour Mar 17 '14 at 21:29
But I've often seen people doing checks like if(!p) and then returning from the function, or trying not to dereference such pointers... hmm, I am confused... How do you check whether a pointer is valid, then? – Mar 17 '14 at 21:35
@dmcr_code You can use `p` without restrictions after you initialize it. Before initializing it, the only valid operation is to assign to it - anything other than that invokes undefined behavior, period. – Filipe Gonçalves Mar 17 '14 at 21:38
@FilipeGonçalves: By assigning, do you mean that if I write int *p = 0; then I can check if(p==NULL) and it won't be UB? – Mar 17 '14 at 21:47
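
To make the last exchange concrete: once the pointer has been given a value, including a null pointer, comparing it is well defined. A minimal sketch, where `value_or` and `null_check_demo` are hypothetical names made up for this illustration:

```c
#include <stddef.h>

/* Hypothetical helper: returns *p, or fallback when p is a null pointer. */
int value_or(const int *p, int fallback)
{
    return (p != NULL) ? *p : fallback;
}

int null_check_demo(void)
{
    int x = 5;
    int *p = NULL;                 /* initialized, so testing p is well defined */
    int before = value_or(p, -1);  /* p is null: yields the fallback, -1 */

    p = &x;                        /* now points at a live object */
    int after = value_or(p, -1);   /* yields 5 */

    return before + after;         /* -1 + 5 == 4 */
}
```

This is the usual reason the `if (!p)` idiom works: it relies on the convention that every pointer is set to either a null pointer or a valid address before it is first read. The check cannot detect a pointer that was never initialized.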

score 1 · Answer 4 · edited May 23 '17 at 11:51

As Shafik has pointed out, the C99 standard draft declares any use of uninitialized variables with automatic storage duration to be undefined behaviour. That amazes me, but that's how it is. My rationale for pointer use comes below, but similar reasons must be true for other types as well.

After int *pi; if (pi == NULL){} your program is allowed to do arbitrary things. In reality, on PCs, nothing will happen. But there are architectures out there that have illegal address values, much like NaN floats, which cause a hardware trap when they are loaded into a register. These architectures, unheard of to us modern PC users, are the reason for this provision. Cf. e.g. How does a hardware trap in a three-past-the-end pointer happen even if the pointer is never dereferenced?.

Kenneth Wilke · Answer 5 · 2014-03-17T22:05:15.173

The behavior of this is undefined because of how the stack is used for various function calls. When a function is called, the stack grows to make space for the variables within the scope of that function, but this memory is not cleared or zeroed out.

This can be shown to be unpredictable in code like the following:

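#include <stdio.h>

/* Deliberately reads an uninitialized pointer: this is undefined
   behavior and is shown only to illustrate the point. */
void test(void)
{
    int *ptr;
    printf("ptr is %p\n", (void *)ptr); /* %p expects a void * argument */
}

void another_test(void)
{
    test();
}

int main(void)
{
    test();
    test();
    another_test();
    test();
    return 0;
}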

This simply calls the test() function multiple times; each call prints the value that 'ptr' happens to hold (not where 'ptr' lives in memory). You might expect to get the same result each time, but as the stack is manipulated, the slot used for 'ptr' can move or be overwritten, and the data found there is unknown in advance.

On my machine running this program results in this output:

ptr is 0x400490
ptr is 0x400490
ptr is 0x400575
ptr is 0x400585

To explore this a bit more, consider the possible security implications of using pointers that you have not intentionally set yourself:

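#include <stdio.h>

/* Deliberately reads an uninitialized pointer: undefined behavior. */
void test(void)
{
    int *ptr;
    printf("ptr is %p\n", (void *)ptr); /* %p expects a void * argument */
}

void something_different(void)
{
    /* Never read; it only leaves a recognizable value behind on the stack. */
    int *not_ptr_or_is_it = (int*)0xdeadbeef;
}

int main(void)
{
    test();
    test();
    something_different();
    test();
    return 0;
}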

This results in output that is predictable on this machine even though the behavior is undefined. It is undefined because on some machines this will behave the same way and on others it might not work at all; it is part of the magic that happens when your C code is converted to machine code:

ptr is 0x400490
ptr is 0x400490
ptr is 0xdeadbeef

score 0 · Answer 6 · answered Aug 03 '16 at 17:20

Some implementations may be designed in such a way that an attempted rvalue conversion of an invalid pointer may cause arbitrary behavior. Other implementations are designed in such a way that an attempt to compare any pointer object with null will never do anything other than yield 0 or 1.

Most implementations target hardware where pointer comparisons simply compare bits without regard for whether those bits represent valid pointers. The authors of many such implementations have historically considered it so obvious that a pointer comparison on such hardware should never have any side-effect other than to report that pointers are equal or report that they are unequal that they seldom bothered to explicitly document such behavior.

Unfortunately, it has become fashionable for implementations to aggressively "optimize" Undefined Behavior by identifying inputs that would cause a program to invoke UB, assuming such inputs cannot occur, and then eliminating any code that would be irrelevant if such inputs were never received. The "modern" viewpoint is that because the authors of the Standard refrained from requiring side-effect-free comparisons on implementations where such a requirement would impose significant expense, there's no reason compilers for any platform should guarantee them.
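
As a rough illustration of the reasoning described above (a sketch, not a claim about what any particular compiler does): in the hypothetical function `describe` below, `p` is assigned only when `flag` is nonzero. Calling it with `flag == 0` would read an indeterminate pointer, so an optimizer is permitted to treat that case as impossible and drop the `p == NULL` branch altogether. The names `target` and `describe` are invented for this example.

```c
#include <stddef.h>

static int target = 42;

/* Hypothetical example: p is assigned only when flag is nonzero. */
int describe(int flag)
{
    int *p;                /* indeterminate until assigned */

    if (flag)
        p = &target;

    /* When flag == 0 this comparison reads an indeterminate value,
       which is undefined behavior. A compiler may therefore reason
       as if p == &target always held and remove this branch. */
    if (p == NULL)
        return -1;

    return *p;
}
```

Only calls with a nonzero `flag` have defined behavior here; `describe(1)` returns 42. Compilers may also warn that `p` might be used uninitialized, which is the cheapest way to find out about the problem.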

score -2 · Answer 7 · edited May 23 '17 at 12:08


You're not dereferencing the pointer, so you don't end up with a segfault. It will not crash. I don't understand why anyone thinks that comparing two numbers will crash. It's nonsense. So again:

IT WILL NOT CRASH. PERIOD.

But it's still UB. You don't know what memory address the pointer contains. It may or may not be NULL. So your condition if (ptr == NULL) may or may not evaluate to true.

Back to my IT WILL NOT CRASH statement. I've just tested the pointer going from 0 to 0xFFFFFFFF on the 32-bit x86 and ARMv6 platforms. It did not crash.

I've also tested the 0..0xFFFFFFFF and 0xFFFFFFFF00000000..0xFFFFFFFFFFFFFFFF ranges on an amd64 platform. Checking the full range would take a few thousand years I guess.
Again, it did not crash.

I challenge the commenters and downvoters to show a platform and value where it crashes. Until then, I'll probably be able to survive a few negative points.

There is also a SO link to trap representation which also indicates that it will not crash.

edited May 23 '17 at 12:08

Community


answered Mar 17 '14 at 21:09

SzG


"may or may not evaluate to true" yes but will it crash? just because of the check? – Mar 17 '14 at 21:15
No, it will not crash. That's what my first sentence is about. – SzG Mar 17 '14 at 21:19
There are 7 responses. Comparison in C is very straightforward, as it's just a glorified assembler: it compares `ptr` with `NULL` bit-by-bit. I can't imagine how that could cause a crash. – SzG Mar 18 '14 at 07:38
Failed speculative execution on the Itanium processor causes registers to be tagged as invalid. Attempting to use an invalid register raises a trap (which will cause your process to crash). – Raymond Chen Feb 14 '15 at 14:29
