
Fetching the value of an invalid pointer is implementation-defined behavior in C++ according to this. Now consider the following C program:

#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    int* p=(int*)malloc(sizeof(int));
    *p=3;
    printf("%d\n",*p);
    printf("%p\n",(void*)p);
    free(p);
    printf("%p\n",(void*)p); // Is this undefined or implementation-defined behaviour in C?
}

Is the behaviour the same in C? Is the behaviour of the above C program undefined or implementation-defined? What do the C99 and C11 standards say about this? Please tell me if the behaviour differs between C99 and C11.

Destructor
  • First, in C you should not cast the value returned by `malloc`. As for your question, I don't know whether it is specified or not, but `free` doesn't change the content of `p`. It is always valid to read (or change, of course) its content, as for any variable (that was initialized before, of course, which is the case here). The only forbidden action is to dereference its value, as it was freed. – hexasoft Nov 07 '15 at 16:20
  • It is not undefined. `p` still has a valid address of its own; it simply points to nothing. You will generally set `p = NULL;` before using (or allocating) with `p` again, but you can still access the address of `p` itself. As above, your proper allocation of `p` to begin with is `int *p = malloc (sizeof *p);` – David C. Rankin Nov 07 '15 at 16:21
  • Why wouldn't it be *undefined*? If the `malloc()` implementation were to use `mmap()` to satisfy the `malloc()` request, it could very well `munmap()` the memory the pointer addresses upon the call to `free()`. The pointer could very well contain an invalid address. – Andrew Henle Nov 07 '15 at 16:22
  • I don't say the target address is valid/defined. But `p` *is* valid in the scope of `main()` and can be used as any other variable, pointer or not. You can still print, add, set it. But its *destination* is not valid and undefined, so trying to dereference it would lead to undefined behavior, of course. – hexasoft Nov 07 '15 at 16:30
  • You should probably edit the question to ask about dereferencing `p` after `free(p)`, because as written your code doesn't do anything bad. – Peter Cordes Nov 07 '15 at 16:30
  • @PeterCordes That'd be too easy, and then we wouldn't be exploring the boundaries of the C standard. – Andrew Henle Nov 07 '15 at 16:34
  • @hexasoft no, that's unfortunately wrong. If a pointer doesn't point to valid memory (but it is not `NULL`), it's undefined behavior to even inspect the value of the pointer itself. – The Paramagnetic Croissant Nov 07 '15 at 16:37
  • @DavidC.Rankin it **is** undefined. It's forbidden to inspect the value of a pointer (other than `NULL`) that does not point to valid memory. – The Paramagnetic Croissant Nov 07 '15 at 16:38
  • @AndrewHenle: oh yes, I see from the title that this was in fact in the intended question. It's pretty clear. Clearly I'm too much of an asm geek to even consider the possibility that the answer wasn't a trivial "no, it's allowed". – Peter Cordes Nov 07 '15 at 16:38
  • @TheParamagneticCroissant *It's forbidden to inspect the value of a pointer (other than NULL) that does not point to valid memory* Can you provide the cite? I haven't been able to find that in the standard, other than my answer referencing 6.2.4 and the value becoming indeterminate. – Andrew Henle Nov 07 '15 at 16:43
  • Well, once I wrote a program that computes addresses of symbols in objects (based on `nm` values and `/proc/PID/maps` content), for a coverage tool. These addresses were stored in pointers, with arithmetic on them (to find symbols for debug). These pointers don't point to anything real. How would you manage addresses outside of your scope without using pointers? – hexasoft Nov 07 '15 at 16:44
  • @hexasoft using `intptr_t` / `uintptr_t`. – The Paramagnetic Croissant Nov 07 '15 at 16:45
  • @AndrewHenle Searching… – The Paramagnetic Croissant Nov 07 '15 at 16:45
  • @TheParamagneticCroissant I think Peter Cordes said it best in his comment on my answer - some hardware *could* treat pointers as a special type of data, and accessing an indeterminate pointer value *could* trigger a trap. Hence, UB. – Andrew Henle Nov 07 '15 at 16:48

4 Answers


Expanding on Andrew Henle's answer:

From the C99 Standard, 6.2.4:

An object has a storage duration that determines its lifetime. There are three storage durations: static, automatic, and allocated. Allocated storage is described in 7.20.3. […] The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

Then, in 7.20.3, the standard goes on to describe malloc(), calloc() and free(), mentioning in 7.20.3.2 that

The free function causes the space pointed to by ptr to be deallocated.

In 3.17.2:

indeterminate value

either an unspecified value or a trap representation

In 6.2.6.1, paragraph 5:

Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. […] Such a representation is called a trap representation.

Since the pointer's value becomes indeterminate, an indeterminate value can be a trap representation, `p` is an lvalue, and reading a trap representation through an lvalue of non-character type is undefined, the behavior may indeed be undefined.
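
The "does not have character type" clause quoted from 6.2.6.1 suggests one way to look at `p` after `free(p)` without reading it as a pointer: copy its bytes out as `unsigned char`. What follows is a minimal sketch, not a definitive reading of the standard. The names `pointer_bytes` and `demo_pointer_bytes` are hypothetical and introduced only for illustration, and the idea that the bytes stay unchanged across `free` is an assumption that tends to hold on typical implementations but is not promised by the standard.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the object representation of a pointer
 * variable into a byte buffer. Only character-type accesses touch the
 * pointer object; its value as a pointer is never read. */
static void pointer_bytes(int *const *pp, unsigned char out[sizeof(int *)])
{
    memcpy(out, pp, sizeof *pp);
}

/* Hypothetical demo: returns -1 if malloc fails, otherwise 1 if the bytes
 * of p were the same before and after free(p), and 0 if they differed. */
static int demo_pointer_bytes(void)
{
    int *p = malloc(sizeof *p);
    unsigned char before[sizeof p];
    unsigned char after[sizeof p];

    if (p == NULL)
        return -1;

    pointer_bytes(&p, before);  /* p still holds a valid pointer value */
    free(p);
    pointer_bytes(&p, after);   /* p is indeterminate; only its bytes are read */

    /* Print the raw bytes instead of passing p itself to printf("%p"). */
    for (size_t i = 0; i < sizeof after; i++)
        printf("%02x", (unsigned)after[i]);
    putchar('\n');

    return memcmp(before, after, sizeof p) == 0;
}
```

Note that `&p` is the address of the variable `p`, which stays valid throughout; only the value stored in `p` becomes indeterminate.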

  • The wording of 6.2.6.1.5 implies that it's only undefined behaviour if `p` actually does hold a trap representation. Since normal platforms don't have any trap representation for pointers, it sounds to me like it's only undefined behaviour on those very odd platforms that might do something like check a pointer's validity when it's loaded into a special pointer register. (Maybe to allow implementations to do virtual-to-physical translations once at pointer-load time, rather than on each dereference?) On normal CPUs like x86, only FP types have trap representations. – Peter Cordes Nov 07 '15 at 17:22
  • @PeterCordes ah, yes, you are right! I've amended the wording of my answer. – The Paramagnetic Croissant Nov 07 '15 at 17:23
  • In summary, unlike most undefined behaviour, this is target-dependent. So there's no risk of the compiler doing weird stuff unless compiling for very esoteric hardware. There's probably no hardware that actually *does* work this way. I'd guess the C standard just ends up covering this case because of how it chooses to define terms and word things. (i.e. pointers to freed memory falling into the same category as uninitialized FP data.) Anyway, very nice job digging up all the jigsaw pieces to make this answer. – Peter Cordes Nov 07 '15 at 17:27
  • @PeterCordes "There's probably no hardware that actually does work this way" – sure, I wasn't asserting the converse. It's just that it can *potentially* be undefined regardless, and from a language-lawyer point of view, that's the only thing that matters. – The Paramagnetic Croissant Nov 07 '15 at 17:29
  • Yup, exactly. That's why it was important to realize that it not only happens to work on real hardware, it's not in fact undefined behaviour at all on any normal CPU. If you're targeting super-weird hardware where this comes up, this will be one of many porting problems, so IMO it's typically not a problem to write code that would have undefined behaviour on such systems. – Peter Cordes Nov 07 '15 at 17:36

Per the C standard, section 6.2.4:

The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

Andrew Henle
  • The relevant sentence is the last one, not the one you bolded. The pointed-to object is not accessed here, only the pointer itself is. – interjay Nov 07 '15 at 16:25
  • Agreed. I'll change the bolding. Although in practice nothing happens. This is rather pedantic. – Andrew Henle Nov 07 '15 at 16:26
  • But here he just prints the content of `p`. Its content still exists (because `p` still exists in the context of `main`), even if the address doesn't mean anything. It can still be used for other purposes (I don't know which purpose, and it seems a bad idea, but `p` is usable). – hexasoft Nov 07 '15 at 16:27
  • @hexasoft - Yeah, it's rather pedantic and I'm willing to change the answer or even delete it if someone comes up with a better explanation. – Andrew Henle Nov 07 '15 at 16:28
  • It still isn't clear from this answer whether accessing an indeterminate value is allowed. – interjay Nov 07 '15 at 16:29
  • @interjay - Agreed. It seems similar to accessing uninitialized data. I don't know where that's covered in the C standard or even if it's covered at all. Again, if someone finds a better answer I have no problem conceding that. – Andrew Henle Nov 07 '15 at 16:31
  • Can you dig up the part of the standard that says what it means for a pointer to be indeterminate? Is it possible that becoming indeterminate can change a pointer's value? (I can't think of any actual implementation where that would happen, on hardware that was anything like a normal CPU with registers and memory.) Or does it just mean you can't safely dereference it or get any useful information from comparing it to anything? – Peter Cordes Nov 07 '15 at 16:33
  • Not sure it is covered. A pointer (as a variable) doesn't need to contain a real destination (e.g. people shifting a pointer back to have arrays that start at 1 and not 0 (ugly, but it works…)). It is the dereferencing action that gives meaning to the pointer, in my opinion. – hexasoft Nov 07 '15 at 16:35
  • @PeterCordes - "indeterminate value" is defined in **3.19.2** as *either an unspecified value or a trap representation*. If you follow the definitions of those, it seems to dead-end. – Andrew Henle Nov 07 '15 at 16:38
  • @AndrewHenle Why is it a dead end? If it's a trap representation, then clearly it is UB to use it. – The Paramagnetic Croissant Nov 07 '15 at 16:42
  • @TheParamagneticCroissant says it's undefined behaviour to even inspect the value of a pointer that doesn't point to valid memory. Maybe some hardware treats pointers differently from integers, and e.g. loading a value into a pointer register does access checking on it? (And a compiler might copy pointer values around by loading/storing to/from pointer registers?) That's like having a dynamic definition of a trap representation, I guess. Super weird, but I guess I can imagine there's some HW that works that way. – Peter Cordes Nov 07 '15 at 16:44
  • If the standard says it's undefined, then it *is*, but I'm trying to figure out *why* the standard would say that. I'm not one of those people trying to argue that stuff that works in practice is fine, regardless of the standard. – Peter Cordes Nov 07 '15 at 16:45
  • @PeterCordes: Evaluating a pointer as an rvalue may cause the compiler to generate code which attempts to use part of the pointer as something like a segment descriptor, and attempting such an operation on an invalid pointer could cause bad things to happen. In practice, compilers aren't apt to waste time loading segment descriptors for pointers that are never dereferenced and never used in pointer arithmetic, but the Committee wanted to leave open the possibility of compilers that would croak on invalid pointers, even though that precludes what would otherwise be helpful optimizations... – supercat Nov 10 '15 at 00:14

If a compiler correctly determines that code will inevitably fetch a pointer to an object which has been passed to "free" or "realloc", even if code will not make any use of the object identified thereby, the Standard will impose no requirements on what the compiler may or may not do after that point.

Thus, using a construct like:

char *thing = malloc(1000);
int new_size = getData(thing, ...whatever); // Returns needed size
char *new_thing = realloc(thing, new_size);
if (!new_thing)
  critical_error("Shrinking allocation failed!");
if (new_thing != thing)
  adjust_pointers(thing, new_thing);
thing = new_thing;

might on most implementations allow code to save the effort of recalculating some pointers in the event that using realloc to shrink an allocated block doesn't cause the block to move, but there would be nothing illegitimate about an implementation that unconditionally reported that the shrinking allocation failed since if it didn't fail code would inevitably attempt a comparison involving a pointer to a realloc'ed block. For that matter, it would also be just as legitimate (though less "efficient") for an implementation to keep the check whether realloc returned null, but allow arbitrary code to execute if it doesn't.

Personally, I see very little to be gained by preventing programmers from testing when certain steps can be skipped. Skipping unnecessary code if a pointer doesn't change may yield significant efficiency improvements in cases where realloc is used to shrink a memory block (such an action is allowed to move the block, but on most implementations it usually won't), but it is currently fashionable for compilers to apply their own aggressive optimizations which will break code that tries to use such techniques.
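
One commonly suggested way to keep the "did the block move?" test without comparing against the old pointer after `realloc` is to capture an integer copy of it beforehand. The following is only a sketch under stated assumptions: `resize_and_check_moved` is a hypothetical helper name, `uintptr_t` is an optional type (assumed to exist here), the pointer-to-integer mapping is implementation-defined, and `new_size` is assumed to be nonzero.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resize *block to new_size (assumed nonzero).
 * Returns  1 if the block moved,
 *          0 if it stayed in place,
 *         -1 if realloc failed (*block is left untouched and still valid). */
static int resize_and_check_moved(char **block, size_t new_size)
{
    /* Integer copy taken while the pointer is still valid. An integer
     * does not become indeterminate when the block is released. */
    uintptr_t old_bits = (uintptr_t)(void *)*block;

    char *resized = realloc(*block, new_size);
    if (resized == NULL)
        return -1;

    *block = resized;

    /* Compare integers only; the old pointer value is never read again. */
    return (uintptr_t)(void *)resized != old_bits;
}
```

A caller would run its pointer-adjustment step only when the helper returns 1. Equal integers are taken here to mean "same address", which is an assumption about the implementation's mapping rather than something the standard spells out.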

supercat

Continuing from the comments. I think the confusion over whether it is valid or invalid surrounds what aspect of the pointer is being asked about. Above, free(p); affects the starting address of the block of memory pointed to by p; it does not affect the address of p itself, which remains valid. p no longer holds a valid address (as its value), leaving it indeterminate until reassigned. A short example helps:

#include <stdio.h>
#include <stdlib.h>

int main (void) {

    int *p = NULL;

    printf ("\n the address of 'p' (&p) : %p\n", &p);

    p = malloc (sizeof *p);
    if (!p) return 1;

    *p = 3;

    printf (" the address of 'p' (&p) : %p   p points to %p   with value  %d\n",
            &p, p, *p);

    free (p);

    /* 'address of p' unchanged, p itself indeterminate until reassigned */
    printf (" the address of 'p' (&p) : %p\n\n", &p);

    p = NULL;  /* p no longer indeterminate and can be allocated again */

    return 0;
}

Output

$ ./bin/pointer_addr

 the address of 'p' (&p) : 0x7fff79e2e8a0
 the address of 'p' (&p) : 0x7fff79e2e8a0   p points to 0x12be010   with value  3
 the address of 'p' (&p) : 0x7fff79e2e8a0

The address of p itself is unchanged by either the malloc or the free. What is affected is the value of p (or more correctly, the address p stores as its value). Upon free, the memory at the address p stores is released to the system and can no longer be accessed through p. Once you explicitly reassign p = NULL;, p is no longer indeterminate and can be used for allocation again.
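
The "reassign `p = NULL;`" step can be folded into the release itself, so that no code path ever reads the pointer while it is indeterminate. This is a minimal sketch: `FREE_AND_NULL` and `demo_free_and_null` are hypothetical names introduced for illustration, and the macro evaluates its argument twice, so it is assumed to be used only with plain variables.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper macro: release the block and reset the pointer in
 * one step. The argument is evaluated twice, so pass a plain variable. */
#define FREE_AND_NULL(ptr) do { free(ptr); (ptr) = NULL; } while (0)

/* Hypothetical demo: returns -1 if malloc fails, otherwise 1 if p is a
 * null pointer after the release. */
static int demo_free_and_null(void)
{
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;

    *p = 3;
    printf("%d\n", *p);

    FREE_AND_NULL(p);

    /* p now holds a null pointer, a determinate value, so reading and
     * printing it is well defined (the printed text is implementation-defined). */
    printf("%p\n", (void *)p);

    return p == NULL;
}
```

This only protects the one variable passed to the macro; any other copies of the old pointer value are still indeterminate after the call.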

David C. Rankin
  • "it does not effect p itself" – mostly. But an implementation would be allowed to e.g. change it to `NULL`, or anything else, because its value becomes indeterminate. "It outputs the stuff I expected on my system" is **not** a proof that it's not undefined behavior. – The Paramagnetic Croissant Nov 07 '15 at 16:48
  • The behavior of the above program is undefined because the `%p` format requires an argument of type `void*`. – Destructor Nov 07 '15 at 16:50
  • Happening to work in practice on a modern 64bit CPU doesn't mean it works on *all* hardware ever. Obviously a "normal" cpu with essentially general-purpose registers won't have a problem. I think it's unlikely that the value would *change*. I'd bet that it either works as expected or faults. – Peter Cordes Nov 07 '15 at 16:50
  • I don't know if we are quibbling over the vernacular or what, but `&p` does not change and is not undefined above -- what am I missing? – David C. Rankin Nov 07 '15 at 16:52
  • @TheParamagneticCroissant there is no attempt to access an indeterminate value above. What are you contending the example is not a proof of? – David C. Rankin Nov 07 '15 at 16:54
  • @DavidC.Rankin "there is no attempt to access an indeterminate value above" – there is. The pointer itself becomes indeterminate after the call to `free()`, because the lifetime of the pointed object ends. The example is not a proof of your assertion, i.e. that using a `free()`'d pointer is not undefined behavior. – The Paramagnetic Croissant Nov 07 '15 at 16:57
  • I see what you are saying now. How is the rule affected by an intervening `p = NULL;` after `free` but before the 3rd `printf`? It seems that there is another part of a rule that is missing, because there is nothing that prevents the re-use of a pointer, and to take the literal interpretation cited by all above would preclude ever reusing a pointer within its scope of declaration after a call to `free`. – David C. Rankin Nov 07 '15 at 17:04
  • @DavidC.Rankin If you do `p = NULL;` immediately after `free(p)` and before printing it again, then you are not **reading** `p`, you are assigning it, and it no longer has an indeterminate value. – The Paramagnetic Croissant Nov 07 '15 at 17:08
  • Thank you. I knew the use, but did not consider, for purpose of the example, taking the address of `p` as referencing an indeterminate value (which it does not), but now see from the standpoint of the discussion, `p` (as opposed to the address of `p`) would be indeterminate at that point in time. (I suspect another cup of coffee is required...) – David C. Rankin Nov 07 '15 at 17:13