Invalid pointer becoming valid again

Question

int *p;
{
    int x = 0;
    p = &x;
}
// p is no longer valid
{
    int x = 0;
    if (&x == p) {
        *p = 2;  // Is this valid?
    }
}

Accessing a pointer after the thing it points to has been freed is undefined behavior, but what happens if some later allocation happens in the same area, and you explicitly compare the old pointer to a pointer to the new thing? Would it have mattered if I cast &x and p to uintptr_t before comparing them?

(I know it's not guaranteed that the two x variables occupy the same spot. I have no reason to do this, but I can imagine, say, an algorithm where you intersect a set of pointers that might have been freed with a set of definitely valid pointers, removing the invalid pointers in the process. If a previously-invalidated pointer is equal to a known good pointer, I'm curious what would happen.)

The moment you free a pointer, referring to it becomes undefined behaviour regardless of what the memory it points to is used for afterwards. These are the semantics. — sshine, Aug 22 '13 at 14:52
While I'm pretty sure the standard says this is undefined behaviour, I'm not exactly sure. Good question! — orlp, Aug 22 '13 at 14:53
I think it is valid if (**and only if**) you explicitly compare the old pointer as you wrote. But why would one do so instead of just reassigning the pointer? — Ingo Leonhardt, Aug 22 '13 at 14:54
This is valid, since you are checking that `p` points to a valid address; however, nothing guarantees that `p` will point to the second `x`. — Arnaud Le Blanc, Aug 22 '13 at 14:55
I think this is perfectly valid use - given you actually check if it is a location which has the same address as the local variable's. Nice question. — Manoj Awasthi, Aug 22 '13 at 14:57
Won't the compiler allocate both x's at the stack at the beginning of the function? — Daan Timmer, Aug 22 '13 at 14:58
@DaanTimmer the compiler is free to do so. However, since both `x` do not exist at the same time, the compiler can reuse the stack region allocated for the first one. — Arnaud Le Blanc, Aug 22 '13 at 15:00
@IngoLeonhardt: Added a possible reason to do this. (Not that it's my reason for asking the question, though; I'm just curious about corner cases like this.) — user2357112, Aug 22 '13 at 15:01
For the sake of making what I find interesting about this question (+1) more obvious, I think it can be rephrased as: if I can assert `p == &x`, is `p` always interchangeable with `&x`? A similar statement is clearly true for integers. Is it for pointers? — R. Martinho Fernandes, Aug 22 '13 at 15:02
usernumbers: It would certainly not matter whether you chose to cast both pointers to the same pointer type before the comparison. — Dan, Aug 22 '13 at 15:03
@Dan: `uintptr_t` isn't a pointer type; it's an unsigned integer. I'm wondering if the comparison semantics for pointers and integers are different in a way that would affect this. — user2357112, Aug 22 '13 at 15:05
R. Martinho Fernandes: Not if `p` and `&x` are different types, such as a `void*` and a `struct something*`, or if `p` is a (multidimensional?) array and `&x` is a pointer to the first element. — Dan, Aug 22 '13 at 15:06
@Dan If we rephrased the question we wouldn't rephrase it into a different context, would we? — R. Martinho Fernandes, Aug 22 '13 at 15:14
@R.MartinhoFernandes: It's not clearly true for integers. Accessing the value of an uninitialized integer object has undefined behavior. An integer type can even have trap representations (though in practice most implementations don't have trap representations for integer types). — Keith Thompson, Aug 22 '13 at 15:31
@KeithThompson: IMHO, the standard would be greatly improved if it were to have the *existence* of trap values for integers be *Implementation-Defined*, but have a read of an integer with a trap value be *Undefined Behavior*. In that case, reading of Indeterminate Values of integer types would be legitimate *on platforms which do not document the existence of trap values*. — supercat, Apr 13 '15 at 16:28
@supercat: For most implementations, you can already determine the absence of any trap representations by checking `INT_MIN`, `INT_MAX`, `CHAR_BIT`, and `sizeof (int)`. For example, on a typical 32-bit 2's-complement implementation, you can prove at compile time that all bit patterns are for distinct valid values. (If there are padding bits, there may or may not be trap representations, but most implementations have neither.) But accessing an indeterminate value of a type other than `unsigned char` is *still* undefined behavior (N1570 6.3.2.1p2, last sentence). — Keith Thompson, Apr 13 '15 at 17:45
@supercat: This is intentional; it enables some optimizations. If you argue that the language would be improved if reading an uninitialized object didn't have undefined behavior, I won't necessarily disagree -- but what exactly is the benefit of doing so? If your program reads an uninitialized variable, isn't it buggy anyway? — Keith Thompson, Apr 13 '15 at 17:46
@KeithThompson: A program whose correctness would depend upon the value read from an uninitialized variable would be buggy, but if e.g. a function's first argument determines whether or not the function will use its second, passing an uninitialized variable to the function in some of the cases where it is going to be ignored may be more efficient than passing a known value. I would not be opposed to a compiler author deciding that passing an uninitialized variable was sufficiently likely to be a bug that trapping such things would be more useful than letting programmers save an instruction... — supercat, Apr 13 '15 at 18:07
...storing an initial value that was always going to be ignored (the compiler may have no way of knowing whether the called method would care about the argument or not, and thus could not optimize out an assignment if the programmer included one). The whole reason C zero-initializes static-duration variables but not automatic ones is to avoid having to generate semantically-irrelevant initializations, so saying that a programmer's omission of a semantically-irrelevant initialization should give a compiler free rein to do anything it likes would go against the whole reason the rule exists. — supercat, Apr 13 '15 at 18:12
@KeithThompson: If I were in charge of the standard, I would specify that many things that are presently UB would be required to either trap or yield Indeterminate Value, with the choice being Implementation Defined; implementations could document the consequences of traps in whatever degree of difficulty they saw fit (including having them be UB) provided that their behavior is consistent with the documentation provided. Under such a rule, any presently-conforming implementation could remain in conformance by simply saying that anything which presently invokes UB will be... — supercat, Apr 13 '15 at 18:17
..."trapped", but that the behavior of such traps is completely Undefined (and may include seeming-normal program execution), but program behavior would be fully defined on platforms where its actions don't trap, or where the consequences of traps are fully defined. If a compiler had `__assume` and `_ext_assume` directives, I would think those could provide all the benefits of hyper-modern inference without scrambling the semantics of what had been working production code. — supercat, Apr 13 '15 at 18:22

score 12 · Accepted Answer · answered Aug 22 '13 at 15:14

By my understanding of the standard (6.2.4. (2))

The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

you have undefined behaviour when you compare

if (&x == p) {

as that meets these points listed in Annex J.2:

— The value of a pointer to an object whose lifetime has ended is used (6.2.4).
— The value of an object with automatic storage duration is used while it is indeterminate (6.2.4, 6.7.9, 6.8).
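
For comparison, the defined way to get the same effect is to take the address again inside the second block rather than comparing against the stale value. A minimal sketch, assuming a hypothetical helper name `write_through_fresh_pointer` (not from the question); it only ever stores into `p` after the first `x` is gone and never reads the old value:

```c
/* Sketch: same shape as the question's code, but p is re-assigned
   instead of being compared. Storing a new value into p is fine;
   only reading its stale value would be the problem. */
int write_through_fresh_pointer(void) {
    int *p;
    int result;
    {
        int x = 0;
        p = &x;          /* valid while the first x is alive */
    }
    /* p's value is indeterminate here and is not read */
    {
        int x = 0;
        p = &x;          /* fresh, valid pointer to the second x */
        *p = 2;          /* defined: p points to a live object */
        result = x;
    }
    return result;
}
```

Calling `write_through_fresh_pointer()` yields 2 regardless of whether the two `x` objects happen to share storage.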

answered Aug 22 '13 at 15:14

Daniel Fischer

Oooh good points. Seems this is another point where C and C++ have diverged. – R. Martinho Fernandes Aug 22 '13 at 15:15
Huh. You're not allowed to use the value of the pointer at all, rather than just not being allowed to dereference it? That's more restrictive than I would have expected. I wonder what the situation in C++ is; the C++ standard quote from the other answer doesn't say whether it's UB to use `p` in a comparison. – user2357112 Aug 22 '13 at 15:16
The standard is very restrictive. That gives a lot of freedom to implementations. I can't guarantee that my understanding of the standard is the same as the committee's however. – Daniel Fischer Aug 22 '13 at 15:21
Hmm, indeterminate is _"either an unspecified value or a trap representation"_. Unspecified is not a problem, only trap representation is. But `unsigned char` can not have a trap representation, so what if we replace `&x == p` with `int *q = &x; memcmp(&q, &p, sizeof(p)) == 0`? – orlp Aug 22 '13 at 15:21
You can use the pointer variable (to store a new value into it) because `p` remains defined throughout. After the first `x` has gone out of scope, you can't use the value stored in `p` legitimately. – Jonathan Leffler Aug 22 '13 at 15:22
Which C standard is this? I'd like to look at the sections you quoted, to get a bit more context. – user2357112 Aug 22 '13 at 15:26
@user2357112 It's the N1570 draft of the 2011 standard. Let me search for a link. [Here it is](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) – Daniel Fischer Aug 22 '13 at 15:28
@nightcracker In C90, using an indeterminate lvalue was explicitly listed as a source of undefined behavior. In C99 the committee complicated things but managed to write in the rationale “any use of an indeterminate value is undefined behavior”. – Pascal Cuoq Aug 22 '13 at 15:32
@PascalCuoq How is "using" defined? – orlp Aug 22 '13 at 15:36
@nightcracker Haha! It is never defined. I have had arguments with John Regehr about whether `f(p);` when `f()` is a function that does not use its argument was “using” `p`. – Pascal Cuoq Aug 22 '13 at 15:38
@DanielFischer: Thanks for the link. It looks like this answer is correct, assuming comparing a pointer counts as using it. – user2357112 Aug 22 '13 at 17:27
@user2357112 That is a safe assumption. Comparing two objects uses the values of these objects. The question is whether using an indeterminate value is always unconditionally undefined behaviour, or only if it might be a trap representation. – Daniel Fischer Aug 22 '13 at 17:38

score 5 · Answer 2 · edited Dec 12 '19 at 14:02

Okay, this seems to be interpreted as a two-part (make that three-part) question by some people.

First, there were concerns if using the pointer for a comparison is defined at all.

As is pointed out in the comments, the mere use of the pointer is UB, since Annex J.2 lists the use of a pointer to an object whose lifetime has ended as UB.

However, if that obstacle is set aside (being UB, it can still appear to work, and will on many platforms), here is what I found about the other concerns:

Given the pointers do compare equal, the code is valid:

C Standard, §6.5.3.2,4:

[...] If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.

Although a footnote at that location explicitly says that the address of an object after the end of its lifetime is an invalid pointer value, this does not apply here, since the if makes sure the pointer's value is the address of x and thus is valid.

C++ Standard, §3.9.2,3:

If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, **regardless of how the value was obtained**. [ Note: For instance, the address one past the end of an array (5.7) would be considered to point to an unrelated object of the array’s element type that might be located at that address.

Emphasis is mine.

edited Dec 12 '19 at 14:02

MSeifert

answered Aug 22 '13 at 15:03

Arne Mertz

That's C++, isn't it? – Daniel Fischer Aug 22 '13 at 15:07
@Arne I get the point you want to convey in the last paragraph, but *as written* it is not entirely true. There are compile-time related forms of UB. The IMO most insidious forms of UB are actually related to the compilation/linking bits, namely ODR violations (because they often are "no diagnostic required"). I'm having trouble finding a way to rephrase your text into a pedantically correct form, though :( – R. Martinho Fernandes Aug 22 '13 at 15:11
If you can find similar language in the C standard, I'll accept this. In the meantime, it's interesting to know the C++ standard has language addressing this. – user2357112 Aug 22 '13 at 15:12
@R.MartinhoFernandes ODR is part of "well formed" (§1.3.26) and UB is only mentioned in §3.2,5 for "different" template definitions. – Arne Mertz Aug 22 '13 at 15:17
I think that the quote you cite does not imply that the program is defined in C++. If using `p` in `&x == p` is in itself undefined behavior in C++, it does not matter that after the comparison, `p` points to `x` according to your quote. Undefined behavior has already happened. – Pascal Cuoq Aug 22 '13 at 15:19
The trouble is that once the object has ceased to exist, any pointers to it are invalid. If the variable is live, then it doesn't matter how the pointer was obtained. Once the variable has been deallocated (as the first `x` has by the time the second block is executed), the pointer in `p` is no longer valid — period. – Jonathan Leffler Aug 22 '13 at 15:20
@Arne Not all the time. The definition of "well formed" explicitly includes only the diagnosable rules. There are many ODR-related rules that are "no diagnostic required" (I can go look them up and provide references if you want). (Also, probably better to not pollute this comment thread with this offside remark; chat?) – R. Martinho Fernandes Aug 22 '13 at 15:24
This really is only for C++; C has no such wording, and as far as I remember not even the concept of addresses. See my answer for the C concept of pointer comparison. – Jens Gustedt Aug 22 '13 at 15:25
@JensGustedt: C certainly does have the concept of addresses; see the definition of unary `&`. – Keith Thompson Aug 22 '13 at 15:27
@KeithThompson But it doesn't have the phrase **regardless of how the value was obtained**. A pointer value in C is pointing to an object. The two `x` are two different objects in the sense of the C standard, see the slightly modified example towards the end of my answer. So the comparison must always be false, even if we assume that the behavior would be defined. – Jens Gustedt Aug 22 '13 at 15:39
@JensGustedt: Agreed, mostly. But your "even if we assume that the behavior would be defined" doesn't make much sense; the behavior *isn't* defined, and in practice the comparison may be either true or false (or, in principle, a suffusion of yellow). – Keith Thompson Aug 22 '13 at 15:49
I think the standard quote you used to justify the validity of the comparison is only specifying the restrictions on a well-formed comparison, not one that's okay to execute. I'm pretty sure trap values like -0 or Not a Thing won't give a nice 0 or 1 comparison result. – user2357112 Aug 22 '13 at 15:49
@JensGustedt a pointer value is not necessarily pointing to an object. Think of uninitialized pointers and pointers after the end of an array - both do not point to an actual object. – Arne Mertz Aug 22 '13 at 15:50
"This explicitly includes otherwise invalid pointers (the ones past the end of an array)" those are explicitly valid values of pointers. These are not valid operands of the indirection operator, but for comparisons (`==` and `!=` always, `<` and `>` if the other operand points into or one past the same array). However, here an indeterminate value is used, that's an entirely different kettle of fish. – Daniel Fischer Aug 22 '13 at 16:26
Your reasoning is completely flawed. Because one clause in the standard does not say that a particular instance of `e1 == e2` is undefined, you conclude that it is defined. Sorry, this is not how the C standard works. By the same reasoning, `x == y` should always be defined when `x` and `y` have type `int`, because of `both operands have arithmetic type`. Well, `x == y` is undefined if `x` or `y` is an uninitialized automatic variable. – Pascal Cuoq Aug 22 '13 at 20:19
C99 J.2 **Undefined behavior** The value of a pointer to an object whose lifetime has ended is used (6.2.4). – Pascal Cuoq Aug 22 '13 at 20:22
“Regardless how an invalid pointer is created, any use of it yields undefined behavior. Even assignment, comparison with a null pointer constant, or comparison with itself, might on some systems result in an exception.” http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf – Pascal Cuoq Aug 22 '13 at 20:40

score 1 · Answer 3 · answered Aug 22 '13 at 15:15

It will probably work with most of the compilers but it still is undefined behavior. For the C language these x are two different objects, one has ended its lifetime, so you have UB.

More seriously, some compilers may decide to fool you in a different way than you expect.

The C standard says

Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.

Note in particular the phrase "both are pointers to the same object". In the sense of the standard the two "x"s are not the same object. They may happen to be realized in the same memory location, but this is to the discretion of the compiler. Since they are clearly two distinct objects, declared in different scopes the comparison should in fact never be true. So an optimizer might well cut away that branch completely.
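
To make the quoted rule concrete, here is a small sketch of comparisons that are defined because every pointer involved refers to a live object (or one past the end of a live array). The type `struct pair` and the function names are made up for illustration and do not appear in the question:

```c
struct pair { int first; int second; };  /* hypothetical type */

/* A pointer to an object and a pointer to the subobject at its
   beginning compare equal. */
int object_and_first_member_equal(void) {
    struct pair s = {1, 2};
    return (void *)&s == (void *)&s.first;
}

/* Two one-past-the-end pointers of the same array compare equal. */
int one_past_end_equal(void) {
    int arr[4] = {0};
    int *end_a = arr + 4;
    int *end_b = &arr[4];  /* address only, never dereferenced */
    return end_a == end_b;
}

/* Pointers to two distinct objects that are both alive compare unequal. */
int distinct_live_objects_differ(void) {
    int a = 0, b = 0;
    return &a != &b;
}
```

Each function returns 1. None of them says anything about a pointer whose object has already ended its lifetime; that case falls outside the rule because the operand itself may not be used.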

Another aspect that has not been discussed yet is that the validity of all this depends on the "lifetime" of the objects, not on their scope. If you add a possible jump into that scope

{
    int x = 0;
    p = &x;
  BLURB: ;
}
...
if (...)
...
if (something) goto BLURB;

the lifetime would extend as long as the scope of the first x is reachable. Then everything is valid behavior, but still your test would always be false, and optimized out by a decent compiler.

From all this you can see that it is better to leave it at "this is UB" and not play such games in real code.

answered Aug 22 '13 at 15:15

Jens Gustedt

"So an optimizer might well cut away that branch completely." So, it's not undefined behaviour. If the branch cannot ever be taken, a compiler that makes it so is buggy. If the branch *is ever taken* and that isn't a bug, they must point to the same object and not to distinct ones. – R. Martinho Fernandes Aug 22 '13 at 15:17
No, they might point to the same memory location, that doesn't make the two `x` the same object. Note that comparison of pointers is about objects, not about memory addresses. And as I said, it is UB anyhow, so a compiler may do what pleases. – Jens Gustedt Aug 22 '13 at 15:18
That contradicts the part you quoted. "Two pointers compare equal **if and only if** (...) both are pointers to the same object (...)". Either they don't compare equal, or they point to the same object (I hope I can discard the other possibilities listed for this argument). – R. Martinho Fernandes Aug 22 '13 at 15:20
The only other possibility is that *the check itself* does not have defined behaviour, something which another answer mentions, but this one doesn't. If the check's behaviour is defined, the standard seems quite clear on what that behaviour is. – R. Martinho Fernandes Aug 22 '13 at 15:21
You can't jump into a block past variables with initializers, so jumping to BLURB is not allowed (either strictly disallowed or invokes UB because it bypasses the initializers at the start of the block). – Jonathan Leffler Aug 22 '13 at 15:26
@JonathanLeffler, sure you can, once you have met the initializer, you can jump back into the block without problems. `longjmp` does that all the time. The only case where that is forbidden is when the scope contains a VLA. – Jens Gustedt Aug 22 '13 at 15:30
@R.MartinhoFernandes, you still mix up the two concepts "object" and the address in memory where an object is located. These are not the same. In C, an object is given by a definition or an allocation (AKA `malloc`). Here there are two definitions in different scopes, so for C there are two objects. – Jens Gustedt Aug 22 '13 at 15:43
@JensGustedt then, unless the check is undefined (which you don't mention), the pointers *must not compare equal* (and with that all the code that runs has defined behaviour). – R. Martinho Fernandes Aug 22 '13 at 15:44
ISO/IEC 9899:2011 §6.8 Statements and blocks, ¶3 A block allows a set of declarations and statements to be grouped into one syntactic unit. The initializers of objects that have automatic storage duration, and the variable length array declarators of ordinary identifiers with block scope, are evaluated and the values are stored in the objects (including storing an indeterminate value in objects without an initializer) each time the declaration is reached in the order of execution, as if it were a statement, and within each declaration in the order that declarators appear. [...continued...] – Jonathan Leffler Aug 22 '13 at 16:07
[...continuation...] Since your jump to BLURB misses the declaration 'in the order of execution', the behaviour is not defined by the standard. You're right though; the constraint on `goto` is 'A goto statement shall not jump from outside the scope of an identifier having a variably modified type to inside the scope of that identifier.' Your code is not violating that constraint. – Jonathan Leffler Aug 22 '13 at 16:10
@JonathanLeffler, the text that you are citing only imposes that the declaration must be reached once, before the object is used. It doesn't inhibit any later jump into the block. The constraint on the `goto` that you are citing is the only one. BTW gcc and clang happily accept it. – Jens Gustedt Aug 22 '13 at 16:27
The compilers are under no obligation to diagnose a problem. I don't agree, but I don't wish to continue the discussion. – Jonathan Leffler Aug 22 '13 at 16:31
@JonathanLeffler, right this becomes fruitless. And I don't claim that this is good coding style ... – Jens Gustedt Aug 22 '13 at 16:33

score 0 · Answer 4 · answered Aug 22 '13 at 14:55

It would work, if by work you use a very liberal definition, roughly equivalent to that it would not crash.

However, it is a bad idea. I cannot imagine a single reason why it is easier to cross your fingers and hope that the two local variables are stored in the same memory address than it is to write p=&x again. If this is just an academic question, then yes it's valid C - but whether the if statement is true or not is not guaranteed to be consistent across platforms or even different programs.

Edit: To be clear, the undefined behavior is whether &x == p in the second block. The value of p will not change, it's still a pointer to that address, that address just doesn't belong to you anymore. Now the compiler might (probably will) put the second x at that same address (assuming there isn't any other intervening code). If that happens to be true, it's perfectly legal to dereference p just as you would &x, as long as its type is a pointer to an int or something smaller. Just like it's legal to say p = 0x00000042; if (p == &x) {*p = whatever;}.
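
For contrast with the integer-to-pointer remark above, the one round trip the C standard does promise (on implementations that provide the optional `uintptr_t`) is converting a valid `void *` to `uintptr_t` and back while the object is still alive. A hedged sketch with a made-up function name, `round_trip_write`:

```c
#include <stdint.h>

/* Sketch: the integer is produced from a pointer to a live object and
   converted back while that object is still alive. */
int round_trip_write(void) {
    int x = 0;
    uintptr_t bits = (uintptr_t)(void *)&x;  /* pointer -> integer */
    int *q = (void *)bits;                   /* integer -> void * -> int * */
    if (q == &x) {                           /* both operands are valid pointers */
        *q = 2;
    }
    return x;
}
```

Here the function returns 2. This says nothing about an integer literal such as `0x00000042`, or about an integer derived from a pointer whose object has since ended its lifetime; the result of those conversions is implementation-defined at best.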

answered Aug 22 '13 at 14:55

Dan

Please, let's not get definitions confused. In questions like this it's very wise to reserve undefined behaviour as defined by the standard, rather than "the output may vary". – orlp Aug 22 '13 at 15:00
I haven't read the standard cover to cover because I have so many better things to do. I would expect that the standard does not define whether the two local variables here will have the same memory address. Therefore, that behavior is not defined. – Dan Aug 22 '13 at 15:02
Whether the two variables will have the same address is not the interesting part of this question. The interesting part is whether it's defined behaviour to write through the original pointer when they __do__. – orlp Aug 22 '13 at 15:06
@Dan the standard does define that they have the same memory address in the body of that if. That much should be obvious by the fact that the condition evaluated to true (and evaluating the condition does not involve undefined behaviour). – R. Martinho Fernandes Aug 22 '13 at 15:08
@R.MartinhoFernandes: But if evaluating the condition *does* involve undefined behavior, then all bets are off. The value of `p` is indeterminate after the end of the lifetime of the first `x`. – Keith Thompson Aug 22 '13 at 15:29
@Keith I agree. I prefer to see a standard quote on it, though (no worries, there's one in another answer) – R. Martinho Fernandes Aug 22 '13 at 15:31

score 0 · Answer 5 · answered Aug 22 '13 at 15:10

The behaviour is undefined. However, your question reminds me of another case where a somewhat similar concept was employed. In that case, there were threads that would get different amounts of CPU time because of their priorities. So thread 1 would get a little more time because thread 2 was waiting for I/O or something. Once its job was done, thread 1 would write values to memory for thread 2 to consume. This was not "sharing" the memory in a controlled way: it would write to the call stack itself, where the variables of thread 2 would be allocated. When thread 2 eventually got round to executing, its declared variables never had to be assigned values, because the locations they occupied already held valid values. I don't know what they did if something went wrong in the process, but this is one of the most hellish optimizations in C code I have ever witnessed.

score 0 · Answer 6 · edited Aug 23 '13 at 10:04

Setting aside whether it is valid (and I'm now convinced that it's not, see Arne Mertz's answer), I still think the question is academic.

The algorithm you are thinking of would not produce very useful results, as you could only compare two pointers, but you have no chance to determine if these pointers point to the same kind of object or to something completely different. A pointer to a struct could now be the address of a single char for example.
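
If the underlying goal is to tell stale references from live ones, one common alternative is to hand out small handles instead of raw pointers, so the staleness check never touches a dangling pointer value. The sketch below is purely illustrative: `pool`, `handle`, `POOL_SLOTS`, and all function names are hypothetical and not something the question or the standard prescribes.

```c
#include <stddef.h>

#define POOL_SLOTS 4  /* hypothetical capacity */

typedef struct { size_t index; unsigned generation; } handle;

typedef struct {
    int value[POOL_SLOTS];
    unsigned generation[POOL_SLOTS];  /* bumped on every release */
    int in_use[POOL_SLOTS];
} pool;

/* Claims the first free slot; returns 1 and fills *out, or 0 if full. */
static int pool_acquire(pool *pl, int initial, handle *out) {
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!pl->in_use[i]) {
            pl->in_use[i] = 1;
            pl->value[i] = initial;
            out->index = i;
            out->generation = pl->generation[i];
            return 1;
        }
    }
    return 0;
}

/* Returns a pointer to the live value, or NULL if the handle is stale. */
static int *pool_lookup(pool *pl, handle h) {
    if (h.index < POOL_SLOTS && pl->in_use[h.index] &&
        pl->generation[h.index] == h.generation) {
        return &pl->value[h.index];
    }
    return NULL;
}

/* Frees the slot and invalidates every handle issued for it so far. */
static void pool_release(pool *pl, handle h) {
    if (pool_lookup(pl, h) != NULL) {
        pl->in_use[h.index] = 0;
        pl->generation[h.index]++;
    }
}

/* Demo: the old handle and the new one name the same slot, yet only
   the new one resolves. */
int stale_handle_demo(void) {
    pool pl = {0};
    handle old, fresh;
    if (!pool_acquire(&pl, 1, &old)) return -1;
    pool_release(&pl, old);
    if (!pool_acquire(&pl, 5, &fresh)) return -1;  /* reuses slot 0 */
    int stale_rejected = (pool_lookup(&pl, old) == NULL);
    int *live = pool_lookup(&pl, fresh);
    if (live == NULL) return -1;
    *live = 2;
    return stale_rejected ? *live : -1;
}
```

`stale_handle_demo()` returns 2: the stale handle is rejected even though it refers to the same storage as the fresh one, which is exactly the distinction a raw pointer comparison cannot make.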

edited Aug 23 '13 at 10:04

answered Aug 22 '13 at 15:14

Ingo Leonhardt

“I think it is if and only if you explicitely compare” See C99 J.2 **Undefined behavior** The value of a pointer to an object whose lifetime has ended is used (6.2.4). – Pascal Cuoq Aug 22 '13 at 20:24
@Pascal Cuoq I saw that the discussion continued after I left the site yesterday. I'm convinced now. – Ingo Leonhardt Aug 23 '13 at 10:05

score 0 · Answer 7 · answered Aug 22 '13 at 15:15

Winner #2 in this undefined behavior contest is rather similar to your code:

#include <stdio.h>
#include <stdlib.h>

int main() {
  int *p = (int*)malloc(sizeof(int));
  int *q = (int*)realloc(p, sizeof(int));
  *p = 1;
  *q = 2;
  if (p == q)
    printf("%d %d\n", *p, *q);
}

According to the post:

Using a recent version of Clang (r160635 for x86-64 on Linux):

$ clang -O realloc.c ; ./a.out

1 2

This can only be explained if the Clang developers consider that this example, and yours, exhibit undefined behavior.
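
For reference, a version of the same allocation pattern that stays within defined behavior only ever uses the pointer returned by `realloc`. The sketch below uses a hypothetical function `grow_and_sum`; the `uintptr_t` snapshot is an assumption about how one might record "did the block move" without reading `p` afterwards, and what that integer comparison means for the pointers is implementation-defined:

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns q[0] + q[1] on success, -1 on allocation failure.
   If moved is non-NULL, *moved reports whether the integer form of
   the address changed across realloc. */
int grow_and_sum(int *moved) {
    int *p = malloc(sizeof *p);
    if (p == NULL) return -1;
    *p = 1;

    /* Snapshot taken while p is still valid. */
    uintptr_t before = (uintptr_t)(void *)p;

    int *q = realloc(p, 2 * sizeof *q);
    if (q == NULL) {
        free(p);  /* on failure the original block is untouched */
        return -1;
    }
    /* From here on only q is used; the value of p is never read again. */
    if (moved != NULL) {
        *moved = ((uintptr_t)(void *)q != before);
    }
    q[1] = 2;
    int sum = q[0] + q[1];
    free(q);
    return sum;
}
```

On success the function returns 3, because `realloc` preserves the old contents and every access goes through `q`.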

answered Aug 22 '13 at 15:15

Pascal Cuoq

`*p = 1` already is UB, so I think the rest isn't really a proof although the code is cool! – Ingo Leonhardt Aug 22 '13 at 15:18
@IngoLeonhardt Why is `*p = 1` UB? I fail to see it o.0 – orlp Aug 22 '13 at 15:27
@IngoLeonhardt Yes, I realize now it is not exactly the same example. But note that what makes `*p = 1;` undefined in **this** example is `p`, not `*`: it is using an indeterminate lvalue, which was clearly undefined behavior in C90, is treated as undefined behavior by compiler makers in C99 (http://blog.frama-c.com/index.php?post/2013/03/13/indeterminate-undefined ) and, just discovered, is explicitly intended in the C99 rationale as UB (http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf ) although the standard itself is not unambiguous. – Pascal Cuoq Aug 22 '13 at 15:28
It hurts me. Once you reallocate p, you could not use it. Luckily it doesn't crash ... – Yann Droneaud Aug 22 '13 at 15:28
@nightcracker if `realloc()` returns a different pointer, you must not use the original one anymore – Ingo Leonhardt Aug 22 '13 at 15:32
@IngoLeonhardt Oh I entirely ignored the second line because my brain saw `int *q = SOMETHING_WITH_ALLOC`, my bad - I see it now. – orlp Aug 22 '13 at 15:33
FWIW: http://coliru.stacked-crooked.com/view?id=60e2f05e9ac19f191b4720a9c21cb9c6-7a28def917c3bd7a7eff406ea249874e – R. Martinho Fernandes Aug 22 '13 at 15:47
What if you did if (&*p == &*q) ? – UpAndAdam Aug 22 '13 at 18:35
Would the above be UB, or detected as UB, if `*p=1;` was moved before the realloc? If so, would there be any way by which a program which had fifty copies of a pointer scattered throughout memory to perform a realloc and skip the reassignment of the fifty copies in the event the pointer value didn't change? – supercat Apr 10 '15 at 18:54
@supercat I am not sure what you are asking, but perhaps you can find information in this post, including the title, “A dangling pointer is indeterminate”. http://trust-in-soft.com/dangling-pointer-indeterminate/ – Pascal Cuoq Apr 10 '15 at 20:40
@PascalCuoq: In the code as written, if the value of the pointer changes that would imply that the write to `*p` would be hitting random unowned memory, which may cause arbitrary behavior on nearly all platforms. I don't think there is any platform where that case wouldn't invoke UB. By contrast, if the assignment to `*p` was before the realloc, then on many platforms the behavior in that case would be perfectly well-defined, with the comparison yielding false. The C standard does not require platforms to define behavior in such cases, but does not forbid them from doing so. – supercat Apr 13 '15 at 14:12
@PascalCuoq: One thing that really makes me dislike "modern" C as a language is that it will take production code which was perfectly legitimate *on platforms for which it was written*, and would run just fine *on 99% of platforms which didn't try to break it*, and doesn't merely cause it to fail with clearly-identifiable traps (which I would be cool with), but instead makes deliberate bizarro-land changes to the semantics – supercat Apr 13 '15 at 14:16
@supercat The value of the pointer changes in that it becomes indeterminate. I am not talking about any actual bit pattern, I am saying that compilers apply optimizations that assume that a pointer is not read (read, not even dereferenced) after the object it points to has been freed or otherwise ended its lifetime. – Pascal Cuoq Apr 13 '15 at 14:17
@supercat I can only offer my commiserations with respect to modern C. I also deplore the situation you describe, but I can suggest a tool that has the power to tell you if a piece of C code you are interested in exhibits undefined behavior (and thus puts you at the mercy of modern C compilers). – Pascal Cuoq Apr 13 '15 at 14:19
@PascalCuoq: Making that determination in general would be The Halting Problem. I wonder if the authors of the standard foresaw this sort of thing happening? IMHO, the standard could have benefited from adding quite a few network-protocol-style SHOULDs, and also by making clear certain kinds of inferences which compilers SHOULD NOT or MUST NOT make. For example, if a loop `do{ ... } while(fnorble != 1);` has no side-effects but is followed by a statement which has side-effects but does not depend upon any values computed in the loop, I would have said that a compiler is entitled... – supercat Apr 13 '15 at 15:13
...to execute the latter statement whether or not the loop terminates, but should not be allowed to presume that `fnorble` will ever equal one. I suspect what happened is that when the authors of the standard made the effects of Undefined Behavior retroactive, their intention was to simply say that Undefined Behavior is not a sequence point; I don't think they intended to allow inferences to be drawn beyond that. Given `if (shift >= 32) val=0; val>>=shift;` I would expect that if shift==32, a few platforms might trap, but the "natural" behavior of any platform that didn't trap... – supercat Apr 13 '15 at 15:23
...would be to yield `val=0`--in some cases faster than it would have with an `if/else` and in some cases slower. Having a compiler generate code that traps would be irksome, but it would at least make it possible to detect that the code was doing something the compiler didn't like. Do you know if there's any commonly-recognized terminology to say that a program is expecting sane behavior from such constructs, or to describe compiler modes that would generate sane behavior? – supercat Apr 13 '15 at 15:33
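The `shift` example can be made fully defined in standard C by keeping the out-of-range count away from the shift operator altogether. A minimal sketch, assuming a 32-bit unsigned operand; the name `shr32` is made up for illustration:

```c
#include <stdint.h>

/* Logical right shift that is defined for every shift count.
   Shifting a 32-bit value by 32 or more is undefined in standard C,
   so that case returns early instead of falling through to the shift. */
uint32_t shr32(uint32_t val, unsigned shift)
{
    if (shift >= 32)
        return 0;
    return val >> shift;
}
```

With the early return, the shift operator only ever sees counts from 0 to 31, so a compiler has nothing to infer from an "impossible" count.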
@supercat The intention of the “Friendly C” manifesto was that a compiler would be able to say it was Friendly-C-compliant. http://blog.regehr.org/archives/1180 – Pascal Cuoq Apr 13 '15 at 15:38
@PascalCuoq: That sounds like what I was hoping for, though I would probably want to *also* recognize a lesser level of friendliness where things like Integer Overflow would yield wholly- or partially-Indeterminate (not merely Unspecified) Value, but most operations on Indeterminate Values would be specified as yielding Indeterminate Values. Thus, for example, given `int16_t i=32767; int32_t L1,L2;` a sequence like `i++; L1=i; L2=i;` would not be required to assign the same values to `L1` and `L2`, but the bottom 16 bits of both values would be required to match. – supercat Apr 13 '15 at 16:05
@PascalCuoq: Requiring that something like an ARM store the same value into `L1` and `L2` would sometimes (especially if there was other code between the two assignments) require the compiler to generate extra code; having `i` be a partially-Indeterminate Value would allow code which didn't care about the upper bits to avoid having to deal with them, but would not allow the compiler to make inferences based upon the "impossibility" of overflow occurring. – supercat Apr 13 '15 at 16:13
@PascalCuoq: Any idea where I could best share my observations about that post and related issues? Perhaps chat? – supercat Apr 13 '15 at 16:17
@supercat First, it's only because of “comment spam” that John has had to close comments on posts after a period of time. I apologize for him, but without this counter-measure, the amount of spam grows quadratically as the blog grows. Second, anywhere you find convenient but it would be better if it was easy to share with non-StackOverflow users, as John and Matthew aren't. – Pascal Cuoq Apr 13 '15 at 16:37
@PascalCuoq: Should we open up a chat here? I just read the 2007 paper regarding "critical" vs "non-critical" UB; I think it makes important points, but would suggest that a better distinction would be to say that many operations may trap or yield Indeterminate Value. Implementations must define every action that might trap; they are not required to specify the consequences of traps (or any way of predicting whether traps will do anything), but any documentation they provide regarding such behavior must be correct. Still sad that someone noticed problems in 2007 and things have gotten worse. – supercat Apr 13 '15 at 16:47
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/75136/discussion-between-pascal-cuoq-and-supercat). – Pascal Cuoq Apr 13 '15 at 16:57
