7

I might be mistaken, but I seem to remember that for a given memory allocation, e.g.

char *p = malloc(4);

the pointer p is a valid pointer for all bytes within the allocation and for the first byte beyond that allocation.

Thus, to access memory through the pointer p, only offsets p[0] .. p[3] are valid. But for pointer comparison, &( p[4] ) would also be a valid pointer.

Is that correct, and where in the C Standard (link) does it say so? It seems that 6.5.9 p6 might point in the right direction, but it's still a bit fuzzy.

Jens
  • This is a bit contentious. I've seen it discussed before on c.l.c. Of course `p+4` is fine, but some were of the opinion that the expression `p[4]` causes undefined behaviour, even though you never access its value when you write `&p[4]`. The Standard seemed to forbid `&p[4]` on an extremely pedantic reading. – M.M Mar 12 '14 at 06:33
  • @MattMcNabb I find the idea that `&p[4]` should be considered UB more interesting than is good for me, and I would be very interested in a link to the c.l.c discussion or an approximative date in order to look for it myself. – Pascal Cuoq Mar 12 '14 at 06:39
  • Note that `p` should be made, say, a `char*` for the discussion to make sense. Pointer arithmetic is forbidden for pointers to `void`. – Pascal Cuoq Mar 12 '14 at 06:42
  • @PascalCuoq: Agreed, changed. – Jens Mar 12 '14 at 06:42

2 Answers

5

&p[4], or p + 4, is a valid pointer, but it can't be dereferenced.

C11 6.5.6 Additive operators

[...] If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
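As a rough illustration of the quoted rule, the sketch below forms the one-past-the-end pointer `p + 4` and uses it only as a loop sentinel, never as the operand of `*`. The function names `sum_bytes` and `demo_sum` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Sum the bytes in the half-open range [first, last).
   `last` may be a one-past-the-end pointer: it is only compared
   against, never dereferenced. */
unsigned sum_bytes(const unsigned char *first, const unsigned char *last)
{
    unsigned total = 0;
    for (const unsigned char *it = first; it != last; ++it)
        total += *it;            /* `it` is always a real element here */
    return total;
}

/* Hypothetical demo: allocate 4 bytes, store 1..4, and sum them. */
unsigned demo_sum(void)
{
    unsigned char *p = malloc(4);
    assert(p != NULL);           /* keep the sketch simple on failure */

    for (size_t i = 0; i < 4; ++i)
        p[i] = (unsigned char)(i + 1);   /* p[0] .. p[3] are accessible */

    unsigned char *end = p + 4;  /* valid to form and compare, not to read */
    unsigned total = sum_bytes(p, end);

    free(p);
    return total;
}
```

Writing `*end` or `p[4] = 0` in `demo_sum` would be the dereference that the last sentence of the quoted paragraph forbids.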

Yu Hao
  • Thanks! I think [6.5.9 p6](http://port70.net/~nsz/c/c11/n1570.html#6.5.9) answers my question as well. – Jens Mar 12 '14 at 06:34
3

This answer assumes that p is a char *.

But for pointer comparison, &( p[4] ) would also be a valid pointer.

The pointer p + 4 (or &( p[4] )) is valid for comparison to p + N, for N in {0, 1, 2, 3, 4}, with <, <=, or ==. The relational operators are covered by C11 6.5.8:5:

When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object types both point to the same object, or both point one past the last element of the same array object, they compare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. All pointers to members of the same union object compare equal. If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined.
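A minimal sketch of the orderings this paragraph guarantees, using a local 4-element array rather than a `malloc`ed block to keep it short. The function name `ordering_holds` is hypothetical. Every pointer involved points into the same array or one past its end, so all comparisons are defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Checks the orderings that hold among p + 0 .. p + 4 for a
   4-element array. Only addresses are formed; nothing is read. */
bool ordering_holds(void)
{
    char buf[4];
    char *p = buf;
    char *end = p + 4;                    /* one past the last element */
    bool ok = true;

    for (size_t n = 0; n < 4; ++n) {
        ok = ok && (p + n < end);         /* each element is below end */
        ok = ok && (p + n <= p + n + 1);  /* larger subscript compares greater */
    }

    ok = ok && (end > p + 3);             /* "Q+1 compares greater than P" */
    ok = ok && (end == p + 4);            /* same one-past pointer: equal */
    return ok;
}
```

Note that `p + n + 1` reaches at most `p + 4` when `n` is 3, so the loop never forms a pointer beyond one past the end.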

However, comparing p + 4 with == to, say, &X, where X is another variable, does not give a reliable result: to the best of my C-standard deciphering, the result is unspecified. (And of course no p + N is valid for comparison with <= to &X.)

Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.109)

109) Two objects may be adjacent in memory because they are adjacent elements of a larger array or adjacent members of a structure with no padding between them, or because the implementation chose to place them so, even though they are unrelated. If prior invalid pointer operations (such as accesses outside array bounds) produced undefined behavior, subsequent comparisons also produce undefined behavior.

(C11 6.5.9:6)

Strictly speaking, the standard does not seem to say anywhere that p + 4 == NULL is defined either (EDIT: as rici pointed out, the only allowance for p + 4 to be equal to q is if q is “the start of a different array object that happens to immediately follow…”. Since NULL is not the address of any object, it follows that p + 4 == NULL is false).
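The two equality cases discussed above can be sketched as follows, assuming the reading of 6.5.9:6 given here. The function names are hypothetical. The first comparison should always yield 0; the second yields 0 or 1 depending on where the implementation happens to place the unrelated object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Returns the value of (p + 4 == NULL) for a fresh 4-byte allocation.
   Under the reasoning above this is 0: NULL is not the address of any
   object, so it cannot be "the start of a different array object". */
int one_past_equals_null(void)
{
    char *p = malloc(4);
    if (p == NULL)
        return 0;                 /* allocation failed; nothing to compare */
    int result = (p + 4 == NULL);
    free(p);
    return result;
}

/* Returns the value of (p + 4 == &x) for an unrelated object x.
   The result is 0 or 1, but which one is not under the programmer's
   control: it depends on whether x happens to follow the allocation. */
int one_past_equals_unrelated(void)
{
    char x = 0;
    char *p = malloc(4);
    if (p == NULL)
        return 0;
    int result = (p + 4 == &x);
    free(p);
    return result;
}
```

Replacing `==` with `<=` in the second function would move it from an unspecified result to undefined behavior, per 6.5.8:5.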

This blog post looks at this and other pointer comparisons in C.

Pascal Cuoq
  • Hmm, `p+4` might produce wrap-around, and be 0, with some architecture. So if it is not defined in standard to behave certain way, comparing to `NULL` seems undefined. – hyde Mar 12 '14 at 06:44
  • @n.m. … but in some cases the result is unspecified, or how do you interpret the lengthy sentence that continues into footnote 109? – Pascal Cuoq Mar 12 '14 at 06:46
  • @hyde Note that if `p + 4` can be `0`, then the compiler should not generate an unsigned comparison instruction for `int f(void *p, void *q) { return p < q; }`, because `f` would return the wrong result for `p` and `p + 4`. The standard does not say anywhere that `p + 4 == NULL` is defined, but the constraint that `p + 4 > p + 3` means, for most usual architectures, that the compiler does not have any more freedom than if it had. A related question is http://stackoverflow.com/questions/7058176/on-a-platform-where-null-is-represented-as-0-has-a-compiler-ever-generated-unex – Pascal Cuoq Mar 12 '14 at 06:50
  • @n.m "When two pointers are compared"..."In all other cases, the behavior is undefined." but that may be problem in the snippet, does that last sentence refer to entire paragraph or just previous sentence? – hyde Mar 12 '14 at 06:50
  • @PascalCuoq yeah, 0 seems unlikely, bad example, but `NULL` doesn't have to be 0 in hardware. – hyde Mar 12 '14 at 06:52
  • @hyde The clause that contains the words “when two pointers are compared” refers to `<` and `<=` comparison. The clause that defines `p == q`, on the other hand, does not use the words “undefined behavior” at all. See also this related answer (but you'll have to admit it is an answer to the question asked): http://stackoverflow.com/a/4023563/139746 – Pascal Cuoq Mar 12 '14 at 07:26
  • "Two pointers compare equal if and only if..." --- these are the cases when they compare equal, in all other cases they compare not equal, unspecified or undefined is not mentioned anywhere. – n. m. could be an AI Mar 12 '14 at 08:04
  • @n.m. This is what I hate about the C standard: it defines a notion of “unspecified behavior” and when a perfect opportunity arises to use the words, it doesn't. Nevertheless, what do you call a C construct that is not allowed to make demons fly out of your nose, but is allowed to return different values according to compilation choices outside the programmer's control? You call it “unspecified behavior”. That's how Derek Jones calls it, and he is a member of the C standard committee (http://lists.cs.uiuc.edu/pipermail/c-semantics/2011-June.txt ). – Pascal Cuoq Mar 12 '14 at 08:13
  • @n.m. If you think it should not be called that, you could find a member of the C standard committee that, in writing, calls it “defined” or denies that it should be called “unspecified”. – Pascal Cuoq Mar 12 '14 at 08:14
  • I don't understand what's the problem with any of these definitions. They are perfectly fine. You can `==` any two pointers; you can `<` any two pointers that point inside or past the same array; these things are defined, other pointer comparisons are undefined. What more is needed? – n. m. could be an AI Mar 12 '14 at 08:19
  • @n.m. says which member of the C standard committee? – Pascal Cuoq Mar 12 '14 at 08:21
  • The standard speaks for itself, the members of the standards committee have created it, and there their mission has ended. – n. m. could be an AI Mar 12 '14 at 08:37
  • @n.m. Yes, they have: “3.4.4 unspecified behavior: use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance”. You understand that in the context of the C standard, “defined” and “undefined” are not the only two possibilities, right? – Pascal Cuoq Mar 12 '14 at 09:04
  • I understand what unspecified means. I don't see where in the paragraphs in question any reference is made to unspecified behavior or anything that could possibly be understood as unspecified. – n. m. could be an AI Mar 12 '14 at 09:17
  • @n.m. You don't see how the words “… that happens to …” from 6.5.9:6 make `&a + 1 == &b` unspecified? – Pascal Cuoq Mar 12 '14 at 09:20
  • Ah, OK. I see what you mean. I guess you can say it's unspecified. – n. m. could be an AI Mar 12 '14 at 10:12
  • @PascalCuoq: Comparing pointers "into" two different objects is unspecified but not undefined; the comparison must be either true or false, according to 6.5.9/3. So to my view, the comparison is valid but non-deterministic; the value must be 0 or 1. Also, 6.3.2.3/3 guarantees that NULL is not equal to a pointer to any object; thus the pair `p+4` and `NULL` are not in the list of possibilities that you quote from 6.5.9/6, and I conclude that `p+4==NULL` is defined as not equal (again, using 6.5.9/3). – rici Mar 12 '14 at 20:18
  • @PascalCuoq: sorry, I should have been clearer; I meant to restrict my comments to equality comparison, which I interpret to be defined but unspecified. Certainly p+4 is not a pointer to an object, but it is possible that an object starts there in memory. As I read the standard, it is not possible for p+4 to be == NULL, but it is possible for p+4 to be == q if q is a pointer to an object. – rici Mar 12 '14 at 20:56
  • @rici I see your point now. Clever interpretation of the standard: I agree that taking all these aspects into account, `p + 4 == NULL` is defined and must be `0`. – Pascal Cuoq Mar 12 '14 at 21:00
  • @rici: In clang and gcc, equality comparison between a pointer to the start of one array object, and a pointer "just past" the immediately preceding array object may yield behavior which is consistent neither with the comparison operator having yielded 0, nor with its having yielded 1. Such comparisons, *as processed by clang and gcc*, effectively invoke "anything can happen" Undefined Behavior, despite the fact that the Standard explicitly recognizes that case and specifies the behavior thereof. – supercat Jun 12 '23 at 17:33