In Section 6.7.3.1 of the C language standard regarding restrict, it says:

  1. Let D be a declaration of an ordinary identifier that provides a means of designating an object P as a restrict-qualified pointer to type T.

  2. ...

  3. In what follows, a pointer expression E is said to be based on object P if (at some sequence point in the execution of B prior to the evaluation of E) modifying P to point to a copy of the array object into which it formerly pointed would change the value of E.

I don't understand what this says - literally:

  • Who said P was pointing to a "copy of an array object"?
  • Why did P "formerly" point to anything? That is, who says we've changed its value?
  • Let's suppose E is a pointer of local scope. Why would modifying any pointer expression other than the E pointer itself "change the value of E"? It could change the value pointed to by E maybe. Right?

Can someone help me interpret that piece of text so as to make more sense?

(Inspired by this answer)

curiousguy
einpoklum

4 Answers

Who said P was pointing to a "copy of an array object"?

Pointer arithmetic is defined (in C 2018 6.5.6, paragraphs 8 and 9) in terms of pointers to array elements. For this purpose, a single object is treated as an array of one element (6.5.6, paragraph 7). So, whenever we have any non-null object pointer, it is, in this model, pointing into an array.

Why did P "formerly" point to anything? That is, who says we've changed its value?

The text you quoted is saying “To figure out if E is based on P, let’s hypothetically make a copy of the array that P is pointing into and then assign to P a pointer into the corresponding place in the copy.” So the text you quoted is saying we are changing the value of P, and then we are comparing the value of E with this change and without it.

Let's suppose E is a pointer of local scope. Why would modifying any pointer expression other than the E pointer itself "change the value of E"? It could change the value pointed to by E maybe. Right?

Objects and values do not have scope. Identifiers have scope. But let’s consider an identifier with block scope:

// P is a pointer into A.
// S is the size of A.
// A is the start of an array not contained in any other array.
void foo(char *P, size_t S, char *A)
{
    void *E = P+2;
}

For illustration, assume P has value 0x1004 and A is 0x1000. Is E based on P? Well, given the above, E is 0x1006. Suppose we consider this code before the definition of E:

    char *N = malloc(S);
    memcpy(N, A, S);
    P = P - A + N;

Suppose malloc returns 0x2000. What will the value of E be? It will be 0x2006. That is different from 0x1006. Therefore E is based on P.
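The thought experiment above can be run as an ordinary program. What follows is only a minimal sketch, not anything the Standard prescribes: the names `eval_E` and `e_changes_when_p_is_redirected` are made up for illustration. It evaluates E once with the original P and once with P moved to the corresponding place in a copy of the array, then compares the two values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: the expression under test, E = P + 2. */
static char *eval_E(char *P)
{
    return P + 2;
}

/* Returns true if redirecting P into a copy of the array A (S bytes long)
   changes the value of E. Returns false if the copy cannot be allocated,
   which is a simplification made for this sketch. */
bool e_changes_when_p_is_redirected(char *P, size_t S, char *A)
{
    /* P must point into A, and P + 2 must stay within A or one past it. */
    assert(P >= A && (size_t)(P - A) + 2 <= S);

    char *without = eval_E(P);

    char *N = malloc(S);
    if (N == NULL)
        return false;
    memcpy(N, A, S);

    /* Same offset as P, but into the copy. */
    char *with = eval_E(N + (P - A));

    bool changed = (with != without);   /* compare before freeing N */
    free(N);
    return changed;
}
```

Called with an 8-byte array `A` and `P = A + 4`, this should report a change, which matches the conclusion that E is based on P.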

On the other hand, consider this:

void foo(char **P, size_t S, char **A)
{
    #if OnOrOff
        char **N = malloc(S);   // N has the same element type as A
        memcpy(N, A, S);
        P = P - A + N;
    #endif
    char *E = P[3];             // E is loaded through P, not computed from P
}

Now, will the value of E change depending on whether OnOrOff is true or false? No, in either case it will receive the value that is the referenced element of A, either directly or from the copy. The fact that P might point into A or N does not affect the value of E. So this E is not based on P.
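The same experiment can be sketched for this second case. This is an illustration under assumptions of my own: the function name `loaded_value_changes` is hypothetical, and it takes an element count rather than a byte size. It loads `P[3]` through the original P and through a P redirected into a copy of the array, and reports whether the loaded value differs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Returns true if redirecting P into a copy of the array A changes the
   value of E = P[3]. count is the number of elements in A. Returns false
   if the copy cannot be allocated (a simplification for this sketch). */
bool loaded_value_changes(char **P, size_t count, char **A)
{
    /* P must point into A, and P[3] must be an existing element. */
    assert(P >= A && (size_t)(P - A) + 3 < count);

    char *without = P[3];

    char **N = malloc(count * sizeof *N);
    if (N == NULL)
        return false;
    memcpy(N, A, count * sizeof *N);

    char **redirected = N + (P - A);   /* same offset, but into the copy */
    char *with = redirected[3];        /* a value stored in the copy */

    free(N);
    return with != without;
}
```

Because the copy holds the same stored pointers as the original, the two loads should yield the same value, so the function is expected to return false.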

Eric Postpischil
  • I think you hit upon the *intention* of the language in the Standard, but the way the Standard is actually worded is unworkable, resulting in some objects being "based upon" others they have nothing to do with, and in some cases not being based upon pointers from which their values were copied. – supercat Feb 27 '19 at 20:59

The definition of "based on" is intended to define a transitive relation among pointers, but its actual wording would yield an unworkable definition which so far as I can tell doesn't match any actual compiler behavior.

It would be simpler to transitively apply the following rule (and this is what compilers seem to do): If p is a pointer of type T*, the following pointers are "based on" p:

  • p+(intExpr) or p-(intExpr)
  • (otherType*)p
  • &*p
  • &p->structMemberofNonArrayType or &p->unionMemberofNonArrayType
  • p->structMemberofArrayType or p->unionMemberofArrayType
  • &p[intExpr]
  • Any pointer based on any of the above

I don't think the Standard is really clear about (someType*)someIntegerFunction((uintptr_t)p) and I don't think compiler writers are clear either.

Note that for any q derived from p via any of the above expressions, except the one involving casts through uintptr_t, the difference between (char*)p and (char*)q will be independent of the address held by p.
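As a rough illustration of that last point, here is a sketch using a made-up `struct item` and a hypothetical function `derived_offset`. It chains a few of the listed forms and returns the byte distance between the result and the starting pointer. That distance should be the same whichever object p points at.

```c
#include <assert.h>
#include <stddef.h>

struct item {            /* hypothetical type, for illustration only */
    int id;
    int values[4];
};

/* Byte offset between p and a pointer derived from it via the listed forms. */
ptrdiff_t derived_offset(struct item *p)
{
    assert(p != NULL);

    int *q = p->values;   /* p->structMemberofArrayType */
    q = q + 2;            /* pointer + integer expression */
    q = &q[1];            /* &pointer[integer expression] */

    /* q now points at p->values[3], inside the same object as p. */
    return (char *)q - (char *)p;
}
```

For two distinct objects `a` and `b` of type `struct item`, `derived_offset(&a)` and `derived_offset(&b)` should be equal, since only the layout of the type enters into the result.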

Incidentally, here's an example of a problematic corner case:

int test1(int * restrict p1, int * restrict p2, int n)
{
    int *restrict p3 = p1+n;
    // How would p4 and p5 be affected if p3 were replaced
    // with a pointer to a copy here?
    int *p4 = p3;
    if (p3 != p1) p4=p1;
    int *p5 = p2 + (p3 == p1);
    *p3 = 1;
    *p5 = 2;
    return *p4;
}

Using the transitive ways of forming a pointer based on another, if n is zero, p4 would clearly be based upon p3. Pointer p5 would not derive from p3, however, since there is no sequence of "based upon" steps by which its value could be derived.

Trying to apply the rules given in the Standard to the n==0 case by replacing p3 with a pointer to a copy of the array would not affect the value of p4, but would affect the value of p5. That would imply that p4 is not based upon p3, but p5 is, somehow.

I would regard such a result as nonsensical, and I think the authors of the Standard would too, but it follows from the rules given in the Standard, as worded.
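The n==0 case can be checked mechanically. The sketch below is my own illustration, with hypothetical names `derive` and `redirect_p3`. It drops the restrict qualifiers because it only inspects pointer values and never dereferences them. It computes p4 and p5 once with p3 equal to p1 and once with p3 pointing at a copy, then reports which of the two values changed.

```c
#include <assert.h>
#include <stdbool.h>

struct outcome {
    bool p4_changed;
    bool p5_changed;
};

/* Computes p4 and p5 exactly as test1 does, for a given p3. */
static void derive(int *p1, int *p2, int *p3, int **p4, int **p5)
{
    *p4 = p3;
    if (p3 != p1) *p4 = p1;
    *p5 = p2 + (p3 == p1);
}

/* n == 0 case: p3 starts out equal to p1, then is redirected to a copy.
   p2 must point at an array element so that p2 + 1 is a valid pointer. */
struct outcome redirect_p3(int *p1, int *p2, int *copy_of_p1_array)
{
    assert(p1 != copy_of_p1_array);

    int *p4a, *p5a, *p4b, *p5b;
    derive(p1, p2, p1, &p4a, &p5a);                 /* p3 == p1 */
    derive(p1, p2, copy_of_p1_array, &p4b, &p5b);   /* p3 points at the copy */

    struct outcome r = { p4a != p4b, p5a != p5b };
    return r;
}
```

With three distinct arrays passed in, `p4_changed` should come out false and `p5_changed` true, which is the counterintuitive result described above.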

supercat
  • Yes, I know all of that, I've been using restrict for years. But I don't see how the wording says any of this. – einpoklum Feb 27 '19 at 00:02
  • @einpoklum: There are a limited number of actions one can do which would create a pointer with the required traits, and I don't think I've omitted any of them. – supercat Feb 27 '19 at 01:33
  • @einpoklum: I suppose I missed a way that a pointer could be "based on" another which my description doesn't cover, but since it would be unworkable and neither gcc nor clang recognizes it, I think my description more accurately describes what compilers actually do. [BTW, given `int x,y; int *restrict p;`, the value of `p==&x ? &x : &y` would be affected if `p` were equal to `&x` and it were replaced with a pointer to a copy of `x`.] – supercat Feb 27 '19 at 02:09
  • You're not getting what I'm saying. What you've written is just fine (or not, doesn't matter), but it's not an answer to my question about the wording in the standard. – einpoklum Feb 27 '19 at 08:35
  • @einpoklum: The wording in the Standard is inconsistent with the way compilers actually work, and making compilers uphold the Standard-described behavior in all defined cases would needlessly impair many useful optimizations. – supercat Feb 27 '19 at 15:20
  • @einpoklum: I think the authors of the Standard knew what they were going for as a concept and tried to describe it in general terms, but recognized that their description would fail in some corner cases. The "copy of the array" was intended to deal with those corner cases, but fails to fix all of them. The wording could perhaps have said that the value of `(char*)p - (char*)e` would be unaffected by the replacement of `p` with a copy, but that's not what it actually says, and I think even that could cause problems with operations that go through `uintptr_t` or `intptr_t`. – supercat Feb 27 '19 at 15:29
  • Look, you keep talking about things which are simply not an answer to my question. – einpoklum Feb 28 '19 at 08:34
  • @einpoklum: A direct literal answer to your question would be "No. Nobody can help you interpret what the Committee wrote as a precise rule in a manner that makes more sense than the written text, because the text would be nonsensical if applied for that purpose." I would think that a description of what the rule was presumably intended to say would be more useful, but if all you want is a direct answer to your direct question, I guess "no" should suffice. – supercat Feb 28 '19 at 15:30
  • What about `memcpy`? – curiousguy Mar 09 '19 at 01:44
  • @curiousguy: If `memcpy` is used to copy a pointer, the behavior is analogous to converting a pointer to an integer (or actually sequence of integers) and then later forming a pointer from a sequence of integers. Any single set of rules would either characterize as UB some constructs that would be useful for some kinds of programs, or forbid some optimizations that would be useful for other types of programs. For most purposes, having any act which forms pointers from integers treat the pointers thus formed as being based upon any pointers that were converted to integers should be fine...
  • ...but the way the Standard is written doesn't say that. – supercat Mar 10 '19 at 00:51
  • @curiousguy: A principle that the Committee probably thought too obvious to be worth stating is that the most practical way of handling `restrict` is to focus on cases where a compiler can prove that an lvalue cannot be based on a pointer. In particular, a compiler should only assume lvalue L is not based on pointer P if either (1) it understands everything about how `P` is used, or (2) it understands everything about how `L` is derived. In most cases where `restrict` is useful, at least one of those conditions will apply (often they both will). Attempts to optimize... – supercat Mar 10 '19 at 16:55
  • ...the harder cases will be more difficult, and generally offer much less benefit, than focusing on the simple cases and treating all pointers whose derivation isn't fully understood as though they may be based upon any pointer whose usage isn't fully understood. There wasn't any need to have the Standard fully plumb the depths of such cases, because implementations would have little reason to care about them. – supercat Mar 10 '19 at 16:56

After reading several comments, as well as @EricPostpischil's answer, I've tried to synthesize what I believe to be clearer, though a bit longer, wording to clarify things and answer the questions posed.

Original text:

In what follows, a pointer expression E is said to be based on object P if (at some sequence point in the execution of B prior to the evaluation of E) modifying P to point to a copy of the array object into which it formerly pointed would change the value of E.

Clarified text:

In what follows, a pointer expression E is said to be based on object P if changing P (with certain restrictions) before E is evaluated would cause E to evaluate to a different value. The restrictions are:

  • Trivial sanity restriction: The modification of P must occur at a sequence point.
  • P can only be modified to point to an identical copy of what it was pointing to originally.
    (And since in general, we can think of pointers always pointing to an array object - P can only be set to point at a copy of that array object).
einpoklum

Point 3 could be expressed in code as (roughly):

#define E(P)  ( (P) + 1 )   // put the expression you want to test here

extern T obj;    // T is some type
T copy = obj;

if ( E(&obj) != E(&copy) )
    printf("the expression in the macro is based on P");

The formal language definition used in the standard allows for E to be non-deterministic and other pathological cases (e.g. (P) + rand() % 5), whereas my example doesn't.

The standard version is that we compare what the result would be of E(&obj) with what the result would be of E(&copy) in the same context.

M.M
  • In most cases where `E` is linearly derived from `P`, changing `P` in any fashion would change `E`, and in most cases where it isn't, making `P` point to a copy of the data wouldn't change `E`. Linear derivation is a transitive property that's easy to recognize, since every pointer value other than a function call, an integer-to-pointer cast, or the result of `?:` will be linearly derived from exactly one other pointer value. The "formal definition" given in the Standard would coincide with "E is linearly derived from P" except in the hard cases where it's pretty much useless. – supercat Feb 27 '19 at 22:04