
Consider the following function:

```c
int bar(const int* __restrict x, void g())
{
    int result = *x;
    g();
    result += *x;
    return result;
}
```

Do we need to read from x twice because of the call to g()? Or is the __restrict qualifier enough to guarantee that the invocation of g() does not access, or at least does not alter, the value at address x?

At this link we can see what the most popular compilers have to say about this (GodBolt; language standard C99, platform AMD64):

  • clang 7.0: Restriction respected.
  • GCC 8.3: No restriction.
  • MSVC 19.16: No restriction.

Is clang rightly optimizing the second read away, or isn't it? I'm asking both for C and C++ here, as the behavior is the same (thanks @PSkocik).
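To make the two outcomes concrete, here is a minimal sketch of what "optimizing the second read away" amounts to at the source level. `bar_single_load`, `count_call`, and `check_same_result` are hypothetical names introduced only for illustration, and the single-load version is a hand-written approximation, not actual compiler output. The parameter list of `g` is spelled `void g(void)` here so the snippet also compiles under stricter settings.

```c
#include <assert.h>

/* Hypothetical callee for this sketch: it only touches unrelated state. */
static int calls = 0;
void count_call(void) { ++calls; }

/* The function from the question (with an explicit void parameter list for g). */
int bar(const int *__restrict x, void g(void))
{
    int result = *x;
    g();
    result += *x;   /* second load, kept when the restriction is not exploited */
    return result;
}

/* Hand-written sketch of the version with the second read removed. */
int bar_single_load(const int *__restrict x, void g(void))
{
    int value = *x; /* the only load from x */
    g();
    return value + value;
}

/* Both versions agree whenever g() leaves *x alone. */
void check_same_result(void)
{
    int v = 21;
    assert(bar(&v, count_call) == bar_single_load(&v, count_call));
}
```

The two versions can only differ if `g()` changes the value at `x` between the loads, which is exactly the case the question is about.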

Related information and some notes:

einpoklum
  • Why not tag it [c]? It is a legit tag for this question (perhaps more legit than the C++ tag) and you'll get more eyeballs. – Petr Skocik Feb 25 '19 at 22:19
  • @PSkocik: After seeing your GodBolt example - editing accordingly. – einpoklum Feb 26 '19 at 10:57
  • For the optimization across `g()` to be proper, the compiler would have to be certain that it would have noticed any action that could cause the address of `x` to be exposed to the outside world during the execution of `bar`. In this case there are clearly no such actions, but unless a compiler would be equipped to notice all such actions, it's better to presume that a reference might have leaked when it hasn't than to pretend it can't have leaked in cases where it has. – supercat Feb 26 '19 at 23:36
  • @supercat: What you're saying is true if we _don't_ use `restrict`. If we _do_, the compiler doesn't have to be certain of that, it is required to "take my word for it". – einpoklum Feb 26 '19 at 23:59
  • @einpoklum: Function `g()` would be allowed to access the same storage as `*x` if and only if it uses a pointer based upon `x` (and not merely the pointer from which `x` was based) to do so. A compiler would only be entitled to assume that `g()` could not access `*x` if it could not possibly have legitimately received such a pointer. In the absence of `__restrict`, a compiler would have to allow for the possibility that the address might have been exposed to outside code before the invocation of `bar`. – supercat Feb 27 '19 at 01:39

1 Answer


I think this is effectively a C question, since C is the language that actually has restrict, with a formal specification attached to it.

The part of the C standard that governs the use of restrict is 6.7.3.1:

1 Let D be a declaration of an ordinary identifier that provides a means of designating an object P as a restrict-qualified pointer to type T.

2 If D appears inside a block and does not have storage class extern, let B denote the block. If D appears in the list of parameter declarations of a function definition, let B denote the associated block. Otherwise, let B denote the block of main (or the block of whatever function is called at program startup in a freestanding environment).

3 In what follows, a pointer expression E is said to be based on object P if (at some sequence point in the execution of B prior to the evaluation of E) modifying P to point to a copy of the array object into which it formerly pointed would change the value of E. Note that "based" is defined only for expressions with pointer types.

4 During each execution of B, let L be any lvalue that has &L based on P. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: T shall not be const-qualified. Every other lvalue used to access the value of X shall also have its address based on P. Every access that modifies X shall be considered also to modify P, for the purposes of this subclause. If P is assigned the value of a pointer expression E that is based on another restricted pointer object P2, associated with block B2, then either the execution of B2 shall begin before the execution of B, or the execution of B2 shall end prior to the assignment. If these requirements are not met, then the behavior is undefined.

5 Here an execution of B means that portion of the execution of the program that would correspond to the lifetime of an object with scalar type and automatic storage duration associated with B.

The way I read it, the execution of g() falls within the execution of bar's block, so g() is not allowed to modify *x, and clang is right to optimize out the second load. In other words, if x points to a non-const global, g() must not modify that global.
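As a sketch of that reading: in the snippet below, `counter`, `unrelated`, `touches_unrelated`, `touches_counter`, and `well_defined_call` are hypothetical names used only for illustration. The first call is fine under any interpretation. The second, left commented out, is the case that this reading of 6.7.3.1p4 would make undefined, because the object accessed through `x` is also modified during the execution of bar's block.

```c
#include <assert.h>

/* Hypothetical globals for this sketch. */
static int counter = 5;
static int unrelated = 0;

/* Does not touch the object that x will point to. */
void touches_unrelated(void) { unrelated = 1; }

/* Modifies the very object that x will point to. */
void touches_counter(void) { counter = 99; }

/* The function from the question (with an explicit void parameter list for g). */
int bar(const int *__restrict x, void g(void))
{
    int result = *x;
    g();
    result += *x;
    return result;
}

/* Well defined: counter is never modified while bar's block executes. */
int well_defined_call(void)
{
    return bar(&counter, touches_unrelated);
}

/*
 * Under the reading above this would be undefined behavior, so it is
 * deliberately not executed here:
 *
 *     bar(&counter, touches_counter);
 */

void check_well_defined_call(void)
{
    assert(well_defined_call() == 10);   /* 5 + 5 */
}
```

If that reading is correct, a compiler is free to return either 10 or 104 (or anything else) for the commented-out call, which is why clang's single load would be a valid transformation.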

Petr Skocik
  • @einpoklum Based-on here should mean derived from by means of some (possibly none) pointer arithmetic but without dereferencing. Given `T *restrict P;`, `P`, `P+2`, `P-42` are all based on `P`. The reason why the optimization ought to be OK should be mainly in (4.)... Unfortunately, this part of the C standard is written in a super cryptic way, IMO. – Petr Skocik Feb 25 '19 at 23:26
  • The way clang and gcc process `restrict`, even an expression like `p[1]` won't be regarded as being recognizably based upon `p` in contexts where some pointer not based upon `p` is known to equal `p+1`. – supercat Oct 29 '19 at 19:35
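The "based on" relation described in these comments can be illustrated with a small sketch. `bump_third_and_sum` and `check_based_on` are hypothetical names, and the example only covers the simple case of plain pointer arithmetic on a restrict-qualified parameter; it makes no claim about how any particular compiler tracks the relation.

```c
#include <assert.h>

/* Hypothetical helper: every access to the array goes through p or a pointer based on p. */
int bump_third_and_sum(int *__restrict p)
{
    int *q = p + 2;      /* q is based on p: pointer arithmetic only */
    *q += 1;             /* modification through a pointer based on p */
    return p[0] + p[2];  /* p[2] designates the same object, also via p */
}

void check_based_on(void)
{
    int a[3] = {10, 20, 30};
    assert(bump_third_and_sum(a) == 41);  /* 10 + 31 */
    assert(a[2] == 31);
}
```

Because the modified object is only ever reached through lvalues whose addresses are based on `p`, this satisfies the requirements of paragraph 4 even though the object is written to inside the block.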