
The wording of the C23 working draft on unsequenced functions is unclear to me. Among other properties, unsequenced functions have to be independent:

(6) An object X is observed by a function call

  • (6.1) if both synchronize,
  • (6.2) if X is not local to the call,
  • (6.3) if X has a lifetime that starts before the function call, and
  • (6.4) if an access of X is sequenced during the call;

the last value of X, if any, that is stored before the call is said to be the value of X that is observed by the call.

A function pointer value f is independent if

  • (6.5) for any object X that is observed by some call to f through an lvalue that is not based on a parameter of the call, then all accesses to X in all calls to f during the same program execution observe the same value;
  • (6.6) otherwise if the access is based on a pointer parameter, there shall be a unique such pointer parameter P such that any access to X shall be to an lvalue that is based on P.

A function definition is independent if the derived function pointer value is independent.

- N3096 §6.7.12.7 p6, reformatted for readability

My question is whether functions like strlen can be independent. Consider this minimal implementation:

size_t strlen(const char* s) {
    size_t len = 0;
    for (; *s; ++s) { ++len; }
    return len;
}

Firstly, is *s considered to be an access that is based on a parameter? I believe it is, namely on a pointer parameter, so only the restrictions of (6.6) are relevant.

However, (6.6) could be an issue. Notice that (6.5) says "all accesses to X in all calls to f", whereas (6.6) says "any access to X", which is broader and may apply to accesses outside the function too. The following scenario would then be a problem:

struct string {
    char data[N];
} str;
// ...
strlen(str.data); // A
str = ...;
strlen(str.data); // B

Not all accesses to str globally take place through a unique pointer parameter s of strlen. Some of them (str = ...) don't involve pointers at all. If my understanding is correct, this disqualifies strlen from being independent, and thus from being unsequenced.

In summary, are my interpretations correct, and strlen cannot be unsequenced? Or is this just a wording issue, and strlen was intended to be unsequenced?


Note on possible intent

The proposal for [[unsequenced]] claims that unlike GCC's [[gnu::const]], pointer parameters are supported. However, I am not certain whether the wording reflects that, and to what extent.

See N2956 5.8 Some differences with GCC const and pure

Jan Schultke
  • IMHO, `strlen` can still be independent under some use cases. i.e., if synchronized or if `str` is (to borrow from C++) a unique pointer, or is immutable... no? – Myst Aug 07 '23 at 15:03
  • @Myst the issue is that *independent* is a property of a function, not of individual function calls. `strlen` would have to be *independent* universally. – Jan Schultke Aug 07 '23 at 15:05
  • I don't think that use by other functions of `str.data` is a problem. But it does sound as if `strlen(str.data); strlen(str.data + 1);` breaks the rule, since both will find the same NUL character lvalue but do so starting at different pointer parameters. – Ben Voigt Aug 07 '23 at 15:05
  • @BenVoigt **X** refers to an object, not to a value. It should be fine if two accesses from different pointers give you the same value, it's just that all accesses of the same object have to be through the same unique pointer. – Jan Schultke Aug 07 '23 at 15:06
  • @JanSchultke: Look again, I am talking about access to an object (an lvalue, not a prvalue) – Ben Voigt Aug 07 '23 at 15:11
  • @BenVoigt I haven't really considered that yet, but I think you may be right. – Jan Schultke Aug 07 '23 at 15:19
  • Note well that the flavor of attributes you are asking about is new in C23, and the syntax is not backwards compatible with older versions of standard C. If you plan actually to use them then you will need a compiler that supports them. – John Bollinger Aug 07 '23 at 15:19
  • IMO gcc pure & const are much better considered and suit the main purpose, automatic parallelization, very well. I personally can't see any other reason for having those attributes. I would rather see the `pure` attribute applied to parameters as well, instead of this esoteric attribute (I mean [[unsequenced]]) – 0___________ Aug 07 '23 at 17:23
  • The wording of all of this is poor and needlessly obscure... As I understand it `[[unsequenced]]` simply means that the function is guaranteed (by the programmer) to have no side effects that would break the program in case of function call re-ordering optimizations. (And as such it is also re-entrant.) And that's it, that can be the only sensible use of the feature. – Lundin Aug 09 '23 at 06:56

1 Answer


My question is whether functions like strlen can be independent.

As a preliminary matter, the wording of the relevant section of the spec is terrible. In some places the grammar does not match the apparent intent, in others it is simply incorrect in ways that make the intended meaning hard to parse, and the text is somewhat redundant. This rewrite better conveys what I think the spec means to say:


An object X is observed by a function call if

  • X has a lifetime that starts before the call, and
  • X and the call synchronize, and
  • an access of X is sequenced during the call.

In that case, if any store to X happens before the call then the value written by the last such store is said to be the value of X that is observed by the call. Otherwise, the initial value of X is the one observed by the call.

A function pointer value f is independent if for every object X that is observed by a call C to *f,

  • if any access by which such a C observes X is via an lvalue that is based on a pointer parameter of *f, then there is a unique pointer parameter P of *f such that every access to X by such a C shall be via an lvalue that is based on P;

  • otherwise, every such C during the same program execution that observes the value of X observes the same value.

A function definition is independent if the derived function pointer value is independent.
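
To make the two bullets concrete, here is a small sketch under my reading of the rewrite above. The names `factors`, `scale`, `counter`, `next_id`, and `demo_independence` are hypothetical and made up for illustration; none of them come from the draft.

```c
#include <assert.h>

/* Never modified after initialization, so every call that reads it
   observes the same value (the "otherwise" bullet). */
static const int factors[4] = {1, 2, 4, 8};

/* Reads *p only through the single pointer parameter p (first bullet)
   and reads factors[2], whose value never changes (second bullet).
   Under this reading, scale would be independent. */
int scale(const int *p) {
    return *p * factors[2];
}

/* Modified by next_id itself, so its value differs from call to call. */
static int counter = 0;

/* Reads counter through an lvalue that is not based on any parameter,
   and successive calls observe different values: not independent
   under this reading. */
int next_id(void) {
    return ++counter;
}

/* Small usage check (hypothetical helper). */
void demo_independence(void) {
    int a = 3, b = 5;
    assert(scale(&a) == 12);
    assert(scale(&b) == 20);
    assert(next_id() == 1);
    assert(next_id() == 2);
}
```

Whether a compiler could actually exploit this for `scale` is a separate matter; the sketch only illustrates which kinds of reads each bullet is about.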


The use of a function pointer as the primary subject of the definition is drawn from the original text. I take it to be meant to clarify that independence is about a function as identified by its address, as opposed to by name or implementation. Thus, it could be that a program contains two static functions with the same name and identical code, with one independent and the other not.

It is worth noting that a function or function pointer being independent -- if that is recognized by the compiler -- speaks to possible reordering of the function call relative to other operations, which is the point of being unsequenced. But independence is only about a function's reads, so it is not enough on its own to enable reordering of calls.

Consider this minimal implementation:

size_t strlen(const char* s) {
    size_t len = 0;
    for (; *s; ++s) { ++len; }
    return len;
}

Firstly, is *s considered to be an access that is based on a parameter?

Short answer: yes.

Long answer: The *s is an lvalue. It appears in a context where lvalue conversion will be performed. That lvalue conversion performs a read of the referenced object, which is an access to that object. I see no room for any interpretation that *s is not based on s, which is a parameter of the function.
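
As a related sketch, and assuming "based on" is meant in the same sense as in the `restrict` definition, an access through a local copy of the parameter would still count as based on it. The function name `strlen_via_copy` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of the strlen above: the loop walks a local copy
   of the parameter instead of the parameter itself. */
size_t strlen_via_copy(const char *s) {
    const char *p = s;            /* p is derived from s */
    while (*p) { ++p; }           /* so each *p is (presumably) still based on s */
    return (size_t)(p - s);       /* number of characters before the terminator */
}
```

Every read in this variant still goes through one pointer parameter, so it should be treated the same way as the original loop.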

However, (6.6) could be an issue. Notice that (6.5) says "all accesses to X in all calls to f", whereas (6.6) says "any access to X", which is broader and may apply to accesses outside the function too.

Sort of. The provision is not about pointer values, but rather about the function parameter by which pointers are fed to the function. Therefore, interpreting it as applying outside the context of calls to f would mean that the object in question could not be accessed at all outside of f without disqualifying f as independent. I am confident that that is not the intent, and my above rewrite of the provision is clearer about that.
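
For reference, here is a runnable version of the scenario from the question. It is only a sketch: `my_strlen` is the question's loop renamed to avoid clashing with the library function, `demo_calls` is a hypothetical helper, and the value of `N` and the strings assigned to `str` are invented, since the question elides them.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 16 };  /* assumed size; the question leaves N unspecified */

static struct string {
    char data[N];
} str;

/* Same loop as the strlen in the question. */
size_t my_strlen(const char *s) {
    size_t len = 0;
    for (; *s; ++s) { ++len; }
    return len;
}

/* Performs calls A and B with a store to str in between.
   Returns the result of B and writes the result of A to *first. */
size_t demo_calls(size_t *first) {
    str = (struct string){ .data = "abc" };    /* made-up initial contents */
    *first = my_strlen(str.data);              /* A: reads str only via s */
    str = (struct string){ .data = "hello" };  /* access outside any call; value made up */
    return my_strlen(str.data);                /* B: again reads str only via s */
}
```

Within each call, every access to `str` goes through the one parameter `s`. The assignment between A and B happens outside any call to `my_strlen`, which is why I don't think it bears on independence.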

In summary, are my interpretations correct, and strlen cannot be unsequenced? Or is this just a wording issue, and strlen was intended to be unsequenced?

If your interpretation were correct then substantially no function would be unsequenced (because a function must be independent to be unsequenced).

As I interpret the spec, the strlen() definition provided defines a function that is unsequenced. And I note that that's independent of whether it actually carries the [[unsequenced]] annotation.
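
If you do want to annotate such a function, a guarded sketch might look like the following. The macro `UNSEQUENCED` and the names `annotated_strlen` and `demo_annotated` are hypothetical. The placement of the attribute after the parameter list follows my reading of the draft, which describes it as applying to the function declarator. The guard is an assumption meant to keep the code compiling where the attribute is not supported.

```c
#include <assert.h>
#include <stddef.h>

/* Use the attribute only when compiling as C23 and the compiler reports
   support for it; otherwise the macro expands to nothing. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L && defined(__has_c_attribute)
#  if __has_c_attribute(unsequenced)
#    define UNSEQUENCED [[unsequenced]]
#  endif
#endif
#ifndef UNSEQUENCED
#  define UNSEQUENCED
#endif

/* The question's loop, annotated as unsequenced where supported. */
size_t annotated_strlen(const char *s) UNSEQUENCED {
    size_t len = 0;
    for (; *s; ++s) { ++len; }
    return len;
}

/* Two calls with an unchanged argument and unchanged string contents. */
size_t demo_annotated(void) {
    const char *msg = "unsequenced";
    size_t a = annotated_strlen(msg);
    size_t b = annotated_strlen(msg);  /* nothing changed in between */
    assert(a == b);
    return a;
}
```

The annotation is a promise made by the programmer. Whether a given compiler uses it to merge or reorder the two calls is up to the implementation.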

John Bollinger
  • In your estimation, the "if the access is based on a pointer parameter, there shall be a unique such pointer parameter P" is the same requirement as for the `restrict` keyword? – Ben Voigt Aug 07 '23 at 17:59
  • @BenVoigt, I am interpreting that condition as stronger than `restrict`. For instance, I think it means that if an execution of the program contains evaluations of both `foo(&x, &y)` and `foo(&y, &x)`, and `foo` accesses the referents of both its arguments, then `foo` is not independent because on one call it accesses `x` via its first parameter, and on another it accesses `x` via its second parameter. But the spec's wording is terrible, so it's possible that what's meant is just that the function has behavior that would be consistent with `restrict` qualification. – John Bollinger Aug 07 '23 at 18:15
  • I've found solace in footnote 196. It says that an unsequenced function "can be executed as late as the arguments and the objects they possibly target are unchanged". This implies that `strlen` can be unsequenced, it's just that the modification of the string by the outside world is a "cut-off point for sequencing liberty". See https://open-std.org/JTC1/SC22/WG14/www/docs/n3096.pdf. It would be worth incorporating that note into the answer. – Jan Schultke Aug 07 '23 at 22:55
  • I agree that the wording of this section is terrible, and it was extremely hard for me to parse without applying my formatting changes first. The notes in that section, though they aren't normative, help a lot with understanding it. – Jan Schultke Aug 07 '23 at 22:58