Is this behavior of clang standard compliant?

Question

This is going to be a long, language lawyerish question, so I'd like to quickly state why I find it relevant. I am working on a project where strict standard compliance is crucial (writing a language that compiles to C). The example I am going to give seems like a standard violation on the part of clang, and so, if this is the case, I'd like to confirm it.

gcc says that a pointer to a restrict-qualified pointer cannot appear in a conditional expression together with a void pointer. clang, on the other hand, compiles such code without complaint. Here is an example program:

#include <stdlib.h>

int main(void){
   int* restrict* A = malloc(8);
   A ? A : malloc(8);
   return 0;
   }

The result is the same whether or not gcc is given -std=c11 and -pedantic, in any combination, and likewise for clang with -std=c11 and -Weverything. In every case clang compiles with no errors, while gcc gives the following:

tem-2.c: In function ‘main’:
tem-2.c:7:2: error: invalid use of ‘restrict’
  A ? A : malloc(8);
  ^

The C11 standard says the following about the conditional operator:

6.5.15 Conditional operator

...

One of the following shall hold for the second and third operands:

— both operands have arithmetic type;

— both operands have the same structure or union type;

— both operands have void type;

— both operands are pointers to qualified or unqualified versions of compatible types;

— one operand is a pointer and the other is a null pointer constant; or

— one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void.

...

If both the second and third operands are pointers or one is a null pointer constant and the other is a pointer, the result type is a pointer to a type qualified with all the type qualifiers of the types referenced by both operands. Furthermore, if both operands are pointers to compatible types or to differently qualified versions of compatible types, the result type is a pointer to an appropriately qualified version of the composite type; if one operand is a null pointer constant, the result has the type of the other operand; otherwise, one operand is a pointer to void or a qualified version of void, in which case the result type is a pointer to an appropriately qualified version of void.

...

The way I see it, the last item in the list above (one operand is a pointer to an object type and the other is a pointer to void) says that the two types can go together, and the paragraph on the result type then defines the result to be a pointer to a restrict-qualified version of void. However, as the following states, this type cannot exist, and so gcc correctly identifies the expression as erroneous:

6.7.3 Type qualifiers, paragraph 2

Types other than pointer types whose referenced type is an object type shall not be restrict-qualified.
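To make this constraint concrete, here is a small sketch of declarations it permits and forbids. The variable names are made up for illustration, and the rejected forms are left in comments so that the fragment still compiles.

```c
/* Permitted: restrict qualifies a pointer whose referenced type is an
   object type. In C11, void counts as an (incomplete) object type. */
int  *restrict   p_int;      /* restrict-qualified pointer to int */
void *restrict   p_void;     /* restrict-qualified pointer to void */
int  *restrict  *p_to_rp;    /* plain pointer to a restrict-qualified pointer */

/* Constraint violations under 6.7.3 paragraph 2, kept as comments:
 *   int restrict n;              int is not a pointer type
 *   void (*restrict fp)(void);   the referenced type is a function type
 *   restrict void *q;            void itself is not a pointer type
 */
```

The last commented-out form is the shape of the type at issue here: the qualifier lands on void itself rather than on a pointer.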

Now, the problem is that this example program violates a "shall not" constraint, and so the implementation is required to produce a diagnostic, by the following:

5.1.1.3 Diagnostics, paragraph 1

A conforming implementation shall produce at least one diagnostic message (identified in an implementation-defined manner) if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint, even if the behavior is also explicitly specified as undefined or implementation-defined. Diagnostic messages need not be produced in other circumstances.

It seems clang is not standard compliant, since it silently accepts an erroneous type. That makes me wonder what else clang does silently.

I am using gcc version 5.4.0 and clang version 3.8.0, on an x86-64 Ubuntu machine.

You might be right... but if you cast the result of the `malloc()` in the conditional so that the third operand is no longer a pointer to void, the error disappears. — Dmitri, Jul 28 '16 at 21:06
If I were you, I would be inclined to avoid using `restrict` in the C code emitted by my project's to-C compiler. How much additional optimization could be enabled via `restrict` qualification, how much would actually be performed by any given compiler, and how much more performant the result might be are all unclear. On the other hand, by using `restrict` qualifiers, your code takes on additional obligations that it must satisfy to avoid undefined behavior, and not all of these can be checked by the compiler. I just don't see the reward, if any, justifying the risk. — John Bollinger, Jul 28 '16 at 22:02
@JohnBollinger, the language is very performance oriented, so I do think it would probably be unacceptable to take the performance hit. I have written a formal model of an exact way to use a subset of C for the emitted code which, together with constraints defined for the language, formally proves adherence to the definition of restrict in the C standard. In other words, requiring this small list of rules for the programmer allows the compiler to do all else to ensure correctness. You are right about the extra obligations though, it has been a lot of work! — Kyle, Jul 28 '16 at 22:16
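The cast mentioned in Dmitri's comment can be sketched as follows. This is only an illustration of the workaround, not a statement about what either compiler must do; the helper name `pick_or_alloc` is hypothetical and does not come from the question.

```c
#include <stdlib.h>

/* Hypothetical helper: returns `a` if it is non-null, otherwise a fresh
   allocation big enough for one `int *restrict`.
   The cast gives both operands of ?: the type `int *restrict *`, so the
   result type is never derived from a pointer to void. */
int *restrict *pick_or_alloc(int *restrict *a)
{
    return a ? a : (int *restrict *)malloc(sizeof *a);
}
```

With the cast in place, the "pointers to compatible types" case of 6.5.15 applies instead of the "pointer to void" case, so the question of a restrict-qualified void never comes up.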

score 6 · Accepted Answer · answered Jul 28 '16 at 21:24

Yes, it looks like a bug.

Your question more briefly: can void be restrict qualified? Since void is clearly not a pointer type, the answer is no. Because this violates a constraint, the compiler should give a diagnostic.

I was able to trick clang into confessing its sins by using a _Generic expression:

puts(_Generic(A ? A : malloc(8), void* : "void*"));

and clang tells me

static.c:24:18: error: controlling expression type 'restrict void *' not compatible with any generic association type
     puts(_Generic(A ? A : malloc(8), void* : "void*"));

which shows that clang here really tries to match a nonsense type restrict void*.

Please file them a bug report.
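The same _Generic trick can be wrapped into a small type probe. The sketch below is an assumption-laden illustration: the macro `TYPE_NAME` and the two `probe_*` functions are hypothetical names, and the probe is applied only to expressions that both compilers should accept (a plain `malloc` call and the conditional with a cast), not to the disputed uncast conditional.

```c
#include <stdlib.h>

/* Hypothetical probe macro: maps the type of an expression to a string.
   The controlling expression of _Generic is not evaluated, so malloc
   is never actually called here. */
#define TYPE_NAME(e) _Generic((e),              \
    void *:          "void *",                  \
    int *restrict *: "int *restrict *",         \
    default:         "something else")

/* Type of a bare malloc call. */
const char *probe_malloc(void)
{
    return TYPE_NAME(malloc(sizeof(int *)));
}

/* Type of the conditional once the third operand has been cast. */
const char *probe_cast_conditional(int *restrict *a)
{
    return TYPE_NAME(a ? a : (int *restrict *)malloc(sizeof *a));
}
```

Adding the uncast `A ? A : malloc(8)` as a third probe is how one would check a given compiler version for the behavior described above; it is left out here because gcc rejects that expression outright.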

score 0 · Answer 2 · answered Aug 02 '16 at 19:43

While a compiler could satisfy all obligations surrounding restrict by ignoring the qualifier altogether, a compiler which wants to keep track of what it is or is not allowed to do needs to keep track of which pointers hold copies of restrict pointers. Given something like:

int *foo;
int *bar;
int wow(int *restrict p)
{
  foo = p;
  ...
  *p = 123;
  *foo = 456;
  *p++;
  *bar = 890;
  return *p;
}

since foo is derived from p, a compiler must allow for accesses made via foo to alias accesses via p. A compiler need not make such allowances for accesses made via bar, since that is known not to hold an address derived from p.
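A cut-down, compilable variant of the example above may help show the "derived from" relationship. It is a sketch only: the elided part and `bar` are dropped, and the function name `store_twice` is made up for this illustration.

```c
int *foo;   /* file-scope pointer, as in the example above */

/* Hypothetical reduced version of wow(): foo is assigned from p, so
   accesses through foo are accesses through a pointer based on p. */
int store_twice(int *restrict p)
{
    foo = p;        /* foo now holds a copy of the restrict pointer */
    *p = 123;       /* store through p */
    *foo = 456;     /* store to the same object through the derived pointer */
    return *p;      /* the compiler must account for the store via foo */
}
```

Calling `store_twice(&x)` for some `int x` should return 456, since the compiler is not entitled to assume that `*foo` and `*p` are distinct objects.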

The rules surrounding restrict get murky in cases where a pointer may or may not be derived from another. A compiler would certainly be allowed to simply ignore a restrict qualifier in cases where it can't track all of the pointers derived from a pointer; I'm not sure if any such cases would invoke UB even if nothing ever modifies the storage identified by the pointer. If a syntactic construct is structurally guaranteed to invoke UB, having a compiler squawk may be more useful than having it act in an arbitrary fashion (though having a compiler simply ignore any restrict qualifiers it can't fully handle might be more useful yet).

answered Aug 02 '16 at 19:43

supercat


Thanks for your response, but I'm not sure I understand. The problem is that the standard requires a diagnostic message in the case of a violation of syntax, which includes the use of an explicitly disallowed type. Clang does not provide that message. No compliant approach to the restrict qualifier relieves the implementation of the responsibility of providing that message. In other words, the qualifier can be ignored by the implementation, but only when used in syntactically correct settings; incorrect uses would need to be reported even by an implementation that ignores the qualifier. – Kyle Aug 03 '16 at 04:50
@Kyle: An `int*` is an object, as is an `int* restrict`. The phrase "pointer types whose referenced type is an object type" essentially means "pointer that is not a function pointer", and an `int* restrict` clearly meets that requirement. The return from `malloc` is a `void*`. The conditional operator is being used on a pointer to an object (of type `int *restrict`) and a `void*`. I really don't see anything syntactically wrong here. The Standard lacks the terminology to say what I think the authors intended, which would be that... – supercat Aug 03 '16 at 14:22
...implementations have no obligation to process programs that use `restrict` in any fashion other than what's mandated by the Standard, and must issue a diagnostic if a syntax violation would prevent the implementation from processing the source file as defined by the Standard, *but* need not issue a diagnostic if `restrict` poses no impediment to processing. From what I can tell, the authors of the Standard wanted `restrict` to only impose a burden on compilers that could benefit from it, with other implementations being free to essentially ignore it. – supercat Aug 03 '16 at 14:29
The standard requires that all syntactic errors produce a diagnostic; I do not see anything in the standard supporting the idea that an implementation need not issue a diagnostic for syntactic errors which do not impede processing. In my example, the error is an expression with an invalid type. Perhaps one reason for this is that two implementations might not agree on what impedes processing, and so standard compliance fails to be well defined if a diagnostic is not required. For example, I could compile with Clang and think I have a compliant program, but when someone tries to compile it with gcc, it won't work. – Kyle Aug 03 '16 at 16:11
The standard allows processing of a non compliant program, for example one with syntactic errors like this one which do not impede processing, but does not require it. This is a separate issue from the requirement to produce a diagnostic; the implementation must produce the diagnostic whether or not it goes on to compile the program. – Kyle Aug 03 '16 at 16:13
@Kyle: Variations in implementation-defined behaviors could cause a program to be well-formed on some implementations but not others, so the fact that one implementation accepts a program without a diagnostic does not imply that others will accept it at all. – supercat Aug 03 '16 at 16:20
A standard compliant program is guaranteed to be accepted by a compliant implementation. If standard compliance were based on vague notions like the allowance of a syntactic violation that does not impede translation, then it would be impossible to guarantee the acceptance of standard compliant programs, as the notion of standard compliance would be made vague by that allowance. – Kyle Aug 03 '16 at 16:33
The issue is that it is standard compliance that ensures acceptance by any compliant implementation, and so the standard requires that an implementation help the programmer achieve standard compliance by determining if the syntax is at least standard compliant. Clang does not satisfy this requirement in this case, and so does not help the programmer avoid mistakenly producing a program that is not standard compliant. The standard would make sense if it did not place that diagnostic requirement on an implementation, but it does place that requirement, and it has reasons to do so. – Kyle Aug 03 '16 at 16:36
@Kyle: The authors of the Standard likely intended that quality compilers include standards-validation functionality, but since an implementation which unconditionally spit out the message "This is a diagnostic" and nothing else would fulfill its diagnostic obligations with regard to everything but #error directives, I regard that aspect of the Standard as pretty meaningless. – supercat Aug 03 '16 at 16:51
Of course there are perverse ways to fulfill any standard obligation in a useless way, but it is still a problem to not comply with the standard. It is standard justified to interpret a silent compilation as a validation of all syntax. If an implementation fails to do its part, it could leave bad syntax in code justly thought not to have bad syntax, which can cause problems when other compilers get involved. – Kyle Aug 03 '16 at 17:16
@Kyle: It's common and expected for compilers to have compliant and non-compliant modes; feeding clang an invalid command line switch would force it to issue at least one diagnostic, thus satisfying this part of the Standard. I'm still curious, though, what you see as being a constraint violation; the problem I see is with the failure to accept the use of ?: with both a restrict-qualified pointer and a non-qualified malloc return. – supercat Aug 03 '16 at 17:26
That is the problem. The syntax violation is that the conditional expression has type `restrict void*`, which is a type not allowed by the standard, per 6.7.3 in my question. And this issue is present in the compliant mode of clang, with `-std=c11`, so the non compliant mode thing is not relevant here. – Kyle Aug 03 '16 at 17:37
@Kyle: I think I see the issue. 6.5.15 says that the `? :` operator must yield an "*appropriately*-qualified" pointer, but I don't know that anything indicates that `restrict` would be inherited in this scenario, especially given that applying `restrict` to the target of `void*` would be *in*appropriate. – supercat Aug 03 '16 at 18:55
Good point. That is a pretty nontrivial assumption though, in the interpretation of "appropriately qualified". Under that interpretation, there is no standard violation. But a rule like that certainly seems like it should be made more explicit in the standard, as it treats one qualifier uniquely. There is a lot of vagueness in the standard, and I guess this is one case where you could either say the behavior of clang is noncompliant (according to what seems the most straightforward interpretation), or it is compliant (according to what seems the most generous interpretation). – Kyle Aug 03 '16 at 20:31
@Kyle: The `restrict` qualifier is unique in many ways. Given `int *restrict p;`, the declaration `int *q = p;` will cause `q` to have the same "restrictions" as `p`, while `int *restrict q=p;` would impose restrictions on the use of **`p`** as long as any pointers exist that are derived from the one stored into `q`, so it's not clear that `restrict` should be inherited in that context. – supercat Aug 03 '16 at 20:54
The rules associated with `restrict` do seem overly complicated when applied outside the usage pattern where it is most often employed and most useful (applied to parameters, and occasionally local variables, both of which have well-defined lifetimes). I wonder how much benefit it provides in other contexts, and whether it's worthwhile for compilers to bother with it in such contexts. – supercat Aug 03 '16 at 21:08
