I guess the answer is "no", but from a compiler point of view, I don't understand why.
I made a very simple code which freaks out compiler diagnostics quite badly (both clang and gcc), but I would like to have confirmation that the code is not ill formatted before I report mis-diagnostics. I should point out that these are not compiler bugs, the output is correct in all cases, but I have doubts about the warnings.
Consider the following code:
#include <iostream>
int main(){
int b,a;
b = 3;
b == 3 ? a = 1 : b = 2;
b == 2 ? a = 2 : b = 1;
a = a;
std::cerr << a << std::endl;
}
The assignment of a
is a tautology, meaning that a
will be initialized after the two ternary statements, regardless of b
. GCC is perfectly happy with this code. Clang is slighly more clever and spot something silly (warning: explicitly assigning a variable of type 'int' to itself [-Wself-assign]
), but no big deal.
Now the same thing (semantically at least), but shorter syntax:
#include <iostream>
int main(){
int b,a = (b=3,
b == 3 ? a = 1 : b = 2,
b == 2 ? a = 2 : b = 1,
a);
std::cerr << a << std::endl;
}
Now the compilers give me completely different warnings. Clang doesn't report anything strange anymore (which is probably correct because of the parenthesis precedence). gcc is a bit more scary and says:
test.cpp: In function ‘int main()’:
test.cpp:7:15: warning: operation on ‘a’ may be undefined [-Wsequence-point]
But is that true? That sequence-point warning gives me a hint that coma separated statements are not handled in the same way in practice, but I don't know if they should or not.
And it gets weirder, changing the code to:
#include <iostream>
int main(){
int b,a = (b=3,
b == 3 ? a = 1 : b = 2,
b == 2 ? a = 2 : b = 1,
a+0); // <- i just changed this line
std::cerr << a << std::endl;
}
and then suddenly clang realized that there might be something fishy with a
:
test.cpp:7:14: warning: variable 'a' is uninitialized when used within its own initialization [-Wuninitialized]
a+0);
^
But there was no problem with a
before... For some reasons clang cannot spot the tautology in this case. Again, it might simply be because those are not full statements anymore.
The problems are:
- is this code valid and well defined (in all versions)?
- how is the list of comma separated statements handled? Should it be different from the first version of the code with explicit statements?
- is GCC right to report undefined behavior and sequence point issues? (in this case clang is missing some important diagnostics) I am aware that it says may, but still...
- is clang right to report that
a
might be uninitialized in the last case? (then it should have the same diagnostic for the previous case)
Edit and comments:
- I am getting several (rightful) comments that this code is anything but simple. This is true, but the point is that the compilers mis-diagnose when they encounter comma-separated statements in initializers. This is a bad thing. I made my code more complete to avoid the "have you tried this syntax..." comments. A much more realistic and human readable version of the problem could be written, which would exhibit wrong diagnostics, but I think this version shows more information and is more complete.
- in a compiler-torture test suite, this would be considered very understandable and readable, they do much much worse :) We need code like that to test and assess compilers. This would not look pretty in production code, but that is not the point here.