Consider the following C code:

#include <stdio.h>

int x = 5;
int y = x-x+10;
int z = x*0+5;
int main()
{
  printf("%d\n", y);
  printf("%d\n", z);
  return 0;
}

The ANSI C90 standard states "All the expressions in an initializer for an object that has static storage duration [...] shall be constant expressions" (6.5.7 constraint 3).

Clearly the initializers for `y` and `z` are not constant expressions. And indeed, trying to compile the above C code with `clang main.c` or `clang -ansi main.c` gives an error for this reason.

However, compiling with `gcc main.c` or even `gcc main.c -ansi -pedantic -Wextra -Wall` gives no errors at all, and the resulting program runs, printing 10 and 5.

On the other hand, trying something like the following:

#include <stdio.h>

int x = 5;
int main()
{
  int y[x-x+2];
  printf("%lu\n", sizeof(y));
  return 0;
}

gives a warning when compiled with `gcc -ansi -pedantic ...` or `clang -ansi -pedantic ...`.

So gcc randomly performs the mathematically correct cancellations in order to pretend that something is a constant expression, even when asked not to (`-ansi`). Why is this? Is this a bug?

By the way, my gcc version is 9.4.0 and my clang version is 10.0.0-4.

  • the result of `sizeof` is `size_t` which [must be printed using `%zu`](https://stackoverflow.com/q/940087/995714). Using the wrong format specifier invokes UB – phuclv Jun 11 '22 at 03:22
  • @phuclv When I try that I get "warning: ISO C90 does not support the ‘z’ gnu_printf length modifier". – WacfeldWang Jun 11 '22 at 03:26
  • why on earth do you use C90? The default version in gcc 9.x is gnu11 which is already 11 years old. And in such ancient versions you need to cast to `int` and print using `%d` – phuclv Jun 11 '22 at 03:31
  • Related/duplicate: [Why are const qualified variables accepted as initializers on gcc?](https://stackoverflow.com/questions/68252570/why-are-const-qualified-variables-accepted-as-initializers-on-gcc) and [Why "initializer element is not a constant" is... not working anymore?](https://stackoverflow.com/questions/54135942/why-initializer-element-is-not-a-constant-is-not-working-anymore) – user17732522 Jun 11 '22 at 03:33
  • @phuclv I am trying to write a C compiler, so for simplicity I am following the oldest standard with the fewest features. But thanks for the information on printing size_t, I will keep that in mind. – WacfeldWang Jun 11 '22 at 03:36
  • @user17732522 Both of those examples use the const qualifier, which my example does not have. Moreover `int x; int a=x` does give an error when compiled with the options above. The question here is why `x-x` and `x*0` are evaluated as 0 instead of an error. – WacfeldWang Jun 11 '22 at 03:42
  • @WacfeldWang ok, strike the "duplicate" part. – user17732522 Jun 11 '22 at 03:43
  • "mathematically correct cancellations in order to pretend that something is a constant expression" --> There is no pretending. `int y[x-x+2];` is fine if VLA supported. – chux - Reinstate Monica Jun 11 '22 at 03:52
  • @chux-ReinstateMonica VLA is not supported in C90. `int x = 5; int y[x];` gives a warning. But the compiler catches that for `int y[x-x+2]` anyway. The point was it allows the cancellation in a constant initializer, but correctly disallows it as an array length. – WacfeldWang Jun 11 '22 at 04:00

1 Answer

Without looking at the GCC source, there's one simple explanation: the code that checks whether the initializer is an acceptable constant expression operates on an internal representation after a pass that does arithmetic simplification, so it never "sees" your `x-x` or `x*0`.

This probably makes the implementation of the rule in GCC simpler: it just has to ask "is this node one that represents a constant?" rather than "is this tree one that I could evaluate as a constant later on?". It also facilitates the behavior that they probably want for the later standards (see below), and a special case for `-ansi` would probably add an undesirable amount of code complexity.

Is it a bug? Arguably. But it's one with such a small impact that it's not especially likely to get fixed. It works correctly for valid code, and it errors correctly on "really" invalid code that could actually cause a problem. It only deviates from the standard in a fairly harmless way, and only for C90 (since the C99 and later standards say "an implementation may accept other forms of constant expressions", which gives GCC latitude to allow an expression that mentions things not on the laundry list, as long as it has a constant value).

hobbs