
I have this extremely trivial piece of C code:

static int arr[];
int main(void) {
    *arr = 4;
    return 0;
}

I understand that the first declaration is illegal (I've declared a file-scope array with static storage duration and internal linkage but no specified size), but why does it result in a linker error?

/usr/bin/ld: /tmp/cch9lPwA.o: in function `main':
unit.c:(.text+0xd): undefined reference to `arr'
collect2: error: ld returned 1 exit status

Shouldn't the compiler be able to catch this before the linker?

It is also strange to me that, if I omit the static storage-class specifier, the compiler simply assumes the array has length 1 and produces nothing beyond a warning:

int arr[];
int main(void) {
    *arr = 4;
    return 0;
}

Results in:

unit.c:5:5: warning: array 'arr' assumed to have one element
 int arr[];

Why does omitting the storage class result in different behavior here and why does the first piece of code produce a linker error? Thanks.

bool3max
  • Nonsense code - nonsense messages – 0___________ Aug 28 '18 at 23:18
  • Interesting fact: It compiles with clang, but not gcc. – klutt Aug 28 '18 at 23:39
  • I'm surprised at clang's behavior given [6.9.2p3](https://port70.net/~nsz/c/c11/n1570.html#6.9.2p3) "If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type." – aschepler Aug 29 '18 at 00:30
  • The end of [6.9.2p2](https://port70.net/~nsz/c/c11/n1570.html#6.9.2p2) gives the behavior if a tentative definition is unresolved by the end of the translation unit---it is instantiated as "a file scope declaration of that identifier [...] with an initializer equal to 0"; The case of a missing array dimension is explicitly covered in an example at [6.9.2p5](https://port70.net/~nsz/c/c11/n1570.html#6.9.2p5) where the implicit initializer is interpreted to give it dimension 1 (as in the OP's second case). Perhaps `gcc` failed to actually render the final definition because it lacked the dim? – lockcmpxchg8b Aug 29 '18 at 00:57
  • The version with the static definition fails to compile with `gcc -pedantic`. – Jim Janney Aug 29 '18 at 06:06
  • If you supply the implicit interpretation explicitly at the end of the file, then gcc accepts it. I.e., add `static int arr[1] = {0};` as the last line in unit.c. So it seems like `gcc` really is just failing to resolve the tentative definition. File it as a bug and see what they say. – lockcmpxchg8b Aug 30 '18 at 02:47
  • @lockcmpxchg8b 6.9.2p3 "If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type". – n. m. could be an AI Sep 02 '18 at 10:16
  • @n.m. interesting point. And clearly from [6.2.2p3](https://port70.net/~nsz/c/c11/n1570.html#6.2.2p3) the OP's variable has internal linkage. I don't know how to rectify that with either p2, or the example at p5. Especially since they say "declared type" and not "composite type". – lockcmpxchg8b Sep 02 '18 at 13:42
  • @lockcmpxchg8b, the reconciliation is that the example in 6.9.2/5 applies to a declaration of an object with *external* linkage. Per 6.9.2/3, the corresponding declaration with internal linkage is non-conforming, so the situation described in 6.9.2/5 can never be achieved in that case. GCC rejects the latter when the `-pedantic` option is in effect. – John Bollinger Sep 04 '18 at 13:37
  • @lockcmpxchg8b - see more [discussion about an alternative (and, IMHO, much more harmonious) interpretation for section 6.9.2.3](https://stackoverflow.com/a/55109661/4025095). – Myst Mar 12 '19 at 19:17

2 Answers


Empty arrays static int arr[]; and zero-length arrays static int arr[0]; were gcc non-standard extensions.

The intention of these extensions was to act as a fix for the old "struct hack". Back in the C90 days, people wrote code such as this:

typedef struct
{
  header stuff;
  ...
  int data[1]; // the "struct hack"
} protocol;

where data would then be used as if it had variable size beyond the array depending on what's in the header part. Such code was buggy, wrote data to padding bytes and invoked array out-of-bounds undefined behavior in general.

gcc fixed this problem by adding empty/zero arrays as a compiler extension, making the code behave without bugs, although it was no longer portable.

The C standard committee recognized that this gcc feature was useful, so they added flexible array members to the C language in 1999. Since then, the gcc feature is to be regarded as obsolete, and the standard flexible array member is preferred.

As recognized by the linked gcc documentation:

Declaring zero-length arrays in other contexts, including as interior members of structure objects or as non-member objects, is discouraged.

And this is what your code does.

Note that gcc with no compiler options passed defaults to -std=gnu90 (gcc < 5.0) or -std=gnu11 (gcc >= 5.0). This gives you all the non-standard extensions enabled, so the program compiles but does not link.

If you want standard compliant behavior, you must compile as

gcc -std=c11 -pedantic-errors

The -std=c11 option selects ISO C instead of the GNU dialect, and -pedantic-errors turns the diagnostics required by ISO C into errors, so the linker error becomes a compiler error as expected. For an empty array as in your case, you get:

error: array size missing in 'arr'

And for a zero-length array you get:

error: ISO C forbids zero-size array 'arr' [-Wpedantic]


The reason int arr[] works is that it is a tentative definition of an array with external linkage (see C17 6.9.2). It is valid C and can be regarded as a forward declaration. It means that later in the same translation unit the compiler may find, for example, int arr[10];, which refers to the same variable and completes its type. This way, arr can be used in the code before the size is known. If no such declaration follows, the tentative definition behaves as if it had an initializer equal to 0, which gives the array one element; that is the warning you see. (I wouldn't recommend using this language feature, as it is a form of "spaghetti programming".)

When you use static, you force the variable to have internal linkage instead, and C17 6.9.2/3 does not allow a tentative definition with internal linkage to have an incomplete type. The size therefore has to be given in the declaration itself.

Lundin
  • It is possible that these non-standard extensions were removed from the default gnu11 somewhere along the way, in newer gcc versions. I don't know that. – Lundin Sep 04 '18 at 13:05
  • Thank you for the info and interesting bits of history! – bool3max Sep 04 '18 at 13:29
  • I know this is tagged gcc, but just to give another example, winapi (and thus Microsoft's C compiler) makes use of the "struct hack" often as well. It's not limited to the GNU world and unfortunately the APIs requiring the "struct hack" in Windows aren't going away any time soon; I'd imagine this impacts WINE as well. – jrh Sep 07 '18 at 14:30

Maybe one reason for this behavior is that the compiler treats the static variable as unused and optimizes it away, so no storage is emitted for it and the linker complains about the missing symbol.

If it is not static, it cannot simply be ignored, because other modules might reference it - so the linker can at least find that symbol arr.