TL;DR
The short answer is that this particular construct is not allowed by the C11 standard -- or any other C standard going back to ANSI C (1989) -- but it is accepted as a compiler extension by many, though not all, modern C compilers. In the particular case of GCC, you need to not use -pedantic
(or -pedantic-errors
), which would cause a strict interpretation of the C standard. (Another workaround is described below.)
Note: Although you can spell -pedantic
with a W
, it is not like many -W
options, in that it does not only add warning messages: What it does is:
Issue all the warnings demanded by strict ISO C and ISO C++; reject all programs that use forbidden extensions, and some other programs that do not follow ISO C and ISO C++.
Workarounds
It does not appear to be possible to suppress this error using a GCC #pragma
, or at least the ones that I tried didn't have any effect. It is possible to suppress it for a single declaration using the __extension__
extension, but that seems to just be trading one incompatibility for another, since you would then need to find a way to remove (or macro expand away) __extension__
for other compilers.
Quoting the GCC manual:
-pedantic
and other options cause warnings for many GNU C extensions. You can prevent such warnings within one expression by writing __extension__
before the expression. __extension__
has no effect aside from this.
On the GCC versions I had handy, the following worked without warnings even with -pedantic
:
__extension__ static struct example_s { int i; } example[];
Probably your best bet it to just remove -pedantic
from the build options. I don't believe that -pedantic
is actually that useful; it's worth reading what the GCC manual has to say about it. In any event, it is doing its job here: the documented intent is to ban extensions, and that's what it is doing.
Language-lawyering
The language-lawyer justification for the above, taking into account some of the lengthy comment threads:
Definitions
An external declaration is a declaration at file scope, outside of any function definition. This shouldn't be confused with external linkage, which is a completely different usage of the word. The standard calls external declarations "external" precisely because they are outside any function definitions.
A translation unit is, thus, a sequence of external-declaration. See §6.9.
If an external declaration is also a definition -- that is, it is either a function declaration with a body or an object declaration with an initializer -- then it is referred to as an external definition.
A type is incomplete at a point in a program where there is not "sufficient information to determine the size of objects of that type" (§6.2.5p1), which includes "an array type of unknown size" (§6.2.5p22). (I'll return to this paragraph later.) (There are other ways for a type to be incomplete, but they're not relevant here.)
An external declaration of an object is a tentative definition (§6.9.2) if it is not a definition and is either marked static
or has no storage-class specifier. (In other words, extern
declarations are not tentative.)
What's interesting about tentative definitions is that they might become definitions. Multiple declarations can be combined with a single definition, and you can also have multiple declarations (in a translation unit) without any definition (in that translation unit) provided that the symbol has external linkage and that there is a definition in some other translation unit. But in the specific case where there is no definition and all declarations of a symbol are tentative, then the compiler will automatically insert a definition.
In short, if a symbol has any (external) declaration with an explicit extern
, it cannot qualify for automatic definition (since the explicitly-marked declaration is not tentative).
A brief detour: the importance of the linkage of the first declaration
Another curious feature: if the first declaration for an object is not explicitly marked static
, then no declaration for that object can be marked static
, because a declaration without a storage class is considered to have external linkage unless the identifier has already been declared to have internal linkage (§6.2.2p5), and an identifier cannot be declared to have internal linkage if it has already been declared to have external linkage (§6.2.2p7). However, if the first declaration for an object is explicitly static
, then subsequent declarations have no effect on its linkage. (§6.2.2p4).
What this all meant for early implementers
Suppose you're writing a compiler on an extremely resource-limited CPU (by modern standards), which was basically the case for all early compiler writers. When you see an external declaration for a symbol, you need to either give it an address within the current translation unit (for symbols with internal linkage) or you need to add it to the list of symbols you're going to let the linker handle (for symbols with external linkage). Since the linker will assign addresses to external symbols, you don't yet need to know what their size is. But for the symbols you're going to handle yourself, you will want to immediately give them an address (within the data segment) so that you can generate machine code referencing the data, and that means that you do need to know what size these objects are.
As noted above, you can tell whether a symbol is internally or externally linked when you first see a declaration for it, and it must be declared before it is used. So by the time you need to emit code using the symbol, you can know whether to emit code referencing a specific known offset within the data segment, or to emit a relocatable reference which will be filled in later by the linker.
But there's a small problem: What if the first declaration is incomplete? That's not a problem for externally linked symbols, but for internally-linked symbols it prevents you from allocating it to an address range since you don't know how big it is. And by the time you find out, you might have had to have emitted code using it. To avoid this problem, it's necessary that the first declaration of an internally-linked symbol be complete. In other words, there cannot be a tentative declaration of an incomplete symbol, which is what the standard says in §6.9.2p3:
If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type.
A bit of paleocybernetics
That's not a new requirement. It was present, with precisely the same wording, in §3.7.2 of C89. And the issue has come up several times over the years in the comp.lang.c
and comp.std.c
Usenix groups, without ever attracting a definitive explanation. The one I provided above is my best guess, combined with hints from the following discussions:
And it's also come up a few times on Stackoverflow:
A final doubt
Although no-one in any of the above debates has mentioned it, the actual wording of §6.2.5p22 is:
An array type of unknown size is an incomplete type. It is completed, for an identifier of that type, by specifying the size in a later declaration (with internal or external linkage).
That definitely seems to contradict §6.9.2p3, since it contemplates a "later declaration with interal linkage", which would not be allowed by the prohibition on tentative definitions with internal linkage and incomplete type. This wording is also contained word-for-word in C89 (in §3.1.2.5), so if this is an internal contradiction, it's been in the standard for 30 years, and I was unable to find a Defect Report mentioning it (although DR010 and DR016 hover around the edges).
Note:
For C89, I relied on this file saved in the Wayback Machine but I have no proof that it's correct. (There are other instances of this file in the archive, so there is some corroboration.) When the ISO actually released C90, the sections were renumbered. See this information bulletin, courtesy wikipedia.