GCC and Clang behave differently here: Clang allows a static variable to be declared before it is defined, while GCC treats the declaration (or "tentative definition") as a definition.

I believe this is a bug in GCC, but complaining about it and opening a bug report won't solve the problem that I need the code to compile on GCC today (or yesterday)...

Here's a quick example:

#include <stdio.h>

static struct example_s { int i; } example[];

int main(void) {
  fprintf(stderr, "Number: %d\n", example[0].i);
  return 0;
}

static struct example_s example[] = {{1}, {2}, {3}};

With the Clang compiler, the program compiles and prints out:

Number: 1

However, with GCC the code won't compile and I get the following errors (ignore line numbers):

src/main2.c:26:36: error: array size missing in ‘example’
 static struct example_s { int i; } example[];
                                    ^~~~~~~
src/main2.c:33:25: error: conflicting types for ‘example’
 static struct example_s example[256] = {{1}, {2}, {3}};
                         ^~~~~~~
src/main2.c:26:36: note: previous declaration of ‘example’ was here
 static struct example_s { int i; } example[];

Is this a GCC bug or a Clang bug? Who knows. Maybe if you're on one of the teams you can decide.

As for me, the static declaration coming before the static definition should be (AFAIK) valid C (a "tentative definition", according to section 6.9.2 of the C11 standard)... so I'm assuming there's some extension in GCC that's messing things up.

Any way to add a pragma or another directive to make sure GCC treats the declaration as a declaration?

Myst
  • What version of gcc? I can't reproduce your error on godbolt. – AShelly Mar 11 '19 at 21:11
  • Curious, why code with `static type obj[]; ... functions... static type obj[3];` versus just `static type obj[3]; ...functions`? – chux - Reinstate Monica Mar 11 '19 at 23:15
  • @chux , as you can see in [this initial HPACK implementation draft](https://github.com/boazsegev/facil.io/blob/9f8953700c0f6d7969e36de3a826bce24dea797f/lib/facil/http/parsers/hpack.h), the actual data is very long (~800 lines) and would fit better at the end of the file rather than the beginning. – Myst Mar 11 '19 at 23:17
  • Hmmm, looks like in standard C a `static` "tentative definition" is not allowed. Looks useful though, as your code suggests. – chux - Reinstate Monica Mar 11 '19 at 23:23
  • @chux , sure a `static` "tentative definition" is allowed, section 6.9.2.2 reads "A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier **or** with the storage-class specifier **static**, constitutes a tentative definition." ... and it makes readability and maintenance easier when the content of the static variable (the data) is at the end of the file. Besides, the data shouldn't pollute the global namespace since it's used only a specific place. – Myst Mar 11 '19 at 23:29
  • [This question](https://stackoverflow.com/questions/52067353/why-is-this-statement-producing-a-linker-error-with-gcc/52169796), particularly @lundin's accepted answer, is surely relevant, if not quite a duplicate. – rici Mar 12 '19 at 04:06
  • @rici , indeed the discussion in that answer, and the question’s comments, appears relevant. However, that question asks about the linker and assumes the code shouldn’t work while this question asks about the code’s correct form (how to make it work). – Myst Mar 12 '19 at 08:09
  • @myst: yes, that's why I didn't mark it as a duplicate. Since the construct is not allowed by the standard, the only way to make it work is to accept compiler extensions, which most compilers implement. – rici Mar 12 '19 at 14:02
  • I get that error with gcc if and only if I compile with `-pedantic`. If you see the same thing, please add that information to your question. – Keith Thompson Mar 12 '19 at 20:31

3 Answers

3

The C11 draft has this in §6.9.2 External object definitions:

3 If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type.

I read this as saying that the first line in your code, which has an array of unspecified length, fails to be a proper tentative definition. Not sure what it becomes then, but that would kind of explain GCC's first message.

unwind
  • I think that *incomplete type* refers to forward declared classes and structs. But of course, it could also encompass array types without a specified length. – RmbRT Mar 11 '19 at 21:18
  • The first declaration might escape this “shall not” because, by the previous paragraph, the behavior is “exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.” Since, at the end of the translation unit, the composite type is complete, the behavior of the first declaration is exactly as if it had that type. – Eric Postpischil Mar 11 '19 at 22:13
  • @RmbRT - see 6.9.2 section 5 (example 2), where `int i[]` is given as a valid example that resolves (unless otherwise initialized) to an array with a single element initialized to 0. – Myst Mar 11 '19 at 22:20
  • @RmbRT: Per C 2018 6.2.5 1, an object type is incomplete if it is “lacking sufficient information to determine the size of objects of that type.” – Eric Postpischil Mar 11 '19 at 23:00
  • @RmbRT `[]` is *definitely* an incomplete type. – Antti Haapala -- Слава Україні Mar 12 '19 at 00:17
2

TL;DR

The short answer is that this particular construct is not allowed by the C11 standard -- or any other C standard going back to ANSI C (1989) -- but it is accepted as a compiler extension by many, though not all, modern C compilers. In the particular case of GCC, you need to not use -pedantic (or -pedantic-errors), which would cause a strict interpretation of the C standard. (Another workaround is described below.)

Note: Although you can spell -pedantic with a W (-Wpedantic), it is unlike most -W options in that it does not only add warning messages. What it does is:

Issue all the warnings demanded by strict ISO C and ISO C++; reject all programs that use forbidden extensions, and some other programs that do not follow ISO C and ISO C++.

Workarounds

It does not appear to be possible to suppress this error using a GCC #pragma, or at least the ones that I tried didn't have any effect. It is possible to suppress it for a single declaration using the __extension__ extension, but that seems to just be trading one incompatibility for another, since you would then need to find a way to remove (or macro expand away) __extension__ for other compilers.

Quoting the GCC manual:

-pedantic and other options cause warnings for many GNU C extensions. You can prevent such warnings within one expression by writing __extension__ before the expression. __extension__ has no effect aside from this.

On the GCC versions I had handy, the following worked without warnings even with -pedantic:

__extension__ static struct example_s { int i; } example[];
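To keep the keyword away from compilers that don't know it, it could be hidden behind a macro. The following is only a sketch: `PEDANTIC_EXT` and `example_first` are made-up names for illustration, and it assumes that any compiler defining `__GNUC__` (GCC, and GNU-compatible compilers such as Clang) understands `__extension__`. Other compilers still have to accept the incomplete static declaration on their own.

```c
/* PEDANTIC_EXT is a hypothetical macro name, not part of GCC or the standard. */
#if defined(__GNUC__)
#  define PEDANTIC_EXT __extension__ /* GCC and GNU-compatible compilers */
#else
#  define PEDANTIC_EXT               /* expands to nothing elsewhere */
#endif

/* Forward declaration with unknown size: accepted as an extension. */
PEDANTIC_EXT static struct example_s { int i; } example[];

/* Code in the middle of the file can already use the array. */
static int example_first(void) {
  return example[0].i;
}

/* The (potentially very long) data lives at the end of the file. */
static struct example_s example[] = {{1}, {2}, {3}};
```

Calling `example_first()` from `main` should return 1, the first element of the initializer at the bottom of the file.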

Probably your best bet is to just remove -pedantic from the build options. I don't believe that -pedantic is actually that useful; it's worth reading what the GCC manual has to say about it. In any event, it is doing its job here: the documented intent is to ban extensions, and that's what it is doing.

Language-lawyering

The language-lawyer justification for the above, taking into account some of the lengthy comment threads:

Definitions

  1. An external declaration is a declaration at file scope, outside of any function definition. This shouldn't be confused with external linkage, which is a completely different usage of the word. The standard calls external declarations "external" precisely because they are outside any function definitions.

    A translation unit is, thus, a sequence of external-declaration. See §6.9.

    If an external declaration is also a definition -- that is, it is either a function declaration with a body or an object declaration with an initializer -- then it is referred to as an external definition.

  2. A type is incomplete at a point in a program where there is not "sufficient information to determine the size of objects of that type" (§6.2.5p1), which includes "an array type of unknown size" (§6.2.5p22). (I'll return to this paragraph later.) (There are other ways for a type to be incomplete, but they're not relevant here.)

  3. An external declaration of an object is a tentative definition (§6.9.2) if it is not a definition and is either marked static or has no storage-class specifier. (In other words, extern declarations are not tentative.)

    What's interesting about tentative definitions is that they might become definitions. Multiple declarations can be combined with a single definition, and you can also have multiple declarations (in a translation unit) without any definition (in that translation unit) provided that the symbol has external linkage and that there is a definition in some other translation unit. But in the specific case where there is no definition and all declarations of a symbol are tentative, then the compiler will automatically insert a definition.

    In short, if a symbol has any (external) declaration with an explicit extern, it cannot qualify for automatic definition (since the explicitly-marked declaration is not tentative).
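A small sketch of the definitions above, using complete types only so that nothing in it depends on the disputed paragraph. All identifiers (`counter`, `hits`, `limit`, `snapshot_sum`) are invented for the illustration.

```c
int counter;              /* tentative definition (no storage-class specifier) */
int counter;              /* repeated tentative definition: still fine */
static int hits;          /* tentative definition with internal linkage */

extern int limit;         /* plain declaration: an explicit extern is never tentative */

int counter = 5;          /* real definition: the tentative ones are absorbed by it */
int limit = 10;           /* without this, `limit` would have no definition in this file */

/* `hits` never gets an initializer, so at the end of the translation unit
   the compiler behaves as if `static int hits = 0;` had been written. */
static int snapshot_sum(void) {
  return counter + hits + limit;
}
```

Here `snapshot_sum()` evaluates to 15: `counter` is 5 from its definition, `hits` is 0 from the automatically inserted definition, and `limit` is 10.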

A brief detour: the importance of the linkage of the first declaration

Another curious feature: if the first declaration for an object is not explicitly marked static, then no declaration for that object can be marked static, because a file-scope object declaration without a storage-class specifier has external linkage (§6.2.2p5), and an identifier cannot be declared to have internal linkage if it has already been declared to have external linkage (§6.2.2p7). However, if the first declaration for an object is explicitly static, then subsequent extern declarations have no effect on its linkage (§6.2.2p4).
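The linkage rules can be illustrated with a short sketch. The names `hidden`, `visible` and `linkage_demo` are invented for the example, and the rejected line is left in a comment so that the file still compiles.

```c
static int hidden = 7;   /* first declaration: internal linkage */
extern int hidden;       /* allowed: extern keeps the earlier internal linkage (6.2.2p4) */

int visible = 1;         /* first declaration: external linkage */
extern int visible;      /* allowed: still external */
/* static int visible;      not allowed: internal linkage after external (6.2.2p7) */

static int linkage_demo(void) {
  return hidden + visible;
}
```

Both identifiers resolve to a single object each, so `linkage_demo()` returns 8.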

What this all meant for early implementers

Suppose you're writing a compiler on an extremely resource-limited CPU (by modern standards), which was basically the case for all early compiler writers. When you see an external declaration for a symbol, you need to either give it an address within the current translation unit (for symbols with internal linkage) or you need to add it to the list of symbols you're going to let the linker handle (for symbols with external linkage). Since the linker will assign addresses to external symbols, you don't yet need to know what their size is. But for the symbols you're going to handle yourself, you will want to immediately give them an address (within the data segment) so that you can generate machine code referencing the data, and that means that you do need to know what size these objects are.

As noted above, you can tell whether a symbol is internally or externally linked when you first see a declaration for it, and it must be declared before it is used. So by the time you need to emit code using the symbol, you can know whether to emit code referencing a specific known offset within the data segment, or to emit a relocatable reference which will be filled in later by the linker.

But there's a small problem: What if the first declaration is incomplete? That's not a problem for externally linked symbols, but for internally-linked symbols it prevents you from allocating it to an address range since you don't know how big it is. And by the time you find out, you might already have had to emit code using it. To avoid this problem, it's necessary that the first declaration of an internally-linked symbol be complete. In other words, there cannot be a tentative definition of an internally-linked symbol whose type is incomplete, which is what the standard says in §6.9.2p3:

If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type.
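For contrast, here is a sketch of the externally linked case, which the standard does allow. The names `ext_table`, `ext_second` and `int_table` are made up; the internal-linkage counterpart is shown only in a comment, because it is exactly the form that GCC rejects under -pedantic.

```c
/* External linkage: an incomplete array type is fine in a tentative definition. */
int ext_table[];

/* The size is not needed to index the array; a single-pass compiler could
   emit a relocatable reference here and leave the address to the linker. */
static int ext_second(void) {
  return ext_table[1];
}

/* A later declaration completes the type (6.2.5p22) and defines the object. */
int ext_table[3] = {10, 20, 30};

/* The internal-linkage counterpart is what 6.9.2p3 rules out:

     static int int_table[];
     ...
     static int int_table[3] = {10, 20, 30};
*/
```

After the last declaration the type is complete, so `sizeof ext_table` is usable from that point on, and `ext_second()` returns 20.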

A bit of paleocybernetics

That's not a new requirement. It was present, with precisely the same wording, in §3.7.2 of C89. And the issue has come up several times over the years in the comp.lang.c and comp.std.c Usenet groups, without ever attracting a definitive explanation. The one I provided above is my best guess, combined with hints from the following discussions:

And it's also come up a few times on Stack Overflow:

A final doubt

Although no-one in any of the above debates has mentioned it, the actual wording of §6.2.5p22 is:

An array type of unknown size is an incomplete type. It is completed, for an identifier of that type, by specifying the size in a later declaration (with internal or external linkage).

That definitely seems to contradict §6.9.2p3, since it contemplates a "later declaration with internal linkage", which would not be allowed by the prohibition on tentative definitions with internal linkage and incomplete type. This wording is also contained word-for-word in C89 (in §3.1.2.5), so if this is an internal contradiction, it's been in the standard for 30 years, and I was unable to find a Defect Report mentioning it (although DR010 and DR016 hover around the edges).

Note:

For C89, I relied on this file saved in the Wayback Machine, but I have no proof that it's correct. (There are other instances of this file in the archive, so there is some corroboration.) When the ISO actually released C90, the sections were renumbered. See this information bulletin, courtesy of Wikipedia.

rici
  • I think you summed up most of the information and the comments so far. That's some amazing historical research, and I love the final doubt you mentioned. However, I don't buy the interpretation given nor the "single-pass" theory, since a `static int a[]` could easily be resolved in a similar way to an `int a[]` (assuming both are defined later on within the same translation unit). I believe that if the original rationale was written down then we would have seen that the interpretation everyone assumed was mistaken... (1/2, cont...) – Myst Mar 13 '19 at 11:38
  • (cont...2/2) I truly believe the correct interpretation of §6.9.2p3 should be as a description regarding the result of the C code rather than a constraint limiting the allowed code. – Myst Mar 13 '19 at 11:41
  • @Myst: You are, of course, entitled to believe whatever you want. But I'd like you to consider that in all of those debates on `comp.std.c`, which was a key place for C standard discussions and was read by many people actually on the standards committee, not one person -- including C experts, including committee members, including people who really believed the restriction was a mistake -- not one person suggested your interpretation. Reading those debates (which I did), there is a consensus that the language of 6.9.2p3 is exactly what its words imply... – rici Mar 13 '19 at 13:45
  • It's true that this interpretation wasn't mentioned anywhere in those discussions, which is part of the reason I both upvoted and accepted your answer regardless of not mentioning it. I guess placing it in the comments will have to be enough ;-) – Myst Mar 13 '19 at 13:48
  • As for the single pass compiler theory, it is just a theory but I'm pretty confident in it. You cannot resolve the size of an incomplete array *until the end of the translation unit*, and I think my argument shows why that is important for a `static` variable and not for a global. – rici Mar 13 '19 at 13:49
  • The single-pass theory is a good one... except, technically, it should be possible to resolve the size of a `static` incomplete array in a single pass compiler by reducing the array reference to a pointer and assigning the data space at the end of the pass (or when a new tentative declaration provides the details). Then again, I didn't get to author a compiler yet, so maybe I don't really understand this properly. – Myst Mar 13 '19 at 13:53
  • @myst: you could do that, but it would add overhead, and there was a lot of resistance to enforced overhead (even now). In those historical emails, I should maybe have quoted the first one; Doug Gwyn was a member of the ANSI C standardisation committee X3J11, so when he says "we had agreed to that", it's a pretty good indication that the committee did in fact explicitly agree to that. It's unfortunate that no-one took up his suggestion that the rationale be documented. – rici Mar 13 '19 at 15:32
  • As Doug said, “ok, I stand corrected”... though as one can read, Doug did so after the `gcc` behavior was reviewed and not because his initial interpretation of the standard was different than mine. Still, I’ll concede the point. – Myst Mar 13 '19 at 15:41
1

Edit: Apparently gcc was throwing an error due to the -Wpedantic flag, which (for some obscure reason) added errors in addition to warnings (see: godbolt.org and remove the flag to compile).

¯\_(ツ)_/¯ 

A possible (though not DRY) answer is to add the array length to the initial declaration (making a complete type with a tentative declaration where C11 is concerned)... i.e.:

#include <stdio.h>

static struct example_s { int i; } example[3];

int main(void) {
  fprintf(stderr, "Number: %d\n", example[0].i);
  return 0;
}

static struct example_s example[3] = {{1}, {2}, {3}};

This is super annoying, as it introduces maintenance issues, but it's a temporary solution that works.
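One way to soften the maintenance issue is to write the length in a single place. This is just a sketch, with `EXAMPLE_LEN` and `example_sum` as hypothetical names; it stays within standard C because both declarations have a complete type.

```c
#include <stddef.h>

#define EXAMPLE_LEN 3  /* hypothetical name; the only place the length is written */

/* Tentative definition with a complete type: allowed for internal linkage. */
static struct example_s { int i; } example[EXAMPLE_LEN];

static int example_sum(void) {
  int sum = 0;
  for (size_t n = 0; n < EXAMPLE_LEN; ++n)
    sum += example[n].i;
  return sum;
}

/* The data still lives at the end of the file. */
static struct example_s example[EXAMPLE_LEN] = {{1}, {2}, {3}};
```

The catch is that the initializer and `EXAMPLE_LEN` can still drift apart: too many initializers must be diagnosed by the compiler, but too few are silently padded with zeros.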

Myst
  • It is unlikely using `-pedantic` caused GCC to generally add “errors in addition to warnings.” Rather, per [its documentation](https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html), `-pedantic` both causes GCC to issue all warnings required by strict ISO C **and** “reject all programs that use forbidden extensions.” The GCC developers may consider accepting an incomplete tentative definition as an extension, so `-pedantic` causes this extension to be rejected, thus resulting in an error. – Eric Postpischil Mar 11 '19 at 23:11
  • @EricPostpischil - although technically possible, the complete command line included the `-std=c11` and it would compile only if the `-Wpedantic` was removed. As for "an incomplete tentative definition", I think this isn't the correct way to read the standard. See section 6.9.2.5, where an `int i[]` is considered a valid tentative definition. – Myst Mar 11 '19 at 23:15
  • `int i[];` is a valid tentative definition, but it does not have internal linkage, so 6.9.2 3 does not apply to it. `static … example[];` does have internal linkage and is subject to 6.9.2 3. – Eric Postpischil Mar 11 '19 at 23:39
  • @EricPostpischil - the point in 6.9.2.5 is that `int i[]` is **not** an incomplete type. The in-memory size of the type is known (defaults to a single element with a value of zero, unless later initialization is different). – Myst Mar 11 '19 at 23:42
  • `int i[];` does have incomplete type, and the example says so. It is just that 6.9.2 2 then causes it to be completed with a size of 1. But `static int i[];` cannot be completed in the same way, because 6.9.2 3 says it shall not have incomplete type. GCC’s interpretation of the standard appears to be that 6.9.2 3 applies before the completion specified by 6.9.2 2 is performed. – Eric Postpischil Mar 11 '19 at 23:57
  • @EricPostpischil , I guess we'll have to disagree (although GCC might side with your POV). As far as I can read this, 6.9.2.3 discusses the **result**, not a coding constraint. **The wording implies that the declaration is possible and dictates behavior when it occurs**... IMHO, it specifies that the **resulting** declaration will not be an incomplete type (such as would be allowed for external linkage), rather the **result** would be a complete type dictated by the implicit initializer - which is also why 6.9.2.3 comes **after** 6.9.2.2 and how the example fits in. – Myst Mar 12 '19 at 00:06
  • @EricPostpischil - also note that the Clang compiler accepts the code even with the `-Wpedantic` flag, indicating that I'm not alone in my interpretation of section 6.9.2.3. – Myst Mar 12 '19 at 00:10
  • @Myst: If 6.9.2 3 applies only to the resulting type after 6.9.2 2 is applied, what effect could 6.9.2 3 ever have? Can you show any code in which some `static … foo…;` is rejected because 6.9.2 3 says “shall not” for internal linkage, but the same `… foo…;` without `static` is accepted? – Eric Postpischil Mar 12 '19 at 00:19
  • @EricPostpischil - the effect it has is on the value of the `static` variable vs. the alternative (an `extern` variable). This is the context for section 6.9.2.3. If an `extern` variable is declared within a file scope and it is incomplete, then it will **not** get initialized and it will remain as an incomplete type. However, internal linkage types must be complete (since memory needs to be allocated within the scope) and 6.9.2.3 instructs the compiler to complete them. I think it's quite simple. – Myst Mar 12 '19 at 00:27
  • @AnttiHaapala - and now I found a bug in GCC's interpretation... ;-) – Myst Mar 12 '19 at 00:28
  • @Myst pretty sure it is Clang again. This was the one: https://godbolt.org/z/dE8U7R – Antti Haapala -- Слава Україні Mar 12 '19 at 00:28
  • @AnttiHaapala - your godbolt points at an MSVC compiler issue. The code compiles on Clang, GCC and Intel (icc). – Myst Mar 12 '19 at 00:32
  • Doesn't compile with [MSVC](https://godbolt.org/z/tg6o6Z) which is usually a good signal. They've read the standards like the devil the Bible... – Antti Haapala -- Слава Україні Mar 12 '19 at 00:32
  • @Myst check the pedantic output on Clang, GCC... The error in that other example is a certain constraint violation and diagnostics must be issued. MSVC refuses to compile because they don't ever implement any useful extensions. – Antti Haapala -- Слава Україні Mar 12 '19 at 00:32
  • @AnttiHaapala - compiles on icc, compiles on Clang (no warning with `-Wpedantic`)... errors / warnings only show on MSVC and GCC. It's 2:2. I really think I'm reading 6.9.2.3 correctly. It both makes sense as far as the language goes and as far as the wording implies. Holistically, it allows variable and function declarations to behave in a consistent way, whereas claiming that `static` arrays can't be declared and must be defined (unlike non-static arrays) seems weird. – Myst Mar 12 '19 at 00:37
  • @Myst: An identifier declared with `extern` remains incomplete because it is not a tentative definition. A declaration with no initializer and without either `extern` or `static` will be a tentative definition per 6.9.2 2, and it will be completed per 6.9.2 2. So, can you show a tentative definition of some `x` that is accepted while the same tentative definition prefixed with `static` would be rejected because of 6.9.2 3? – Eric Postpischil Mar 12 '19 at 00:38
  • @EricPostpischil - your question and proposed test exclude my interpretation, since you seem to assume that the result of 6.9.2.3 is a coding error while I assume the result is a type directive. I could probably point out that it's been a long standing practice to have missing identifiers (no `static` and no `extern`) result in `weak` external symbols (J.5.11)... see also 6.2.2.5: "If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external", so a missing `static` is practically the same as `extern` (except as I noted). – Myst Mar 12 '19 at 00:49
  • What is a “type directive”? – Eric Postpischil Mar 12 '19 at 01:05
  • @EricPostpischil - "type directive", or "compiler directive regarding resulting object/type" or me really not knowing any good and official names (it's really late here)... whatever we call it, it's a paragraph (or directive) aimed at defining the resulting machine code / object, not a paragraph aimed at defining a language constraint (a coding error). – Myst Mar 12 '19 at 01:09