Why does gcc allow extern declarations of type void? Is this an extension or standard C? Are there acceptable uses for this?

I am guessing it is an extension, but I don't find it mentioned at:
http://gcc.gnu.org/onlinedocs/gcc-4.3.6/gcc/C-Extensions.html

$ cat extern_void.c
extern void foo; /* ok in gcc 4.3, not ok in Visual Studio 2008 */
void* get_foo_ptr(void) { return &foo; }

$ gcc -c extern_void.c # no compile error

$ gcc --version | head -n 1
gcc (Debian 4.3.2-1.1) 4.3.2

Defining foo as type void is of course a compile error:

$ gcc -c -Dextern= extern_void.c
extern_void.c:1: error: storage size of ‘foo’ isn’t known

For comparison, Visual Studio 2008 gives an error on the extern declaration:

$ cl /c extern_void.c 
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

extern_void.c
extern_void.c(1) : error C2182: 'foo' : illegal use of type 'void'
David T. Pierson
  • What's interesting is that even with `-std=c89 -pedantic` gcc is cool with this. – Dave Apr 06 '12 at 04:56
  • As I understand it, defining a variable of type `void` is ill-formed. Applying `extern` to an incomplete type is not ill-formed, though; for [example](http://ideone.com/v7VkF), `extern` on an array of unknown size that is defined later. However, **§6.2.5.19** says *"The void type comprises an empty set of values; it is an incomplete object type that cannot be completed."* Given that, your code should be treated as a constraint violation. The fact that it compiles cleanly with `-pedantic` suggests that it is not an extension but either a gcc bug or an ambiguity in how msvc and gcc interpret the standard. – Alok Save Apr 06 '12 at 05:10

4 Answers

Strangely enough (or perhaps not so strangely...) it looks to me like gcc is correct to accept this.

If this was declared static instead of extern, then it would have internal linkage, and §6.9.2/3 would apply:

If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type.

If it didn't specify any storage class (extern, in this case), then §6.7/7 would apply:

If an identifier for an object is declared with no linkage, the type for the object shall be complete by the end of its declarator, or by the end of its init-declarator if it has an initializer; in the case of function arguments (including in prototypes), it is the adjusted type (see 6.7.5.3) that is required to be complete.

In either of these cases, void would not work, because (§6.2.5/19):

The void type [...] is an incomplete type that cannot be completed.

None of those applies, however. That seems to leave only the requirements of §6.7.2/2, which seems to allow a declaration of a name with type void:

At least one type specifier shall be given in the declaration specifiers in each declaration, and in the specifier-qualifier list in each struct declaration and type name. Each list of type specifiers shall be one of the following sets (delimited by commas, when there is more than one set on a line); the type specifiers may occur in any order, possibly intermixed with the other declaration specifiers.

  • void
  • char
  • signed char

[ ... more types elided]

I'm not sure that's really intentional -- I suspect the void is really intended for things like derived types (e.g., pointer to void) or the return type from a function, but I can't find anything that directly specifies that restriction.

Jerry Coffin
  • My reading of §6.9.2 is exactly the opposite of yours: "A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition." Doesn't that mean this is NOT a tentative definition and thus an incomplete type is NOT allowed? Comeau, IMO, correctly complains. – dirkgently Apr 06 '12 at 05:34
  • @dirkgently: This is definitely not a tentative definition, which is defined as (§6.9.2/2): "A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a *tentative definition*." I don't see anything that says an incomplete type isn't allowed just because it's not a tentative definition though. – Jerry Coffin Apr 06 '12 at 05:40
  • The disagreement between gcc and msvc seems to come down to the interpretation of ***§6.2.5.19***. If `void` is an incomplete type that can never be completed, then the question is whether `extern` may be applied to such a type, and I believe that leaves room for implementation interpretation. Either way, every compiler will produce an error; the only difference is whether it is emitted during compilation or linking. It is probably an edge case the implementers did not want to spend much effort on. – Alok Save Apr 06 '12 at 06:20
  • @JerryCoffin: For initialization to be valid, §6.7.9/3 requires: "The type of the entity to be initialized shall be an array of unknown size or a complete object type that is not a variable length array type." The latter condition doesn't hold in this case, which also explains why nothing defines how to initialize objects of type `void`. – dirkgently Apr 06 '12 at 07:23
  • Also, from §J.2 Undefined behavior, this bit: An identifier for an object is declared with no linkage and the type of the object is incomplete after its declarator, or after its init-declarator if it has an initializer. Related: Even if this was a tentative definition, this would be UB as per: `An identifier for an object with internal linkage and an incomplete type is declared with a tentative definition (6.9.2).` (from J.2 again). – dirkgently Apr 06 '12 at 07:37
  • @JerryCoffin: The code invokes UB even if the definition is allowable as per the language rules. See my answer. – dirkgently Apr 07 '12 at 12:07
  • @dirkgently: I agree that it has UB, but that doesn't change the fact that the standard says a conforming compiler should accept it. – Jerry Coffin Apr 07 '12 at 13:51
  • @JerryCoffin: As per §6.9/5, `foo` should be an object. `void` is not an object type. Also, see §6.7.2.3/4, footnote 129) `An incomplete type may only by used when the size of an object of that type is not needed. It is not needed, for example, when a typedef name is declared to be a specifier for a structure or union, or when a pointer to or a function returning a structure or union is being declared. (See incomplete types in 6.2.5.) The specification has to be complete before such a function is called or defined.` – dirkgently Apr 07 '12 at 14:26
  • Also, a `definition` of an object §6.7/5 `causes storage to be reserved for that object;` which cannot be met here for an incomplete object. – dirkgently Apr 07 '12 at 14:26
  • @dirkgently: Footnotes are not normative. This is a declaration, not a definition. – Jerry Coffin Apr 07 '12 at 14:37
  • @JerryCoffin: Yes, but footnotes do point to the spirit rather than the words of the language. From J.2: `An identifier with external linkage is used, but in the program there does not exist exactly one external definition for the identifier, [...].` I should've quoted this first and then written that such a requirement can't be met for `void` type. – dirkgently Apr 07 '12 at 14:44

I've found the only legitimate use for declaring

extern void foo;

is when foo is a link symbol (an external symbol defined by the linker) that denotes the address of an object of unspecified type.

This is actually useful because link symbols are often used to communicate the extent of memory; i.e. .text section start address, .text section length, etc.

As such, it is important for the code using these symbols to document their type by casting their addresses to an appropriate type. For instance, if foo is actually the length of a memory region:

uint32_t textLen;

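textLen = ( uint32_t )( uintptr_t )&foo; /* the value lives in the symbol's address; foo itself has type void and cannot be cast */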

Or, if foo is the start address of that same memory region:

uint8_t *textStart;

textStart = ( uint8_t * )&foo; /* take the address; a void expression cannot be converted */

The only alternate way to reference a link symbol in "C" that I know of is to declare it as an external array:

extern uint8_t foo[];

I actually prefer the void declaration, as it makes it clear that the linker defined symbol has no intrinsic "type."

AGS
user2824114
  • Do you have a code example where this method is used that you can link to? – Shafik Yaghmour Apr 10 '14 at 16:17
  • In the scenario where `foo` is an actual address, I think it should be declared as an address within the C code. The only time I would think `void` would be appropriate would be in cases where `foo` isn't really an address. IMHO, it would have been helpful for the Standard to specify that the meaning of `extern void` is Implementation Defined, with a proviso that implementations may document and impose whatever restrictions on the usage it sees fit (including outlawing it entirely). – supercat Jul 30 '15 at 18:17
  • I've encountered this use-case myself, and as soon as I saw the question I thought "I wonder if OP is dealing with linker symbols without realising it?" – AJM Dec 03 '21 at 14:11
GCC (and also the LLVM C frontend) is definitely buggy. Both Comeau and MS seem to report errors, though.

The OP's snippet has at least two definite UBs and one red-herring:

From N1570

[UB #1] Missing main in hosted environment:

J.2 Undefined behavior

[...] A program in a hosted environment does not define a function named main using one of the specified forms (5.1.2.2.1).

[UB #2] Even if we ignore the above there still remains the issue of taking the address of a void expression which is explicitly forbidden:

6.3.2.1 Lvalues, arrays, and function designators

1 An lvalue is an expression (with an object type other than void) that potentially designates an object;64)

and:

6.5.3.2 Address and indirection operators

Constraints

1 The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.

[Note: emphasis on lvalue mine] Also, there is a section in the standard specifically on void:

6.3.2.2 void

1 The (nonexistent) value of a void expression (an expression that has type void) shall not be used in any way, and implicit or explicit conversions (except to void) shall not be applied to such an expression.

A file-scope definition is a primary-expression (6.5), and so is taking the address of the object denoted by foo; the latter, incidentally, invokes UB. This is thus explicitly ruled out. What remains to be figured out is whether removing the extern specifier makes the above valid or not:

In our case, for foo, as per §6.2.2/5:

5 [...] If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.

i.e. even if we left out the extern we'd still land in the same problem.

dirkgently
  • It is legal to take the address of an lvalue whose value is not usable (e.g. a local variable which has never been written to). If `foo` is declared as an external identifier of type `void`, the expression `&foo` will yield a legitimate value of type `void*`. C provides no means by which a link-time resolvable definition for `foo` could be created, but if another language allows the definition of such an identifier I see no useful purpose to forbidding C compilers from allowing C code to access its address. – supercat Jul 30 '15 at 18:12
One limitation of C's linker-interaction semantics is that it provides no mechanism for allowing numeric link-time constants. In some projects, it may be necessary for static initializers to include numeric values which are not available at compile time but will be available at link time. On some platforms, this may be accomplished by defining somewhere (e.g. in an assembly-language file) a label whose address, if cast to int, would yield the numeric value of interest. An extern declaration can then be used within the C file to make the "address" of that thing available as a compile-time constant.

This approach is very much platform-specific (as would be anything using assembly language), but it makes possible some constructs that would be problematic otherwise. A somewhat nasty aspect of it is that if the label is defined in C as a type like unsigned char[], that will convey the impression that the address may be dereferenced or have arithmetic performed upon it. If a compiler will accept void foo;, then (int)&foo will convert the linker-assigned address for foo to an integer using the same pointer-to-integer semantics as would be applicable with any other `void*`.

I don't think I've ever used void for that purpose (I've always used extern unsigned char[]) but would think void would be cleaner if something defined it as being a legitimate extension (nothing in the C standard requires that any ability exist anywhere to create a linker symbol which can be used as anything other than one specific non-void type; on platforms where no means would exist to create a linker identifier which a C program could define as extern void, there would be no need for compilers to allow such syntax).

supercat