
I'm reading the C Standard (N1570) and am confused about linkage. As specified in 6.2.2, Linkages of identifiers:

5 If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern. If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.

So I guessed that there is no difference between extern and no storage-class specifier in the declaration of identifiers of objects with file scope.

Let's consider the following example:

test.h:

#ifndef _TEST_H
#define _TEST_H

int a;

void increment();

#endif //_TEST_H

test.c:

#include "test.h"

void increment(){
    a += 2;
}

main.c:

#include <stdio.h>
#include "test.h"

int main(int argc, char const *argv[])
{
    increment();
    printf("a = %d\n", a);
}

Since a is declared to have external linkage (file scope, no storage class specifier) a = 2 is printed as expected.

So I replaced the declaration of a to have extern specifier and expected no difference (according to the 6.2.2#5 I cited above):

test.h:

#ifndef _TEST_H
#define _TEST_H

extern int a; // <---- Note extern here

void increment();

#endif //_TEST_H

But now the linker complains:

CMakeFiles/bin.dir/main.c.o: In function `main':
main.c:37: undefined reference to `a'
liblibtest.a(test.c.o): In function `increment':
test.c:4: undefined reference to `a'
test.c:4: undefined reference to `a'
collect2: error: ld returned 1 exit status

How does the Standard explain this behavior? Since identifiers have the same linkage in both cases I expected the linker behavior to be the same too.

Sourav Ghosh
Some Name
  • 1
    Do not omit the `extern` from variables declared in headers. If you do, every file that includes the header ends up defining the variable. See [How do I use `extern` to share variables between source files?](https://stackoverflow.com/questions/1433204) for the full story. – Jonathan Leffler Nov 24 '18 at 06:17

1 Answer


In the first case, int a; is a tentative definition.

In the second case, there is only a declaration of a; its definition is missing. That's why the linker complains.

Quoting C11, chapter §6.9.2

A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.

Sourav Ghosh
  • So is the tentative definition treated by the linker as a regular definition? The term _tentative_ is a little bit confusing in this context. – Some Name Nov 24 '18 at 05:28
  • @SomeName Why confusing? I feel the mention of _" and the translation unit contains no external definition for that identifier,"_ perfectly justifies the "tentative", don't you think? – Sourav Ghosh Nov 24 '18 at 05:30
  • 1
    It also hinges on the common extension noted in C11 [Annex J.5.11 Multiple external definitions](http://port70.net/~nsz/c/c11/n1570.html#J.5.11): _There may be more than one external definition for the identifier of an object, with or without the explicit use of the keyword extern; if the definitions disagree, or more than one is initialized, the behavior is undefined (6.9.2)._ This is strictly an extension to the standard, albeit one that is probably more frequently encountered than not. – Jonathan Leffler Nov 24 '18 at 06:21
  • 3
    The reason that `int a;` at the top of the file is tentative is that you're allowed to have `int a = 99;` further down the same source file. This is not tentative and completes the previous tentative definition. – Jonathan Leffler Nov 24 '18 at 06:24
  • @JonathanLeffler _or more than one is initialized_ But I expected that the linker would complain in this case, since there is more than one definition, no? – Some Name Nov 24 '18 at 10:20
  • @JonathanLeffler Could you please give an example of such UB? – Some Name Nov 24 '18 at 10:23
  • 2
    @SomeName: It depends on whether your linker allows the 'common extension' or not. Some linkers won't allow it at all; others allow it by default. There are options to C++ compilers to override the ODR (One Definition Rule) that the standard officially requires. The business with `int a;` in a header is not reliably portable because the standard says it does not have to work (in fact, the standard says it shouldn't work, but J.5.11 is a 'get out of jail free' card that lets you off the hook on many systems). – Jonathan Leffler Nov 24 '18 at 10:24
  • 2
    @SomeName: An example of UB is: `file1.c` has `int a;` at the top of the file, and `file2.c` has `double a;` at the top of the file. If those files are linked, the definitions disagree and the result is undefined behaviour — the standard doesn't mandate what will happen and anything is allowed. In practice, it usually means that the longer definition is used — that's the other part of the 'COMMON' heritage from Fortran (at least, Fortran 77 or earlier; I've not used the later versions), where COMMON has a specific peculiar meaning, and EQUIVALENCE is another interesting topic. – Jonathan Leffler Nov 24 '18 at 10:30