I have this dummy piece of software made of 3 files:

test.h

int gv;
void set(int v);

test.c

#include "test.h"

void set(int x) {
    gv = x;
}

main.c

#include "test.h"
#include <assert.h>

int main() {
    set(1);
    assert(gv == 1);
}

The code compiles and runs fine with both MSVC 2019 and GCC 8, but with clang (clang-cl 11 supplied by Visual Studio 2019) it fails at link time, complaining about gv already defined:

1>------ Build started: Project: test, Configuration: Debug x64 ------
1>lld-link : error : undefined symbol: gv
1>>>> referenced by ...\test\main.c:6
1>>>>               x64\Debug\main.obj:(main)
1>>>> referenced by ...\test\test.c:4
1>>>>               x64\Debug\test.obj:(set)
1>Done building project "test.vcxproj" -- FAILED.

I understand that extern is the default storage-class specifier for objects defined at file scope, but if I explicitly add extern to int gv, linking breaks with every compiler (unless I add a definition for gv in a source file, of course).

There is something that I do not understand. What is happening?

Giovanni Cerretani
  • `complaining about gv already defined` Please post the exact, full compiler error messages, including all notices, line numbers and filenames. Please also post the compiler options and compiler versions you are using. You may want to research common symbols, like https://stackoverflow.com/a/15604964/9072753 . – KamilCuk Jun 24 '21 at 12:43
  • The difference is in the way the compilers scope `gv` (i.e. its visibility across files). If you declare `gv` to be `extern`, i.e. `extern int gv;`, and then define `gv` at file scope in main.c, i.e. `int gv = 0;`, it will likely work the same way in both compilers. – ryyker Jun 24 '21 at 12:43
  • According to @KamilCuk's link, it seems to be just a case of UB that works on some platforms because of compiler extensions. – Giovanni Cerretani Jun 24 '21 at 13:10

2 Answers

int gv; is a tentative definition of gv, per C 2018 6.9.2 2. When there is no regular definition in a translation unit (the file being compiled along with everything it includes), a tentative definition becomes a definition with an initializer of zero.
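As a small illustration of that rule, here is a minimal sketch of tentative definitions inside a single translation unit. The names counter and read_counter are hypothetical and do not come from the question.

```c
/* Hypothetical names, for illustration only. */

/* Two tentative definitions of the same object in ONE translation unit.
   C allows this; both lines refer to a single object. */
int counter;
int counter;

/* No definition with an initializer appears in this translation unit, so the
   compiler behaves as if `int counter = 0;` had been written at file scope. */
int read_counter(void) {
    return counter;
}
```

Within one translation unit this is harmless. The trouble in the question only starts when several translation units each end up with their own definition.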

Because this tentative definition is included in both test.c and main.c, there are tentative definitions in both test.c and main.c. When these are linked together, your program has two definitions.

The C standard does not define the behavior when there are two definitions of the same identifier with external linkage. (Having two definitions violates the “shall” requirement in C 2018 6.9 5, and the standard does not define the behavior when the requirement is violated.) For historic reasons, some compilers and linkers have treated tentative definitions as “common symbol” definitions that would be coalesced by the linker—having multiple tentative definitions of the same symbol would be resolved to a single definition. And some do not; some treat tentative definitions more as regular definitions, and the linker complains if there are multiple definitions. This is why you are seeing a difference between different compilers.

To resolve the issue, you can change int gv; in test.h to extern int gv;, which makes it a declaration that is not a definition (not even a tentative definition). Then you should put int gv; or int gv = 0; in test.c to provide one definition for the program. Another solution could be to use the -fcommon switch, per below.
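The extern-based fix can be sketched as follows. Both files are shown in one listing, with comments marking where each part would live; main.c stays exactly as in the question. This is one possible layout under those assumptions, not the only one.

```c
/* ---- test.h: declarations only, no storage is allocated here ---- */
extern int gv;
void set(int v);

/* ---- test.c (would begin with #include "test.h") ---- */

/* The single definition of gv for the whole program. */
int gv = 0;

void set(int x) {
    gv = x;
}
```

With this layout every translation unit that includes test.h sees only a declaration, so the linker finds exactly one definition of gv regardless of whether the compiler uses common symbols.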

The default behavior changed in GCC version 10 (and possibly Clang at some point; my Apple Clang 11 behaves differently from your report). With GCC and Clang, you can select the desired behavior with the command-line switch -fcommon (to treat tentative definitions as common symbols) or -fno-common (to cause a linker error if there are multiple tentative definitions).

Some additional information is here and here.

Eric Postpischil

I understand that extern is the default storage-class specifier for objects defined at file scope

That's true but the linkage breaks because of "redefinition" of the gv symbol, isn't it?

That's because both test.c and main.c contain int gv; after the preprocessor includes the header. As a result, both object files, test.o and main.o, contain the _gv symbol.

The most common solution is to have extern int gv; in the test.h header file (which tells the compiler that storage for gv is allocated somewhere else). Then, in exactly one C file (main.c, for example), define int gv; so that the storage for gv is actually allocated, but only once, inside the main.o object.


EDIT:

The page you linked yourself (storage-class specifier) contains the following statement:

Declarations with external linkage are commonly made available in header files so that all translation units that #include the file may refer to the same identifier that are defined elsewhere.

Alex Lop.