
If I have a project with the following 3 files in the same directory:

mylib.h:

int some_global;
void set_some_global(int value);

mylib.c:

#include "mylib.h"
void set_some_global(int value)
{
    some_global = value;
}

main.c:

#include <stdio.h>
#include "mylib.h"

int main()
{
    set_some_global(42);
    printf("Some global: %d\n", some_global);
    return 0;
}

and I compile with

gcc main.c mylib.c -o prog -Wall -Wpedantic

I get no errors or warnings, and the prog program prints 42 to the console.

When I first tried this, I expected there to be a "multiple definition" error or some kind of warning since some_global is not declared extern in the header file. Upon researching this issue, I discovered that in C the extern is implicit on variable declarations outside of functions (and also that the opposite is true for C++, which can be demonstrated by using g++ instead of gcc in the compilation line above). Also, if I change the line in mylib.h from a declaration to a definition (e.g. int some_global = 1;), I do get the "multiple definition" error that I expected (this is nothing shocking).

My main question is: where is the variable being defined? It appears to be implicitly defined somewhere, but at what point does either the compiler or linker realize it needs that variable defined and does so?

Also, why is it that if I explicitly declare the variable as extern in the mylib.h file, I get "undefined reference" errors unless I explicitly define the variable in one and only one *.c file? Given the reason why I thought the code above works (that extern is implicit), I would expect that explicitly declaring extern wouldn't make a difference. Why is there a difference in behavior?


Follow up

After the answer below corrected me that the code in mylib.h is a "tentative definition" rather than a declaration, I discovered this related answer with more details on such matters:

https://stackoverflow.com/a/3095957/7007605

Billy
  • The variable should be defined in *both* translation units. I find it a little weird that you don't get a linker error from that. – Some programmer dude Oct 29 '21 at 18:41
  • @Someprogrammerdude: I agree - it's weird there's no linker error. Hence my question :). – Billy Oct 29 '21 at 18:42
  • And when you use `extern` you explicitly *declare* the variable, and never define it anywhere. – Some programmer dude Oct 29 '21 at 18:43
  • @Neil I know that static-storage-duration variables are always initialized, and extern (some say "global") variables always have static storage duration. I'm trying to find out *where* the variable is defined (I'm fairly certain it's only being *declared* in both translation units). – Billy Oct 29 '21 at 18:44
  • I can't replicate the non-error. If I copy-paste the code from all three files, and run the exact command you use, I get the expected multiple-definition error. Is the code you show truly copy-pasted (without any modifications) from your own [mre]? – Some programmer dude Oct 29 '21 at 18:46
  • @Someprogrammerdude I agree again - with `extern` I know that it's being declared in both translation units, but never defined, and hence the "undefined reference" error. *If* when excluding the `extern` keyword I'm defining rather than declaring the variable, I again don't get why there isn't a linker issue. In either case, both declaration units are making the same declaration or the same definition - I see no way for one to be a definition and the other to be a declaration, which is the only way I can think of it being legal. Hence my confusion. – Billy Oct 29 '21 at 18:48
  • @Someprogrammerdude this is exact copy paste. I even just copied back from this question into a new blank directory just to be sure, and I don't get an error. I'm using gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04) – Billy Oct 29 '21 at 18:53
  • A definition is tentative only within a single translation unit. It stays tentative until either the compiler finds another definition in the same translation unit, in which case the first (tentative) definition becomes a declaration, or the compiler has finished parsing the translation unit, in which case the tentative definition becomes a concrete definition. – Some programmer dude Oct 30 '21 at 07:48

2 Answers

  1. Your code compiles and links without error only because you use gcc which was compiled with -fcommon command line option: "The -fcommon places uninitialized global variables in a common block. This allows the linker to resolve all tentative definitions of the same variable in different compilation units to the same object, or to a non-tentative definition. (...) It is mainly useful to enable legacy code to link without errors." This was the default prior to version 10, and even now many toolchains are still built with this option enabled.

  2. Never define data in header files. Place only extern declarations of the variables in header files, and define each variable in exactly one .c file.

It should be:

mylib.h:

extern int some_global;
void set_some_global(int value);

mylib.c:

#include "mylib.h"

int some_global;

void set_some_global(int value)
{
    some_global = value;
}

main.c:

#include <stdio.h>
#include "mylib.h"

int main()
{
    set_some_global(42);
    printf("Some global: %d\n", some_global);
    return 0;
}
0___________
  • Re “you use `gcc` which was compiled with `-fcommon`”: No, it was not. The question shows the `gcc` command line, and it does not contain `-fcommon`. This implies an old version of GCC was used, as `-fcommon` was the default prior to version 10. – Eric Postpischil Oct 29 '21 at 19:03
  • Thanks! I was unaware of the concept of "tentative definitions". This whole thing came about from me reviewing legacy code that followed the pattern in my question. I'm used to the convention you display in your answer and was about to change the legacy code to follow it, but I wanted to know how that legacy code had been working in the first place. – Billy Oct 29 '21 at 19:03
  • @EricPostpischil No. I meant that gcc itself was built with this option (set in the autoconf files). You can build any version of GCC with default command-line options to your liking, so you can have version 11.x built with `-fcommon` as a default option. Most embedded target ports are built a bit differently from the "main" x86 Linux version. – 0___________ Oct 29 '21 at 19:05
  • @0___________: Less likely than using an old version. – Eric Postpischil Oct 29 '21 at 19:09
  • @EricPostpischil 100% likely. Older version by default, never not by default. But this option is used for sure. BTW I build my ARM toolchains myself – 0___________ Oct 29 '21 at 19:10
  • Sentence no verb no communicate. “Older version by default, never not by default” is incomprehensible. – Eric Postpischil Oct 29 '21 at 19:10

int some_global; is a tentative definition. In GCC before version 10, GCC produced an object file treating this as a common symbol. (This behavior is still selectable by a switch, -fcommon.) The linker coalesces multiple definitions of a common symbol to a single definition.

Eric Postpischil