
I realize that, at first sight, my question might look like an obvious duplicate of one of the many questions here about the extern keyword, but I was unable to find any answer discussing the difference between extern "C" and extern "C" { }. On the contrary, I've found several people stating that the two constructs are equivalent, which I believe is a reasonable thing to expect. Unfortunately, empirical evidence shows that they are not equivalent.

Here is an example:

extern "C" { const int my_var1 = 21; }
extern "C" const int my_var2 = 42;
const int my_var3 = 121;

int main() { }

After compiling it with GCC 7 (g++ externC.cpp), I see a remarkable difference:

$ readelf -s ./a.out | grep my_var
    34: 0000000000000694     4 OBJECT  LOCAL  DEFAULT   15 _ZL7my_var1
    35: 000000000000069c     4 OBJECT  LOCAL  DEFAULT   15 _ZL7my_var3
    59: 0000000000000698     4 OBJECT  GLOBAL DEFAULT   15 my_var2

my_var1 and my_var3 both have local binding and a C++ mangled name, while my_var2 has global binding and actual C linkage. So, it looks like the extern "C" { } has been completely ignored, while the similar extern "C" without {} did have effect. That is super weird to me.

Things get even more interesting if I remove the const and just try to read the variables:

#include <cstdio>

extern "C" { int my_var1; }
extern "C" int my_var2;
int my_var3;

int main() {
    printf("%d, %d, %d\n", my_var1, my_var2, my_var3);
}

When I try to compile this 2nd program, the linker complains that it cannot find a definition for my_var2:

/tmp/ccfs9cis.o: In function `main':
externC.cpp:(.text+0xc): undefined reference to `my_var2'
collect2: error: ld returned 1 exit status

And that means that in this case two things happened:

  1. extern "C" { int my_var1; } instantiated in the translation unit a variable called my_var1 with C linkage.

  2. extern "C" int my_var2; declared an extern variable, where by extern I mean the traditional sense (like extern int x;), but with "C" linkage.

From my point of view, this is inconsistent with the behavior in the 1st case above, using const. In other words:

  • In the 1st program with const

    • extern "C" behaved like I expected extern "C" {} to behave [change the linkage]

    • extern "C" {} instead, did nothing

  • In the 2nd program, without const:

    • extern "C" {} behaved like I originally expected [change the linkage] BUT

    • extern "C" behaved like: extern "C" { extern int my_var2; } which is the way to declare an extern variable with C linkage (and unfortunately in C++ the keyword extern has been reused).

In conclusion, my question is: can anyone (maybe a compiler expert?) explain the theory behind extern "C" and extern "C" {} behaving so differently and in such an inconsistent (at least to me) way? In my experience with C++, once you understand a given concept in deep detail, even its tricky and complex corner cases start to look pretty reasonable and consistent. You just need to see the whole picture very clearly. I believe this is such a case.

Thanks a lot to everybody, in advance.


Edit[1]

[In the end it turned out that a similar question did exist here; I was just unable to find it. Sorry for that.]

Thanks to the answers so far, I now understand the subtle difference between extern "C" {} and extern "C", even if I'd still be curious to understand how we (the C++ developers/ISO committee) ended up with such a solution. It's kind of like making if (x) foo(); behave slightly differently from if (x) { foo(); }. Anyway, given this new knowledge, I have a few (hopefully) interesting observations to make:

Given that the transformation: extern "C" X => extern "C" { extern X } is always correct

It follows that:

  • The only way to define (instantiate) a const variable with C linkage in the current translation unit is to make it extern, even if we don't want that: the compiler decides whether we're defining the variable or just declaring it depending on whether we initialized it with a value. If we did, we're defining; otherwise we're just declaring.

  • The same logic (extern + const) applies to regular const variables with C++ linkage as well. A const variable with C linkage is no different except for the lack of name mangling.

  • From the statements above it follows that, since const implies internal linkage in C++ (but not in C!), the extern when used for a const does not mean extern, but just less internal or more extern than static.

In other words:

  • const int var = 23; creates a global variable with internal linkage, like static int var = 23; would except for being placed in a read-only segment.
  • extern const int var = 23; creates a global variable with regular (external) linkage. The extern neutralizes the implicit static. The result is the same as int var = 23 except that with const it will be placed in a read-only segment.
  • extern const int var; declares a proper extern variable in a foreign read-only segment.
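The three cases above can be put side by side in a minimal sketch. The variable names and the helper sum_defined_constants are hypothetical and exist only for illustration; the third variable is declared but deliberately never used, so no definition is needed at link time.

```cpp
// Namespace-scope const without extern: internal linkage (implicitly static).
const int internal_const = 23;

// extern plus an initializer: a definition with external linkage.
extern const int external_const = 23;

// extern without an initializer: a declaration only. The definition would
// have to come from another translation unit, so it is not used here.
extern const int declared_elsewhere;

// Hypothetical helper that reads the two variables defined in this file.
int sum_defined_constants() { return internal_const + external_const; }
```

Inspecting the resulting object file with readelf -s, as done earlier, should make the difference in symbol binding visible for the first two variables.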
vvaltchev
    `extern "C" BLAH` acts the same as `extern "C" { extern BLAH }` – Ben Voigt Dec 30 '18 at 18:36
    What Ben said. Or another way of putting it: `extern "C" { BLAH }` changes only the language linkage of (some of) the declarations inside, but the single declaration version `extern "C" BLAH` both changes the language linkage AND applies all the usual effects of the plain `extern` keyword. – aschepler Dec 30 '18 at 18:52

1 Answer


See here:

[extern "C" { ... }] Applies the language specification string-literal to all function types, function names with external linkage and variables with external linkage declared in declaration-seq.

Since const int my_var1 = 21; has internal linkage, wrapping extern "C" { } around it has no effect.
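As a sketch of how the block form could be made to take effect for a const variable: adding an explicit extern inside the braces gives the variable external linkage, so the "C" language linkage then applies to it. The helper sum_c_constants is hypothetical and only exists to use both variables.

```cpp
extern "C" {
    // Explicit extern: the const variable now has external linkage,
    // so the "C" language linkage of the block applies to it.
    extern const int my_var1 = 21;
}

// Single-declaration form: extern is implied, so this is equivalent.
extern "C" const int my_var2 = 42;

// Hypothetical helper, only here to read both variables.
int sum_c_constants() { return my_var1 + my_var2; }
```

With this change, both variables would be expected to show up as unmangled global symbols, the way my_var2 does in the question's readelf output.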

Also:

[extern "C" ...] Applies the language specification string-literal to a single declaration or definition.

and

A declaration directly contained in a language linkage specification is treated as if it contains the extern specifier for the purpose of determining the linkage of the declared name and whether it is a definition.

extern "C" int x; // a declaration and not a definition
// The above line is equivalent to extern "C" { extern int x; }

extern "C" { int x; } // a declaration and definition

This explains why for extern "C" const int my_var2 = 42; the variable has external linkage and an unmangled name. It also explains why you're seeing an undefined reference to my_var2 in your second code example.
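One possible way to make the second example link, sketched under the assumption that my_var2 is meant to be defined in this translation unit: an initializer turns the single-declaration form into a definition. The helper sum_vars is hypothetical and stands in for the printf call.

```cpp
extern "C" { int my_var1; }   // definition, zero-initialized
extern "C" int my_var2 = 0;   // the initializer makes this a definition too
int my_var3;                  // ordinary definition with C++ linkage

// Hypothetical helper that reads all three variables, like the printf call.
int sum_vars() { return my_var1 + my_var2 + my_var3; }
```

Some compilers may warn about an initialized declaration that is also marked extern; writing extern "C" { int my_var2; } instead avoids the initializer altogether.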

Kevin
    To quote the same source: "Every function type, every function name with external linkage, and **every variable name with external linkage**, has a property called language linkage. Language linkage encapsulates the set of requirements necessary to link with a module written in another programming language: calling convention, name mangling algorithm, etc." – Mike Lui Dec 30 '18 at 18:42