I am still not sure which part you have trouble with, so I'll explain both.
extern "C" tells the compiler that the functions are C functions, i.e. that they have C linkage. The difference is mainly the way functions are named/identified internally in the two languages: C++ has the parameter types "mangled" into the function name, which is used internally to look up the correct function (remember that, to the linker, which is potentially C++ agnostic, overloaded functions are normal, differently named functions). The #ifdef/#endif pairs just skip the extern "C" (and the closing curly brace) if the compiler is a C compiler. That's necessary because extern "C" is, somewhat paradoxically, not part of the C language, and a C compiler would emit an error.
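As a minimal sketch of how this guard is commonly used in a header shared between C and C++, consider the following. The function add_ints is a hypothetical name chosen only for illustration, and the definition is placed in the same block just to keep the example self-contained.

```cpp
/* Hypothetical header content, usable from both C and C++. */
#ifdef __cplusplus
extern "C" {                  /* seen only by a C++ compiler */
#endif

int add_ints(int a, int b);   /* keeps the plain symbol name add_ints */

#ifdef __cplusplus
}                             /* closes the extern "C" block */
#endif

/* Definition; in a real project this would live in a separate .c or .cpp file.
   In C++ it inherits the C linkage from the declaration above. */
int add_ints(int a, int b) { return a + b; }
```

A C compiler sees only the declaration and the definition; a C++ compiler additionally sees the extern "C" wrapper, so both agree on the symbol name.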
I found an explanation of g++ and VC's name mangling in this paper (which is 8 years old, so details may have changed, but the general concept is well laid out).
I did a quick test with a cygwin gcc/g++. Consider the following file overloaded-funcs.c:
int f(float x){}
#ifdef __cplusplus
/////////////////////////////////////////////////
// these f overloads are visible only to C++ compilers.
// A C compiler would not accept two functions with the same
// name.
int f(int x){}
int f(void){}
void g(void){}
/////////////////////////////////////////////////
#endif
#ifdef __cplusplus
/////////////////////////////////////////////////
// this part is visible only to C++ compilers
extern "C" int h(float){}
/////////////////////////////////////////////////
#endif
#ifdef __cplusplus
///////////////////////////////////////////////////
// This part, including an opening brace,
// is only visible to C++ compilers
extern "C"
{ // everything in this block is treated as
// C declarations/definitions.
/////////////////////////////////////////////////
#endif
/////////////////////////////////////////////////
// this part is visible to both C and C++ compilers
int i(void){}
/////////////////////////////////////////////////
#ifdef __cplusplus
/////////////////////////////////////////////////
// This brace is again only visible to C++ compilers
// (which have seen the opening brace above as well).
// C compilers would be confused by a closing brace out of nowhere
// because they did not see the opening brace.
} // closes the extern "C" block
/////////////////////////////////////////////////
#endif
First I looked at the preprocessor output with gcc's -E option in order to understand what the actual compiler sees.
$ gcc -E overloaded-funcs.c
# 1 "overloaded-funcs.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "overloaded-funcs.c"
int f(float x){}
# 40 "overloaded-funcs.c"
int i(void){}
Lines starting with # are line markers and are ignored as far as the language goes; the remaining lines are the real "code". We can see that the preprocessor eliminates everything between each #ifdef __cplusplus and its corresponding #endif from the input to the actual compiler.
Then I actually compiled the source file and inspected the resulting object file with the gnu program "nm" (which, according to its man page, "lists symbols from object files").
$ gcc -c -o ccompiled.o overloaded-funcs.c && nm --defined-only ccompiled.o
0000000000000000 b .bss
0000000000000000 d .data
0000000000000000 p .pdata
0000000000000000 r .rdata$zzz
0000000000000000 t .text
0000000000000000 r .xdata
0000000000000000 T f
000000000000000b T i
The right column lists the names which this object file contains. We are interested in the functions which are defined in it. The first column is the symbol's address (offset), and the letter in the second column indicates the section where the symbol is found; function definitions are in the "text" section (T). We clearly see the function names f and i. These are the names by which the linker or loader would find them.
Now I use a C++ compiler for preprocessing. The compiler predefines the macro __cplusplus, which makes the lines visible that were #ifdef'ed out for a C compiler:
$ g++ -E overloaded-funcs.c
# 1 "overloaded-funcs.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "overloaded-funcs.c"
int f(float x){}
int f(int x){}
int f(void){}
void g(void){}
# 20 "overloaded-funcs.c"
extern "C" int h(float){}
# 30 "overloaded-funcs.c"
extern "C"
{
# 40 "overloaded-funcs.c"
int i(void){}
# 50 "overloaded-funcs.c"
}
Again, the lines not starting with # are the relevant code.
Then I compiled it as C++ and examined the names in the object file:
$ g++ -c -o cppcompiled.o overloaded-funcs.c && nm --defined-only cppcompiled.o
0000000000000000 b .bss
0000000000000000 d .data
0000000000000000 p .pdata
0000000000000000 r .rdata$zzz
0000000000000000 t .text
0000000000000000 r .xdata
0000000000000000 T _Z1ff
000000000000000b T _Z1fi
0000000000000014 T _Z1fv
000000000000001a T _Z1gv
0000000000000020 T h
000000000000002b T i
The function names just got a lot more complex: they are "mangled". We can still see the "actual" names f and g in the middle of each name, prefixed with _Z1 (in the scheme g++ uses, _Z marks a mangled name and 1 is the length of the identifier that follows). Clearly, the parameter types are just encoded as single letters v, i and f for void, int and float after the "actual" name part.
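To make the pattern in the nm output concrete, here is a toy sketch that rebuilds those names. It only covers free functions taking void, int or float, which is all the test file uses; real mangling also has to handle namespaces, classes, templates and much more. The name toy_mangle is hypothetical and not part of any compiler or library.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy mangler: "_Z" + length of the name + the name + one letter per parameter.
// Handles only the parameter types void, int and float; any other type name
// makes codes.at() throw std::out_of_range.
std::string toy_mangle(const std::string& name,
                       const std::vector<std::string>& params) {
    static const std::map<std::string, std::string> codes = {
        {"void", "v"}, {"int", "i"}, {"float", "f"}};
    std::string out = "_Z" + std::to_string(name.size()) + name;
    if (params.empty()) {
        return out + "v";  // an empty parameter list is encoded like (void)
    }
    for (const std::string& p : params) {
        out += codes.at(p);
    }
    return out;
}
```

For example, toy_mangle("f", {"float"}) yields _Z1ff and toy_mangle("g", {}) yields _Z1gv, matching the symbols listed above.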
Note that the return type is not part of the generated function name, which means that it will normally not be known to the linker (which often has no other information than the object file). That is consistent with the language rule that the return type is not considered for overload resolution (and it is not possible to have two identically named functions which differ only in the return type). As far as the linker goes, the overloaded versions of f are completely unrelated functions.
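One way to check what a mangled name does and does not encode is to demangle it. The sketch below assumes a g++ or clang toolchain, whose runtime provides abi::__cxa_demangle in <cxxabi.h>; this is not standard C++ and is not available with VC. The helper name demangle is hypothetical.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang runtime, not standard C++)
#include <cstdlib>    // std::free
#include <cstring>    // std::strncmp
#include <string>

// Returns the readable form of a mangled name, or the input unchanged
// if it does not look mangled or cannot be demangled.
std::string demangle(const char* mangled) {
    // Names without the "_Z" prefix (such as extern "C" symbols) are left alone.
    if (std::strncmp(mangled, "_Z", 2) != 0) {
        return mangled;
    }
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);  // the buffer was allocated with malloc
    return result;
}
```

With the symbols from the object file, demangle("_Z1fi") gives f(int) and demangle("_Z1fv") gives f(): the parameter list comes back, but there is no return type to recover.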
We can also see that the functions which were declared extern "C" have their old C names (h and i). C code in another translation unit could declare these functions and use them, and the linker would find that symbol (i or h), resolve the dependency and add the function's code to the executable. Such C code could not, however, link to a function f because, as far as the linker can see, no such function exists in our object file.
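If C code does need to reach the overloaded f functions, a common workaround is to add one extern "C" wrapper per overload. The following is only a sketch: the wrapper names f_float and f_int are hypothetical, and the function bodies are placeholders so that the calls return something observable.

```cpp
// Two C++ overloads; with g++ their symbols are mangled (_Z1ff and _Z1fi).
int f(float x) { return static_cast<int>(x) + 1; }  // placeholder body
int f(int x)   { return x + 2; }                    // placeholder body

// Hypothetical C-callable wrappers. Each gets a plain, unmangled symbol name,
// so each overload needs its own distinct wrapper name.
extern "C" int f_float(float x) { return f(x); }
extern "C" int f_int(int x)     { return f(x); }
```

A C translation unit could then declare int f_float(float); and int f_int(int); and link against them, while the overload selection happens inside the C++ object file.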
It is also clear that C++ is more type safe at link time. If a C function is declared with the wrong parameter types, the linker happily links against an implementation which expects totally different parameters; it cannot know what the function expects. In C++ the linker would simply not find an implementation with different parameters because their types are encoded in the mangled name.