
Can somebody help me understand the following statement? Why is there a "{" before #endif rather than an "#ifdef"? This seems illogical.

  1. If you have a function implemented in C and want to call it from C++.

1.1) If you can modify the C header files: typically the declarations in a C header file are surrounded with

#ifdef __cplusplus
  extern "C" { 
#endif 

   [... C declarations ...] 

#ifdef __cplusplus 
  } 
#endif 

to make it usable from C++.

Oliver Charlesworth
Jay

3 Answers


If __cplusplus has been defined, and therefore it is C++ code, then we want

extern "C" { 

and close it with

}

at the end. I hope I have decoded your message properly.
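To make the pairing concrete, here is a minimal single-file sketch of the pattern. The name add_ints is hypothetical, and the declaration that would normally live in a C header is written inline so the example stands on its own.

```cpp
// Declarations as they would appear in a C header (inlined here for brevity).
// add_ints is a hypothetical name used only for illustration.
#ifdef __cplusplus
extern "C" {                  // seen only by a C++ compiler
#endif

int add_ints(int a, int b);   // seen by both C and C++ compilers

#ifdef __cplusplus
}                             // closes the extern "C" block; C++ only
#endif

// In a real project this definition would live in a .c file compiled as C.
// It is defined here only to keep the sketch self-contained; it keeps the
// C linkage given by the declaration above.
int add_ints(int a, int b) { return a + b; }
```

A C compiler sees only the declaration in the middle. A C++ compiler sees the same declaration wrapped in extern "C" { ... }, which is why the second conditional block contains nothing but the closing brace.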

4pie0

I am still not sure which part you have trouble with, so I'll explain both.

extern "C" tells the compiler that the functions are C functions. The difference is mainly the way functions are named/identified internally in the two languages; C++ has the parameter types "mangled" into the function name which is used internally to look up the correct function (remember, overloaded functions are to the linker -- which is potentially C++ agnostic -- normal, differently named functions). The #ifdef/#endif pairs just skip the extern "C" (and closing curly brace) if the compiler is a C compiler. That's necessary because extern "C" is not part of the C language, somewhat paradoxically, and the compiler would emit an error.
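As a rough illustration of that difference, the sketch below puts two overloaded C++ functions next to one function with C linkage. The names kind and c_kind are made up for this example.

```cpp
#include <string>

// Ordinary C++ functions: the parameter types become part of the symbol
// name, so these overloads are distinct symbols for the linker.
std::string kind(int)   { return "int"; }
std::string kind(float) { return "float"; }

// C linkage: the symbol is just the plain name, so only one function
// called c_kind can exist. Adding a second extern "C" c_kind(float)
// would be rejected by the compiler.
extern "C" const char* c_kind(int) { return "int"; }
```

Overloading works for kind because each overload ends up with its own symbol name; c_kind has only the bare name, so there is nothing to tell two versions apart.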

I found an explanation of g++ and VC's name mangling in this paper (which is 8 years old, so details may have changed, but the general concept is well laid out).

I did a quick test with a cygwin gcc/g++. Consider the following file overloaded-funcs.c:

int f(float x){}

#ifdef __cplusplus
/////////////////////////////////////////////////
// these f overloads are  visible only to C++ compilers.
// A C compiler would not accept two functions with the same
// name.
int f(int x){}
int f(void){}
void g(void){}
/////////////////////////////////////////////////
#endif



#ifdef __cplusplus
/////////////////////////////////////////////////
// this part is visible only to C++ compilers
extern "C" int h(float){} 
/////////////////////////////////////////////////
#endif



#ifdef __cplusplus
/////////////////////////////////////////////////// 
// This part, including an opening brace,
// is only visible to C++ compilers
extern "C"
{   // everything in this block is treated as 
    // C declarations/definitions.
/////////////////////////////////////////////////
#endif



/////////////////////////////////////////////////
// this part is visible to both C and C++ compilers
int i(void){}
/////////////////////////////////////////////////


#ifdef __cplusplus
/////////////////////////////////////////////////
// This brace is again only visible to C++ compilers 
// (which have seen the opening brace above as well).
// C compilers would be confused by a closing brace out of nowhere 
// because they did not see the opening brace.
} // closes the extern "C" block
/////////////////////////////////////////////////
#endif

First I looked at the preprocessor output with gcc's -E option in order to understand what the actual compiler sees.

$ gcc -E overloaded-funcs.c
# 1 "overloaded-funcs.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "overloaded-funcs.c"

int f(float x){}
# 40 "overloaded-funcs.c"
int i(void){}

Lines starting with # are ignored as far as the language goes. The lines representing real "code" are nicely syntax-highlighted. We can see that the preprocessor eliminates everything between each #ifdef __cplusplus and its corresponding #endif from the input to the actual compiler.

Then I actually compiled the source file and inspected the resulting object file with the GNU program "nm" (which, according to its man page, "lists symbols from object files").

$ gcc -c -o ccompiled.o overloaded-funcs.c && nm --defined-only ccompiled.o
0000000000000000 b .bss
0000000000000000 d .data
0000000000000000 p .pdata
0000000000000000 r .rdata$zzz
0000000000000000 t .text
0000000000000000 r .xdata
0000000000000000 T f
000000000000000b T i

The right column lists the names which this object file contains. We are interested in the functions which are defined in it. The first column is the function's address (offset); the letter in the second column indicates the section where the symbol is found. Function definitions are in the "text" section (T). We clearly see the function names f and i. These are the names by which the linker or loader would find them.

Now I use a C++ compiler for preprocessing. The compiler predefines the macro __cplusplus, which makes visible the lines that were #ifdef'd out for a C compiler:

$ g++ -E overloaded-funcs.c
# 1 "overloaded-funcs.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "overloaded-funcs.c"

int f(float x){}

int f(int x){}
int f(void){}
void g(void){}
# 20 "overloaded-funcs.c"
extern "C" int h(float){}
# 30 "overloaded-funcs.c"
extern "C"
{
# 40 "overloaded-funcs.c"
int i(void){}
# 50 "overloaded-funcs.c"
}

Again the relevant lines of code are nicely highlighted.

Then I compiled it as C++ and examined the names in the object file:

$ g++ -c -o cppcompiled.o overloaded-funcs.c && nm --defined-only cppcompiled.o
0000000000000000 b .bss
0000000000000000 d .data
0000000000000000 p .pdata
0000000000000000 r .rdata$zzz
0000000000000000 t .text
0000000000000000 r .xdata
0000000000000000 T _Z1ff
000000000000000b T _Z1fi
0000000000000014 T _Z1fv
000000000000001a T _Z1gv
0000000000000020 T h
000000000000002b T i

The function names just got a lot more complex: they are "mangled". We can still see the "actual" names f and g in the middle of each symbol, after the prefix _Z and the length of the name (1). The parameter types are encoded as the single letters v, i and f, for void, int and float, behind the "actual" name part.
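The encoding can be checked by demangling the symbols from the nm output. The sketch below assumes a GCC or Clang toolchain, since abi::__cxa_demangle from <cxxabi.h> is an extension of those compilers and not standard C++; the helper name demangle is my own.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang extension)
#include <cstdlib>    // std::free
#include <string>

// Turn a mangled symbol name back into a readable C++ signature.
// Returns the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);   // the buffer was allocated with malloc
    return result;
}
```

With this helper, demangle("_Z1ff") yields f(float) and demangle("_Z1fv") yields f(). The command-line tool c++filt does the same job for whole nm listings.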

Note that the return type is not part of the generated function name, which means that it will normally not be known to the linker (which often has no other information than the object file). That is consistent with the language rule that the return type is not considered for overload resolution (and it is not possible to have two identically named functions which differ only in the return type). As far as the linker goes, the overloaded versions of f are completely unrelated functions.

We can also see that the functions which were declared extern "C" have their old C name (h and i). C code in another translation unit could declare these functions and use them, and the linker would find that symbol (i or h), resolve the dependency and add the function's code to the executable. Such C code could not, however, link to a function f because as far as the linker can see no such function exists in our object file.

It is also clear that C++ is more type safe. If a C function is declared with the wrong parameter types the linker links happily against an implementation which expects totally different parameters. It cannot know what the function expects. In C++ the linker would simply not find an implementation with different parameters because their types are encoded in the mangled name.

Peter - Reinstate Monica

extern tells the compiler that a variable is defined elsewhere, not in the block where it is used. Its value is assigned in a different place and can be changed from there as well. An extern variable is therefore simply a global variable that is defined once and declared wherever else it is needed; it can be accessed within any function or block. An ordinary global variable can also be referred to this way by placing the extern keyword before a declaration of it in any function or block. This signifies that we are not creating a new variable but accessing the existing global one. The main purpose of extern variables is to let them be shared between the different files that make up a large program.
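A small sketch of the declaration/definition split described here, squeezed into one file for brevity. The names shared_counter and bump are hypothetical; in a real program the definition would sit in a different source file from the code that uses it.

```cpp
// Declaration only: tells the compiler that shared_counter exists somewhere.
// No storage is reserved here.
extern int shared_counter;

// Code may use the variable once it has been declared.
int bump() { return ++shared_counter; }

// The single definition. In a multi-file program this line would appear
// in exactly one other source file.
int shared_counter = 0;
```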