I think the documentation you've read has misled you by its laxity:
class MyClass;
doesn't exactly mean there is such a class, because the
only way to make a class exist is to define it, and a declaration is
not a definition. The declaration would be better read as: Assume there is such a class.
Nor does it mean that the full definition of the class will, or will not, come later. Its
full definition might need to come later for successful compilation. Or
not. And if the full class definition does need to come later,
it is needed for successful compilation: that is, at compile time, not link time.
The undefined reference linkage error that you are able to provoke
by commenting out Second second; in main.cpp is simply a
plain old undefined reference error such as you'll always get
by trying to link a program in which a variable declared extern
is referenced somewhere and defined nowhere. It has no essential
connection with the extern variable being of class type - rather
than, say, int - or with the business of forward class declaration.
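To see that this kind of error has nothing to do with classes, here is a minimal sketch that uses a plain int. The names counter, read_counter and check_counter are made up for illustration; they are not taken from your code.

```cpp
#include <cassert>

// Declaration only: "assume an int called counter exists somewhere".
// No storage is reserved and no symbol definition is emitted for it.
extern int counter;

// This reference compiles with nothing more than the declaration in sight.
int read_counter() {
    return counter;
}

// The definition that the linker needs. If this line is deleted, the file
// still compiles, but linking a program that calls read_counter() fails
// with an undefined reference to `counter`.
int counter = 42;

// Hypothetical self-check, meant to be called from main().
void check_counter() {
    assert(read_counter() == 42);
}
```

Swapping int for a class type changes nothing about this: the linker only cares whether the symbol has a definition.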
Forward declaration of classes is only ever necessary to preempt
a deadlock when the compiler attempts to parse the definitions
of two classes that are interdependent and is unable to complete
either class definition before it completes the other one.
An elementary example: I naively write two classes, first and second, of which
each has a method that uses an object of the other class and calls
one of its methods:
first.h
#ifndef FIRST_H
#define FIRST_H
#include <string>
#include <iostream>
#include "second.h"
struct first {
std::string get_type() const {
return "First";
}
void use_a_second(second const & second) const {
std::cout << second.get_type() << std::endl;
}
};
#endif
second.h
#ifndef SECOND_H
#define SECOND_H
#include <string>
#include <iostream>
#include "first.h"
struct second {
std::string get_type() const {
return "Second";
}
void use_a_first(first const & first) const {
std::cout << first.get_type() << std::endl;
}
};
#endif
main.cpp
#include "first.h"
#include "second.h"
int main()
{
first f;
second s;
f.use_a_second(s);
s.use_a_first(f);
return 0;
}
Try to compile main.cpp:
$ g++ -c -o main.o -Wall -Wextra -pedantic main.cpp
In file included from first.h:6:0,
from main.cpp:1:
second.h:13:19: error: ‘first’ has not been declared
void use_a_first(first const & first) const {
^~~~~
second.h: In member function ‘void second::use_a_first(const int&) const’:
second.h:14:22: error: request for member ‘get_type’ in ‘first’, which is of non-class type ‘const int’
std::cout << first.get_type() << std::endl;
^~~~~~~~
main.cpp: In function ‘int main()’:
main.cpp:9:8: error: expected unqualified-id before ‘.’ token
second.use_a_first(first);
The compiler is stymied, because first.h includes second.h, and
vice versa, so it can't get the definition of first before it
gets the definition of second, which requires the definition of first...
and vice versa.
A forward declaration of each class before the definition of the
other one, and a corresponding refactoring of each class into
a definition and an implementation, gets us out of this deadly embrace:
first.h (fixed)
#ifndef FIRST_H
#define FIRST_H
#include <string>
struct second; // Declaration
struct first{
std::string get_type() const {
return "first";
}
void use_a_second(second const & second) const;
};
#endif
second.h (fixed)
#ifndef SECOND_H
#define SECOND_H
#include <string>
struct first; //Declaration
struct second{
std::string get_type() const {
return "second";
}
void use_a_first(first const & first) const;
};
#endif
first.cpp (new)
#include <iostream>
#include "first.h"
#include "second.h"
void first::use_a_second(second const & second) const {
std::cout << second.get_type() << std::endl;
}
second.cpp (new)
#include <iostream>
#include "first.h"
#include "second.h"
void second::use_a_first(first const & first) const {
std::cout << first.get_type() << std::endl;
}
Compile:
$ g++ -c -o first.o -Wall -Wextra -pedantic first.cpp
$ g++ -c -o second.o -Wall -Wextra -pedantic second.cpp
$ g++ -c -o main.o -Wall -Wextra -pedantic main.cpp
Link:
$ g++ -o prog main.o first.o second.o
Run:
$ ./prog
second
first
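The same pattern can be condensed into a single translation unit, which may make the ordering easier to see. This is only a sketch: the method name type_of and the helper check_mutual_use are invented here, and the methods return the string instead of printing it so that the result is easy to check.

```cpp
#include <cassert>
#include <string>

struct second;  // forward declaration: enough to name second in a signature

struct first {
    std::string get_type() const { return "first"; }
    // Declared only: the body needs the complete definition of second.
    std::string type_of(second const & s) const;
};

struct second {
    std::string get_type() const { return "second"; }
    // first is already complete here, so this body can stay in-class.
    std::string type_of(first const & f) const { return f.get_type(); }
};

// Both classes are complete by now, so the deferred body compiles.
inline std::string first::type_of(second const & s) const {
    return s.get_type();
}

// Hypothetical self-check, meant to be called from main().
void check_mutual_use() {
    first f;
    second s;
    assert(f.type_of(s) == "second");
    assert(s.type_of(f) == "first");
}
```

Only one forward declaration is needed in this layout, because whichever class is defined second already sees the complete definition of the other.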
This is the only scenario in which forward class declaration is
needed. It can be used in wider circumstances: see "When can I use a forward declaration?". The need is only ever a need
for successful compilation, not linkage. Linkage can't be attempted till
compilation succeeds.
The documentation snippet is also misleadingly imprecise in the use of the word definition. The
definition of a class means one thing in the context of compilation and that's
what it should mean in the interest of clarity. It means something else, loosely,
in the context of linkage and it shouldn't mean that in the interest of clarity.
In the context of linkage, we'd better only talk about the implementation of
a class - and even that is a notion that begs for qualification.
As far as the compiler is concerned a class is defined if it gets from
the start to the end of:
class foo ... {
...
};
without error, and then the class definition is the contents of that span. A complete definition
does not mean, of course, that a class has a complete implementation. It
only has that if, in addition to a complete definition, all the methods and
static members that are declared in its definition are also themselves defined somewhere: either
in-line within the class definition, out-of-line in a containing translation
unit, or in other translation units (possibly compiled in external
libraries) with which the compiled containing translation unit gets linked.
If any of those member definitions that the program actually refers to is not
provided in one of those ways come link time, an unresolved reference linkage
error will result. That is a deficit of the class implementation.
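A small sketch of the difference between a complete definition and a complete implementation. The class gadget, its members and check_gadget are hypothetical names chosen for this illustration.

```cpp
#include <cassert>

// The class definition is complete at the closing brace, whether or not
// the declared members are ever defined.
struct gadget {
    static int count;                      // declared; needs a definition elsewhere
    int id() const;                        // declared; defined out-of-line below
    int twice() const { return 2 * id_; }  // defined in-line
    int id_ = 7;
};

// Out-of-line definitions: these are what the linker eventually resolves.
// Remove either one and a program that uses that member still compiles
// but fails to link.
int gadget::count = 0;
int gadget::id() const { return id_; }

// Hypothetical self-check, meant to be called from main().
void check_gadget() {
    gadget g;
    assert(g.id() == 7);
    assert(g.twice() == 14);
    assert(gadget::count == 0);
}
```

The two out-of-line definitions could equally live in another source file or a library; the compiler would not notice the difference, only the linker would.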
The linker's idea of definition is different from the C++
compiler's and more elementary. From the linker's point of view,
a C++ class doesn't actually exist. For the linker, the class implementation is boiled down, by the compiler,
to a bunch of symbols and symbol definitions not essentially different from what it gets
from any language compiler, whether or not the language deals in classes at all.
What matters to the linker, for success, is that all the symbols that are referenced in the output binary
have definitions either in the same binary or in dynamic libraries requested
in the linkage. A symbol (broadly) can identify some executable code or some data.
For a code symbol, definition means implementation to the linker: the definition is the represented code, if any.
For a data symbol, definition means value to the linker: it means the represented data, if any.
So when the snippet says:
.. and its full definition will be "coming later" (either in the current file, at compile time, or from some other file at link time)
this needs to be picked apart.
The full definition of class foo must come later in the compilation of
a translation unit, before type foo is required as the type of anything else,
specifically, the type of a base class, or function/method argument, or object [1].
If this requirement is not satisfied a compile error will result:
- A class cannot be fully defined if any base class is not fully defined.
- A function or method cannot be fully defined if it has an argument of a type
that is not fully defined.
- An object cannot exist of any type that is not fully defined.
If foo is never required later to be the type of a base class, argument or object,
then the definition of class foo need never follow the declaration.
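The rules above can be sketched as follows. Only the class name foo comes from the discussion; make_foo, weigh, keep, weight and check_foo are assumptions made up for the example.

```cpp
#include <cassert>

struct foo;         // foo is an incomplete type from here...

foo make_foo();     // OK: a function may be declared to return foo by value
int weigh(foo f);   // OK: ...and declared to take foo by value
foo * keep(foo * p) { return p; }  // OK even as a definition: only a pointer is handled

struct foo {        // ...until here, where it becomes complete
    int weight = 3;
};

// Definitions that create or copy a foo must follow the complete definition.
foo make_foo() { return foo{}; }
int weigh(foo f) { return f.weight; }

// Hypothetical self-check, meant to be called from main().
void check_foo() {
    foo f = make_foo();
    assert(weigh(f) == 3);
    assert(keep(&f) == &f);
}
```

Moving the definition of make_foo or weigh above the definition of struct foo should turn this into a compile error about an incomplete type, never a link error.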
The full implementation of class foo may or may not be required, or
provided, by the linkage. Since the linker doesn't know about classes,
it doesn't know any distinction between a full implementation of a class and an incomplete one.
You can change class first, above, by adding a method that has no implementation:
struct first{
std::string get_type() const {
return "first";
}
void use_a_second(second const & second) const;
void unused();
};
and the program will compile, link and run just the same. Since the
compiler emits no definition of void first::unused(), and since
the program does not attempt to invoke void first::unused() on
any object of type first, or to use its address, no mention of
void first::unused() appears in the linkage at all. If
we change main.cpp to:
#include "first.h"
#include "second.h"
int main()
{
first f;
second s;
f.use_a_second(s);
s.use_a_first(f);
f.unused();
return 0;
}
then the linker will find a call to void first::unused() in main.o
and of course give an unresolved reference error. But this just
means that the linkage fails to provide an implementation that the
program needs. It doesn't mean that the class definition of
first is incomplete. If it were, compilation of main.cpp would have
failed, and no linkage would have been attempted.
Takeaway:
- Forward class declaration can avert a compile-time deadlock of
mutually dependent class definitions, with consequential refactoring.
- A forward class declaration can't avert an unresolved reference linkage
error. Such an error always means that the implementation of
a code symbol, or the value of a data symbol, is needed by the program
and not provided by the linkage. A class declaration cannot add either
one of those things to the linkage. It adds nothing to the linkage. It
just directs the compiler to tolerate foo in contexts
where it is necessary and sufficient for foo to be a class-name.
- Linkage cannot provide any part of a class definition at link time
if, after a forward class declaration, the class definition becomes
required, because a complete class definition will be required at compile time or
not at all. Linkage cannot provide parts of a class definition at all;
only elements of the class implementation.
[1] To be clear:
class foo;
foo & bar();
...
foo * pfoo;
...
foo & rfoo = bar();
can compile, with merely the declaration of class foo, because neither
foo * pfoo nor foo & rfoo requires an object of type foo to exist:
a pointer-to-foo, or reference-to-foo, is not a foo.
But:
class foo;
...
foo f; // Error
...
foo * pfoo;
...
pfoo->method(); // Error
can't compile, because f must be a foo, and the object addressed by pfoo
must exist, and therefore be a foo, if any method is invoked through that
object.