3

In the following code snippet, I defined a nested class Inner inside an enclosing class Outer. The inner class is placed inside a private section at the end of the enclosing class.

#include <iostream>
using namespace std;

class Outer {
public:
    Outer(): in(new Inner()) {  // Case 1: OK
        cout << "Outer ctor is called." << endl;
    }
    
    void f() {
        Inner in2;       // Case 2: OK
        cout << "f() is called." << endl; 
    }
    
    Inner *in1;          // Case 3: error: 'Inner' does not name a type
    
    void g(Inner in3) {  // Case 4: error: 'Inner' does not name a type
        cout << "g() is called." << endl;
    }

    int y = x + 2;   // Case 5: OK to use a data member x defined in a later section

private:
    class Inner {};
    Inner *in;
    int x = 1;
};

int main() {
    Outer out;
    out.f();
    return 0;
}

Let's consider the following 5 cases:

  • Case 1: use the Inner class in the constructor initializer list -- OK!
  • Case 2: use the Inner class as a type to define a local variable inside a member function -- OK!
  • Case 3: use the Inner class as a type to define a data member -- error: 'Inner' does not name a type
  • Case 4: use the Inner class as a method type -- error: 'Inner' does not name a type
  • Case 5: usage of a regular class data member is allowed before the definition of the data member in a different section.

I understand that the errors are related to declaration order: the Inner class's declaration/definition appears after the public section that uses it. If we switch the order of the private and public sections so that the Inner class definition appears at the beginning of the enclosing class, all errors are gone. This is also pointed out by the answer to a similar question, Can't use public nested class as private method parameter. But it is still unclear to me:

  1. Why do Cases 1-2 work fine while Cases 3-4 fail with an error? What are the differences between Cases 1-2 and Cases 3-4? I know this is related to the declaration order of the Inner class relative to the code that uses it. But why does the declaration order matter in some cases and not in others?

  2. For Case 5, involving two regular class data members x and y defined in two different sections (one public, one private), it is fine for y to use the data member x even before x's definition. In more general terms, no matter what the relative order of two (public/private) sections is, the code in one section may access the data members defined in the other section. But in the case of a nested class member, we may run into errors if we use the nested class before its declaration (such as in Cases 3-4). So why is there such a difference between a regular class member and a nested class member?

Peng
  • 1
    The "sections" have nothing to do with it. It's all about order. Unfortunately I don't know enough to tell you why order sometimes matter and sometimes doesn't, but if you were to get rid of the `public:` and `private:` labels you would still see the same sort of thing. – Ken Wayne VanderLinde Jun 19 '22 at 02:44
  • 2
    One of the exceptions to C++'s usual rule of "the compiler goes from top to bottom, once only" is that the inside of a class member function is a complete-class context, which means that it's deferred until the entirety of a class's definition is known. This means that inside the body of a class function, you can still refer to things that are declared later in the class's definition. That should account for your questions 1 and 2. I don't fully understand what you're asking for question 3. – Nathan Pierson Jun 19 '22 at 03:30
  • @NathanPierson Thanks for your comment. I have removed question 3 as it is a bit vague. I think your comment fully answers my question 1. Regarding question 2, could you elaborate more on why the compiler treats a nested class differently from a regular class member? It would be great if you could add a formal answer instead of just a comment so that it can be more easily referenced. – Peng Jun 19 '22 at 04:17
  • Actually, I'm not sure what you mean by treating a nested class member differently. The definition of the member needs to have the type visible before it's used. But inside the body of a function, you can use a nested class or a class member even if they don't appear until later on in the class. Can you give an example specifically of that different treatment? – Nathan Pierson Jun 19 '22 at 04:37
  • @NathanPierson I have added an example Case 5 in the code, and rephrased question 2. I hope it is clear now. – Peng Jun 19 '22 at 13:58
  • Okay, that's an interesting question. [Toying with it in Godbolt](https://godbolt.org/z/3fqo9b84a), I see that I actually totally _can_ give a member variable a default member initializer that refers to a member variable of nested class type where the declaration of the nested type and the definition of the variable appear after the variable. But I can't do something like `int z = laterInner{}.z;`, but also that's not very analogous to `int y = x + 2;` anyway because `x` is a member variable not a type. That might be worth carving out into its own question. – Nathan Pierson Jun 19 '22 at 14:07
  • 1
    Also FWIW it appears that the "fine" `int y = x + 2;` case and the `int z = inner_z.z;` case are not _actually_ reliably initializing `y` and `z` (demo [here](https://godbolt.org/z/roTedrhhG)) because member variables are initialized in the order they're declared, so `y = x + 2;` tries to initialize `y` with the contents of an uninitialized `int`, leading to undefined behavior. – Nathan Pierson Jun 19 '22 at 14:11
  • @NathanPierson I agree. My Case 5 and your demo code (using a member variable before it is initialized) run into undefined behavior (the variable `x` is accessed by `y = x + 2` before `x` is initialized). GCC produces an essentially random value without any warning, but Clang reports a warning instead (warning: field 'x' is uninitialized when used here [-Wuninitialized]). – Peng Jun 20 '22 at 15:55
  • @Peng It's possible to get GCC to warn about it as well, `-Wuninitialized` is also a GCC warning that's included in `-Wall`. – Nathan Pierson Jun 20 '22 at 16:02
  • @NathanPierson Now back to my very first question on the inner class: according to our discussion, we can't use a later-defined inner class to define a member variable because the inner class is still undeclared at the point where we use it as a type to declare/define a class member. But by the time the compiler processes a member function body, as you pointed out, the entirety of the class including the "later" defined inner class is already known (as the function body is deferred until the entire class definition is ready), so we can use the inner class inside the function body. – Peng Jun 20 '22 at 16:03

2 Answers

4

Thanks to @NathanPierson's comments and thoughtful discussion, I have done some more research and decided to provide my own answer.

First of all, let me describe some facts about C++ compilation:

  1. When compiling a class, the compiler goes from top to bottom, only once, analyzing all declarations inside the class body. This includes declarations of data members and member functions.

  2. The declaration of a member function includes the parameter types, return types, and function name. But the body of a member function is not part of a declaration, and it belongs to the function definition.

  3. Quoted from IBM documentation:

The body of a member function is analyzed after the class declaration so that members of that class can be used in the member function body, even if the member function definition appears before the declaration of that member in the class member list.

Quoted from C++ Primer, 5th Edition, p. 259:

the compiler processes classes in two steps—the member declarations are compiled first, after which the member function bodies, if any, are processed. Thus, member function bodies may use other members of their class regardless of where in the class those members appear.

In other words, the definition of the member function is deferred until all class declarations are complete, thus, the member function body is free to use all class members, regardless of whether the class members are declared before or after the member function.

  4. Quoted from C++ Primer, 5th Edition, p. 279:

A class must be defined — not just declared — before we can write code that creates objects of that type. Otherwise, the compiler does not know how much storage such objects need.
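The deferred processing of member function bodies described in fact 3 can be sketched as follows. This is only a minimal illustration, not code from the question: `Widget`, `Label`, `label_`, `describe` and `check_widget` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical class for illustration; none of these names come from the post.
class Widget {
public:
    // The body is processed only after the whole class has been seen,
    // so it may use Label and label_ although both are declared further down.
    std::string describe() const {
        Label copy = label_;
        return "widget:" + copy.text;
    }

private:
    struct Label {
        std::string text = "demo";
    };
    Label label_;
};

// Small check of the behaviour described above.
inline void check_widget() {
    Widget w;
    assert(w.describe() == "widget:demo");
}
```

If the `Label copy` line were moved out of the function body and turned into a data member declared above `struct Label`, the compiler would reject it, which is the situation in Case 3.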

Now back to the code:

  • Cases 1-2 are OK. This is because the definition of the member function body is deferred after all class declarations, so it is fine to use any class members, including a nested class, in the member function body.

  • Case 3 is in error. This is immediately explained by fact 1 that the compiler goes from top to bottom. When the compiler reaches the line of code in Case 3, it hasn’t seen any definition for type Inner, so we get an error: 'Inner' does not name a type.

  • Case 4 is in error for the same reason as Case 3: Inner is used as a parameter type of the member function g(Inner), and the parameter list is part of the member function declaration (fact 2), which is processed in order. When the compiler reaches it, no declaration of Inner has been seen yet, so we get the same error: 'Inner' does not name a type.

  • Case 5 (int y = x + 2) compiles, but it runs into undefined behaviour: data members are initialized in the order they are declared, so y reads x before x has been initialized.
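For Case 5, a hedged sketch of the well-defined variant: if `x` is declared before `y`, then `x` is initialized first and `y = x + 2` reads a valid value. The names `DeclaredInOrder`, `ordered_y` and `check_ordered` are hypothetical and only serve the example.

```cpp
#include <cassert>

// Hypothetical struct: same members as Case 5, but x is declared before y.
struct DeclaredInOrder {
    int x = 1;      // initialized first, because it is declared first
    int y = x + 2;  // x already holds 1 here
};

// Returns the value y received from its default member initializer.
inline int ordered_y() {
    DeclaredInOrder d;
    return d.y;
}

// Small check of the behaviour described above.
inline void check_ordered() {
    assert(ordered_y() == 3);
}
```

Swapping the two member declarations gives the layout from the question again, where reading `x` happens before its initialization.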

Peng
  • Since a pointer is used in case 3, you could avoid the error by declaring it first. `class Inner; Inner* in1;` – Burak Jun 22 '22 at 04:36
  • That's true. We can use an incomplete type (declared but not yet defined) to define pointers or references. So `class Inner; Inner *in1;` uses a forward declaration to make the declaration of the pointer valid. This is also explained on p. 279 of the C++ Primer book. – Peng Jun 22 '22 at 11:09
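The forward-declaration approach from the two comments above could look roughly like this. It is a sketch under assumptions: the member function `holds` and the helper `check_forward_declaration` are hypothetical additions, and the forward declaration is kept in the private part so that it has the same access as the later definition.

```cpp
#include <cassert>

class Outer {
    class Inner;                  // forward declaration, private like the definition below

public:
    Inner* in1 = nullptr;         // Case 3: a pointer to an incomplete type is allowed

    // A pointer parameter only needs the name Inner to be declared.
    bool holds(const Inner* p) const { return p == in1; }

private:
    class Inner {};               // the actual definition can still come later
};

// Small check that the class compiles and behaves as expected.
inline void check_forward_declaration() {
    Outer out;
    assert(out.in1 == nullptr);
    assert(out.holds(nullptr));
}
```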
-2

Cases 1 and 2 use Inner inside a function body (the member initializer list counts as part of that), while Cases 3 and 4 use it in the interface of the class. Function bodies are compiled after the whole class declaration has been parsed, so using the Inner type works there. But a member variable and a function parameter are part of the class declaration itself and can only use types that are already known.

It's a bit like the "hoisting" except you hoist both variables and functions to the top.

When you write Outer() : in(new Inner) { } then that turns into:

class Outer {
    Outer();
    class Inner {};
    Inner *in;
};
Outer::Outer() : in(new Inner) { }

as you can see in that form the Inner class is declared before you use it.

That should also explain how having multiple public and private sections has no effect on the whole thing. The whole class gets declared fully before any of the access checks have to be done in the function bodies.

Just reorder things so they happen in the right order:

class Outer {
    class Inner {};
public:
    Outer(): in(new Inner()) {  // Case 1: OK
        cout << "Outer ctor is called." << endl;
    }
    
    void f() {
        Inner in2;       // Case 2: OK
        cout << "f() is called." << endl; 
    }
    
    Inner *in1;          // Case 3: now OK, Inner is declared above
    
    void g(Inner in3) {  // Case 4: now OK, Inner is declared above
        cout << "g() is called." << endl;
    }

private:
    Inner *in;
};

int main() {
    Outer out;
    out.f();
    return 0;
}
Goswin von Brederlow
  • Thx. But I am aware of this and this was already described in my question. Could you answer my actual questions? – Peng Jun 19 '22 at 02:32
  • @Peng The problem is that the questions are either too broad or already answered in the question itself. To the question "why", one can answer "because the standard is that way"; and why is the standard that way? Because compiler creators decided it. Unless you want quotes from the standard (then the language-lawyer tag would be appropriate). – Swift - Friday Pie Jun 19 '22 at 03:16