
According to the C++ standard, calling a member function of X, directly or indirectly from a ctor-initializer, before all bases of X have been initialized results in undefined behaviour (draft n4910 §11.9.3 Initializing bases and members [class.base.init]/16). The standard gives the following example:

class A {
public:
  A(int); 
};

class B : public A {
  int j;
public:
  int f();

  B() : A(f()), // undefined behavior: calls member function but base A not yet initialized
        j(f())  // well-defined: bases are all initialized
  {}
};

What is the rationale behind this? I assume it is undefined behaviour when f accesses a member of A, because that member has not been initialized yet. Are there other cases in which this would result in undefined behaviour?

Edit: I understand why in the given example the first call to f is undefined behavior. However, I'm wondering what the rationale is for this. In other words: why is this defined as undefined behavior?

Assume that the definition of f is as follows:

int B::f() {
  return 0;
}

I would expect that this gets translated by most compilers to a function as follows:

int B::f(B *b) {  // pseudocode: b stands for the implicit this pointer
  return 0;
}

This member function would never access any data member of B. Hence, I wouldn't expect any undefined behaviour.

Now, suppose f has the following definition:

int B::f() {
  return this->j;
}

This would get translated to something like this:

int B::f(B *b) {  // pseudocode: b stands for the implicit this pointer
  return b->j;
}

This clearly accesses an uninitialized member of B. Hence, undefined behaviour is expected.

To wrap it up: is the statement in the standard too general, or am I missing something and would both examples result in undefined behavior?

Niels

2 Answers

-1

One way to think about inheritance is that the derived class B has all of the properties of the base class A with some extra data and methods appended to the end. Base class subobjects are initialized before the derived class's own data members, which are then constructed in the order of their declaration. So, when a program creates an instance of B, it has to create all the members of A first.

In Anoop Rana's answer, the quoted portion of the C++ standard says that "referring to any non-static member or base class of the object before the constructor begins execution results in undefined behavior." The problem with A(f()) in B's initializer list is that the data members of A and B have not been constructed yet and will be referenced in the call to f().

  • How do I know that the members of B have not been constructed yet?

    • Because the constructor of A is only now being called, which means that the members of A have not been constructed yet, and their construction must finish before the construction of B's members can begin.
  • How do I know that B::f() will reference data members of B and/or A?

    • Because B::f() is not a static method. Notice that calling static methods is perfectly fine before a constructor runs because static methods can only reference static members of a class, and static members are created on program start, so they are already initialized.
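
The point about static member functions can be illustrated with a minimal sketch. The class names `Base` and `Derived`, the function `default_value`, and the helper `check_static_init` are hypothetical and chosen only for this illustration; they do not appear in the question.

```cpp
#include <cassert>

// Hypothetical classes for illustration only (not from the question).
class Base {
  int value_;
public:
  explicit Base(int v) : value_(v) {}
  int value() const { return value_; }
};

class Derived : public Base {
public:
  // A static member function has no this pointer, so it cannot touch
  // non-static members of the object under construction.
  static int default_value() { return 42; }

  // The argument for Base is computed by a static member function,
  // so the call does not involve the not-yet-constructed object.
  Derived() : Base(default_value()) {}
};

// Small check that can be called from main().
inline void check_static_init() {
  Derived d;
  assert(d.value() == 42);
}
```

Calling `check_static_init()` from `main` should pass without tripping the assertion. If `default_value` were a non-static member function, the same initializer would fall under the rule quoted in the question.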
Mark H
  • But then why is `j(f())` allowed? B's members aren't initialized yet, and you can access them from `f()` as well. – HolyBlackCat Jun 24 '22 at 17:42
  • @HolyBlackCat Here's one possibility. The difference is that `f()` has access to the private members of `B`, so that method could assign values to the uninitialized members as a part of running. So, `j(f())` is only potentially undefined behavior, whereas `A(f())` definitely is. – Mark H Jun 25 '22 at 04:16
  • It's not straightforward why `A(f())` has to be UB if we assume `f()` doesn't access any members. – HolyBlackCat Jun 25 '22 at 08:31
  • @HolyBlackCat If we assume that `f()` does not access any members, then it is a static method and should be marked as such. Calling static methods is fine in this context. – Mark H Jun 25 '22 at 10:31
  • I'm talking about a non-static member function; whether it *should* be static or not doesn't really matter. I'll try to rephrase: `j(f())` is legal if `f()` doesn't access uninitialized members (it can access members that were already initialized). Then why can't `A(f())` be legal if `f()` doesn't access any members of `B` or `A` (it could access members of bases initialized before `A`, if there were any)? – HolyBlackCat Jun 25 '22 at 11:14
  • @HolyBlackCat Determining whether a method accesses uninitialized members can be difficult, especially if the definition of `f()` is not available when the constructor call is compiled. It may be impossible in general due to Rice's Theorem. So, the writers of the C++ standard probably decided on a compromise. Calling a derived class's non-static method before the base class's constructor has run is almost certainly a mistake for the reasons given, so it is not allowed. Calling a class's non-static method in its own constructor is less likely to be undefined behavior, so it is allowed. – Mark H Jun 25 '22 at 12:39
-2

I think that the difference can be understood/explained using class.cdtor#1 which states:

For an object with a non-trivial constructor, referring to any non-static member or base class of the object before the constructor begins execution results in undefined behavior. For an object with a non-trivial destructor, referring to any non-static member or base class of the object after the destructor finishes execution results in undefined behavior.

(emphasis mine)

Now, we can apply this to the given example. In particular, in the first case, A(f()), the execution of the base class's ctor A::A(int) has begun (and not the derived class's), so referring to the non-static member function f of the derived class B is undefined behavior according to the statement quoted above. Note also that, as f() is the argument being passed, the call to f happens before the construction of A has begun.

In the second case, j(f()), the execution of the derived class's ctor has begun, so referring to the non-static member function f of the same derived class is now valid.

To clarify further, in our example the construction of the derived object happens in two steps. First the base portion is constructed by calling the base's ctor, and only then does the derived construction begin by calling the derived ctor. Source.

Also from class.base.init:

13 In a non-delegating constructor, initialization proceeds in the following order:

13.1 First, and only for the constructor of the most derived class ([intro.object]), virtual base classes are initialized in the order they appear on a depth-first left-to-right traversal of the directed acyclic graph of base classes, where “left-to-right” is the order of appearance of the base classes in the derived class base-specifier-list.

13.2 Then, direct base classes are initialized in declaration order as they appear in the base-specifier-list (regardless of the order of the mem-initializers).

13.3 Then, non-static data members are initialized in the order they were declared in the class definition (again regardless of the order of the mem-initializers).

13.4 Finally, the compound-statement of the constructor body is executed.

(emphasis mine)
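
The order in 13.2 to 13.4 can be observed with a small sketch. All names here (`trace`, `Base1`, `Base2`, `Member`, `Derived`, `construct_and_trace`) are hypothetical and introduced only for illustration; there are no virtual bases, so step 13.1 does not apply.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Global trace used only for this illustration (hypothetical helper).
inline std::vector<std::string>& trace() {
  static std::vector<std::string> t;
  return t;
}

struct Base1 { Base1() { trace().push_back("Base1"); } };
struct Base2 { Base2() { trace().push_back("Base2"); } };

struct Member {
  explicit Member(const char* name) { trace().push_back(name); }
};

// Bases are listed as Base1, Base2; members are declared as first, second.
struct Derived : Base1, Base2 {
  Member first;
  Member second;

  // The mem-initializers are written in declaration order here; writing
  // them in another order would not change the initialization order
  // (compilers typically warn about such a mismatch).
  Derived() : Base1(), Base2(), first("first"), second("second") {
    trace().push_back("body");
  }
};

// Constructs one Derived and returns a copy of the recorded order.
inline std::vector<std::string> construct_and_trace() {
  trace().clear();
  Derived d;
  (void)d;
  return trace();
}

// Small check that can be called from main().
inline void check_order() {
  const std::vector<std::string> expected = {
      "Base1", "Base2", "first", "second", "body"};
  assert(construct_and_trace() == expected);
}
```

The recorded sequence is direct bases first, then non-static data members, then the constructor body, which matches the quoted list.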


In conclusion, calling a member function during the construction of the object is allowed, but the problem with A(f()) is that the object under construction (whose ctor began executing) is of type A, while f belongs to the derived class B.

In the second case, j(f()), this is not so, and hence the call is well-defined.

class A {
public:
  A(int); 
};

class B : public A {
  int j;
public:
  int f();

  B() : A(f()), // execution of derived class's ctor has not started, so calling f is UB
        j(f())  // execution of derived class's ctor has begun, so calling f is now valid
  {}
};
Jason
  • 1
    In `A(f())`, the call to `f()` happens-before the construction of `A` has begun, as it's an argument to the constructor – Caleth Jun 24 '22 at 10:24
  • @Caleth Yes, I've added it to my answer in case future readers were not aware of it. – Jason Jun 24 '22 at 10:47
  • 2
    Your explanation of when different constructors begin executing doesn't sound right. I'm fairly sure the derived constructor starts executing first, and calls the base constructor (after evaluating its arguments). `[class.cdtor]/1` then just says that you can't access a class before its lifetime starts. Do you have a reference for derived constructor starting to execute after constructing bases, but before constructing members? – HolyBlackCat Jun 24 '22 at 17:38
  • 1
    @HolyBlackCat As a derived object is really two parts, a base part and a derived part, its construction happens in two phases. First the ctor of the base class is called, and only then is the derived class's ctor called. That is, only once the base portion is finished is the derived portion constructed. Here are [source1](https://www.learncpp.com/cpp-tutorial/order-of-construction-of-derived-classes/) and [source2](https://stackoverflow.com/a/1640424/12002570). – Jason Jun 25 '22 at 05:06
  • 2
    Those two look like sloppy writing. Of course the *body* of the base constructor is entered first, but it doesn't mean that the base constructor itself is entered first. Note that [`[class.base.init]/13`](http://eel.is/c++draft/class.init#class.base.init-13) doesn't say anything about "member init being a part of the constructor, unlike base class init". Also note that the function-try-block of the derived constructor (if any) covers the base class init too. – HolyBlackCat Jun 25 '22 at 08:27
  • Also *somebody* needs to evaluate the arguments of the base constructors before calling them, and the natural conclusion is that it's the derived constructor that does it. – HolyBlackCat Jun 25 '22 at 08:30
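
The remark in the comments that a function-try-block on the derived constructor also covers base class initialization can be sketched as follows. The names `ThrowingBase`, `Guarded`, `handler_ran`, and `base_failure_reaches_handler` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical base whose constructor always fails.
struct ThrowingBase {
  explicit ThrowingBase(int) { throw std::runtime_error("base failed"); }
};

// Flag recording whether the derived constructor's handler ran.
inline bool& handler_ran() {
  static bool flag = false;
  return flag;
}

struct Guarded : ThrowingBase {
  int j;

  // Function-try-block: the try covers the mem-initializers, including
  // the initialization of the base class, as well as the body.
  Guarded() try : ThrowingBase(0), j(0) {
  } catch (const std::runtime_error&) {
    handler_ran() = true;
    // The exception is rethrown automatically when the handler ends.
  }
};

// Returns true if the handler ran and the exception still propagated.
inline bool base_failure_reaches_handler() {
  handler_ran() = false;
  try {
    Guarded g;
    (void)g;
  } catch (const std::runtime_error&) {
    return handler_ran();
  }
  return false;
}

// Small check that can be called from main().
inline void check_function_try_block() {
  assert(base_failure_reaches_handler());
}
```

The handler belonging to the derived constructor sees the exception thrown while the base is being initialized, which is consistent with the derived constructor being the one that drives base initialization.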