3

Context

My goal is to have a base container class that contains and manipulates several base class objects, and then a derived container class that contains and manipulates several derived class objects. On the advice of this answer, I attempted to do this by having each contain an array of pointers (a Base** and a Derived**), and cast from Derived** to Base** when initializing the base container class.

However, I ran into a problem – despite compiling just fine, when manipulating the contained objects, I'd have segfaults or the wrong methods would be called.


Problem

I have boiled the problem down to the following minimal case:

#include <iostream>

class Base1 {
public:
    virtual void doThing1() {std::cout << "Called Base1::doThing1" << std::endl;}
};

class Base2 {
public:
    virtual void doThing2() {std::cout << "Called Base2::doThing2" << std::endl;}
};

// Whether this inherits "virtual public" or just "public" makes no difference.
class Derived : virtual public Base1, virtual public Base2 {};

int main() {
    Derived derived;
    Derived* derivedPtrs[] = {&derived};
    ((Base2**) derivedPtrs)[0]->doThing2();
}

You may expect this to print "Called Base2::doThing2", but…

$ g++ -Wall -Werror main.cpp -o test && ./test
Called Base1::doThing1

Indeed – the code calls Base2::doThing2, but Base1::doThing1 ends up being called. I've also had this segfault with more complex classes, so I assume it's address-related hijinks (perhaps vtable-related – the error doesn't seem to occur without virtual methods). You can run it here, and see the assembly it compiles to here.

You can see my actual structure here – it's more complex, but it ties it into the context and explains why I need something along the lines of this.

Why does the Derived**Base** cast not work properly when Derived*Base* does, and more importantly, what is the right way to handle an array of derived objects as an array of base objects (or, failing that, another way to make a container class that can contain multiple derived objects)?

I can't index before upcasting (((Base2*) derivedPtrs[0])->doThing2()), I'm afraid, since in the full code that array is a class member – and I'm not sure it's a good idea (or even possible) to cast manually in every place contained objects are used in the container classes. Correct me if that is the way to handle this, though.

(I don't think it makes a difference in this case, but I'm in an environment where std::vector is unavailable.)


Edit: Solution

Many of the answers suggested that casting each pointer individually is the only way to have an array that can contain derived objects – and that does indeed seem to be the case. For my particular use case, though, I managed to solve the problem using templates! By supplying a type parameter for what the container class is supposed to contain, instead of having to contain an array of derived objects, the type of the array can be set to the derived type at compile time (e.g. BaseContainer<Derived> container(length, arrayOfDerivedPtrs);).

Here's a version of the broken "actual structure" code above, fixed with templates.

obskyr
  • 1,380
  • 1
  • 9
  • 25
  • Virtual base is really something. – felix Feb 11 '19 at 14:16
  • Compare `(Base2*) &derived`, `static_cast(&derived)`, `(Base1*) &derived`, and `static_cast(&derived)`. – molbdnilo Feb 11 '19 at 14:25
  • Due to things like vtable and multiple inheritance, pointer to base may have different value than pointer to derived class, even when it's the same instance. You must cast so that the cast operation (static_cast usually) can adjust the pointer as needed. – hyde Feb 11 '19 at 14:30
  • 4
    Why are you casting at all? Given you example, since both `doThing1()` and `doThing2()` are virtual, no casting is needed. That's the entire point of virtual member functions. – You Feb 11 '19 at 14:33
  • Yes, `(Base 2 **)derivedPtrs)[0]` is awful, since there is no guarantee that `Derived **` can be converted to a `Base2**` and your code has undefined behaviour. You probably would be better off (assuming you know that `derivedPtrs[0]` actually points at a `Derived`) doing `(Base 2 *)(derivedPtrs[0])->doThing2()` or (more explicitly) `static_cast(derivedPtrs[0])->DoThing2()` or (option of run time checking that the conversion is valid) `p = dynamic_cast(derivedPtrs[0]); if (p) p->DoThing2()`. – Peter Feb 11 '19 at 14:36
  • @You Because in the real-world code, that array is actually in a `BaseContainer` class with a `Base** _objects` member, which has the derived `DerivedContainer` with `Derived** _objects`. `DerivedContainer` needs to use the methods from `BaseContainer` + be upcastable, and `BaseContainer` needs to be able to contain `Derived`s. – obskyr Feb 11 '19 at 14:36
  • 1
    @obskyr In that case you're not doing inheritance right, and your example also does not illustrate your problem in sufficient detail. As formulated, the answer to your question is "don't cast". If you're trying to make your own `std::variant`, a tagged union would probably be a more useful solution. – You Feb 11 '19 at 14:38
  • @You Do you happen to know a simple way to do "derived container containing derived objects", then? [Here is a more elaborate example](https://repl.it/repls/GrandioseExoticKnowledge) that reflects my actual structure pretty much 1:1, and has the same problems. – obskyr Feb 11 '19 at 14:59
  • 1
    @obskyr: No, because that doesn't sound like a thing that is usefully modeled by inheritance at all. There is no *is-a* relationship between "container of Base1" and "Base1", and modeling such a relationship therefore makes no sense. – You Feb 11 '19 at 15:20
  • @You Oh, but there is such a relationship – this `BaseContainer1` delegates function calls to a group of `Base1` objects, and needs to be usable wherever a `Base1` is. The actual names of these is `InputChannelSet` and `InputChannel`, so you can imagine why `BaseContainer1` needs to have an is-a relationship to `Base1`. – obskyr Feb 11 '19 at 15:22
  • 1
    I disagree; a "set of X" is not "an X"; the fact that you happen to apply some common operations to both of them does not change this fact. You could just as well have an overload set `foo(InputChannel&)` and `foo(InputChannelSet&)`, which would allow you to accomplish the same thing without imposing a (non-existent) *is-a* relationship between `InputChannel` and `InputChannelSet`. – You Feb 11 '19 at 15:27

5 Answers5

2

There are many things that make this code pretty awful and contribute to this issue:

  1. Why are we dealing with two-star types in the first place? If std::vector doesn't exist, why not write your own?

  2. Don't use C-style casts. You can cast pointers to completely unrelated types into each other and the compiler is not allowed to stop you (coincidentally, this is exactly what is happening here). Use static_cast/dynamic_cast instead.

  3. Let's assume we had std::vector, for ease of notation. You are trying to cast a std::vector<Derived*> to std::vector<Base*>. Those are unrelated types (the same is true for Derived** and Base**), and casting one to the other is not legal in any way.

  4. Pointer casts from/to derived are not necessarily trivial. If you have struct X : A, B {}, then a pointer to the B base will be different from a pointer to the A base (and with a vtable in play, possible also different from a pointer to X). They must be, because the (sub)objects cannot reside at the same memory address. The compiler will adjust the pointer value when you cast the pointer. This of course does not/cannot happen for each individual pointer if you (try to) cast an array of pointers.

If you have an array of pointers to Derived and want to get an array of those pointers to Base, then you have to manually cast each one. Since the pointer values will generally be different in both arrays, there is no way to "reuse" the same array.

(Unless the conditions for empty base optimization are fulfilled, which is not the case for you).

Max Langhof
  • 23,383
  • 5
  • 39
  • 72
  • So would you say that the best way to solve my container inheritance problem ([see real-world structure here](https://repl.it/repls/GrandioseExoticKnowledge)) would be to, on initialization of the derived container, manually cast each pointer in a for loop and initialize the base container with that? – obskyr Feb 11 '19 at 15:13
  • It's a way to solve that problem. Whether it is the best way to solve it depends on more real-world constraints (e.g. ownership?), and whether this is the right problem to solve (or whether you should maybe avoid a design with so much virtual inheritance) is another story. I can't give you a universal blessing, sorry :) – Max Langhof Feb 11 '19 at 15:43
  • Would OP's code possibly suffer from object slicing as well? – Kristopher Ives Feb 20 '19 at 15:39
  • @KristopherIves Object slicing happens when the wrong assignment operator (the one of the static type) is used to assign instances of objects with a different dynamic type, e.g. `Base& refToBase = derivedInstance; refToBase = otherDerivedInstance;` (only the `Base` part of `otherDerivedInstance` will be copied), or a similar situation for `delete`. Since there is no such assignment or deletion in the presented code, object slicing is not technically present. Lack of a virtual destructor means it could become a problem, but I have faith that the example does not include it so as to be minimal. – Max Langhof Feb 20 '19 at 16:02
2

Like others said the problem is that you are not letting the compiler doing its job in adjusting indices, suppose Derived layout in memory is something like (not guaranteed by the standard, just a possible implementation):

| vtable_Base1 | Base1 | vtable_Base2 | Base2 | vtable_Derived | Derived |

Then &derived points to the start of the object, when you normally do

Base2* base = static_cast<Derived*>(&derived)

the compiler knows the offset which Base2 structure has inside Derived type and adjusts the address to point to the start of it.

If, instead, you cast an array of pointers directly the compiler is coercing the type, assuming that your array is storing pointers to Base2 already, but without adjusting them.

A dirty hack which may work or not in your situation is having a method which returns the pointer to itself, eg:

class Base2 {
public:
  Base2* base2() { return this; }
}

so that you can do derivedPtrs[0]->base2()->doThing2().

Jack
  • 131,802
  • 30
  • 241
  • 343
  • Great answer! I would only state more explicitly that the layout you showed is not guaranteed for all compilers/architectures. – Max Langhof Feb 11 '19 at 14:56
  • Ah! Understanding how a vtable can work internally actually helped quite a bit. Gotcha – so that's what happens when you upcast a pointer. Now to figure out how to solve this in my actual context – I'll see if the hack you suggested helps. – obskyr Feb 11 '19 at 15:05
1

It doesn't work the same reason why this don't work:

struct Base {};

struct Derived : Base { int i; };

int main() {
    Derived d[6];
    Derived* d2 = d;
    Base** b = &d2; // ERROR!
}

Your c-style cast is a bad practice because it didn't warn you of the error. Just don't do that to your code. Your c-style cast was actually a reinterpret_cast in disguise, which is tottaly wrong in the case.

But why cannot you cast an array to derived into an array to base? Simple: they have different layout.

You see, when you iterate on an array of a type, each elements in the array are contiguous in memory. The Derived class may have a size of let's say, 24 bytes, and Base a size of 8:

Derived d[4];
------------------------------------------------------
|     D1     |     D2     |     D3     |      D4     |
------------------------------------------------------

Base b[4];
---------------------
| B1 | B2 | B3 | B4 |
---------------------

As you can see, Derived[4] and Base[4] are different types with different layouts.


Then what can you do?

There's many solution in fact. The easiest would be to create a new array of pointer to base, and cast each derived to base pointers. You have to ajust the pointer of each objects anyway.

It would look like this:

std::vector<Base*> bases;
bases.reserve(std::size(derived_arr))

std::transform(
    std::begin(derived_arr), std::end(derived_arr),
    std::back_inserter(bases), 
    [](Dervied* d) {
        // You must use dynamic cast because the 
        // pointer offset in only known at runtime
        // when using virtual inheritance
        return dynamic_cast<Base*>(d); 
    }
);

The other solution that is in place in memory would be to create you own type of iterator that would do the cast when calling the operator* and the operator->. This is a bit harder to do but can save you an allocation by making iteration a bit slower.


On top all that, this may be unrelated, but I would advise against virtual inheritance. This is not a good practice and in my experience resulted in more pain than anything. I would suggest using the adaptator pattern and wrap non-polymorphic types instead.

Guillaume Racicot
  • 39,621
  • 9
  • 77
  • 141
  • C-Style cast is `reinterpret_cast>`, not just `reinterpret_cast`. – Mark Ingram Feb 11 '19 at 14:50
  • 1
    @MarkIngram a C-style cast *can be* `reinterpret_cast>`. In this case there isn't a `const_cast` – Caleth Feb 11 '19 at 14:51
  • @MarkIngram It's not always a `reinterpret_cast`. Sometimes it is a `static_cast`. It depends on the types being casted. In this case, there is no `const_cast`. – Guillaume Racicot Feb 11 '19 at 14:52
  • 1
    Yes, there isn't one here, because the mutability hasn't changed - I'm merely pointing out that a C-Style cast has the power to do both, for OPs benefit. – Mark Ingram Feb 11 '19 at 15:02
0

When you cast from Derived* to Base* the compiler will adjust the value. When you cast Derived** to Base** you defeat that.

This is a good reason to always use static_cast. Example error with your code changed:

test.cpp:19:5: error: static_cast from 'Derived **' to 'Base2 **' is not
      allowed
    static_cast<Base2**>(derivedPtrs)[0]->doThing2();
janm
  • 17,976
  • 1
  • 43
  • 61
  • "When you cast from `Derived*` to `Base*` the compiler will adjust the value" – how and why? And why not for `Derived**` to `Base**`? And… do you happen to know what I could do instead to solve this problem? – obskyr Feb 11 '19 at 14:27
  • 2
    @obskyr "Why?" -> Subobjects come one after another in the derived object, so they must have different (increasing, if the subobjects are non-empty) pointer values. Your `Base` objects have no subobjects themselves, but they have a vtable, so they are non-empty. "How?" -> The compiler knows where `Base` is placed inside `Derived`. – Max Langhof Feb 11 '19 at 14:37
  • Thanks for explaining! – obskyr Feb 11 '19 at 18:20
0

(Base2 **) is in fact a reinterpret_cast (You can confirm this by trying all four casts) and that expression causes UB. The fact that you can implicitly cast a pointer to derived to a pointer to base doesn't mean that they are the same, e.g. int and float. And here, you are referring an object by the type which it isn't, in this case, causes an UB.

Calling a virtual function this way results the final overrider of that object being called. How this is achieved depends on the compiler.

Assume that the compiler uses a "vtable", finding the vtable of Base2 (might be adding an offset to the memory. assembly) from the memory address of a Derived is not trivial.

Basically, you must perform a dynamic cast on every pointer, store them somewhere or dynamic cast it when in need.

felix
  • 2,213
  • 7
  • 16
  • "the final overrider of that type" doesn't make sense, you probably meant the final overrider of _some_ virtual function declared that class (which one depends on the exact layout) – curiousguy Mar 08 '19 at 00:25