reduce size of object (wasted) in Multi virtual inheritance

Question

After profiling, I found that a large portion of memory of my program are wasted by multi-virtual-inheritance.

This is MCVE to demostrate the problem ( http://coliru.stacked-crooked.com/a/0509965bea19f8d9 )

#include<iostream>
class Base{
    public: int id=0;  
};
class B : public virtual Base{
    public: int fieldB=0;
    public: void bFunction(){
        //do something about "fieldB"     
    }
};
class C : public virtual B{
    public: int fieldC=0;
    public: void cFunction(){
        //do something about "fieldC"     
    }
};
class D : public virtual B{
    public: int fieldD=0;
};
class E : public virtual C, public virtual D{};
int main (){
    std::cout<<"Base="<<sizeof(Base)<<std::endl; //4
    std::cout<<"B="<<sizeof(B)<<std::endl;       //16
    std::cout<<"C="<<sizeof(C)<<std::endl;       //32
    std::cout<<"D="<<sizeof(D)<<std::endl;       //32
    std::cout<<"E="<<sizeof(E)<<std::endl;       //56
}

I hope sizeof(E) to be not much more than 16 bytes (id+fieldB+fieldC+fieldD).
From experiment, if it is non virtual inheritance, E's size will be 24 (MCVE).

How to reduce size of E (by C++ magic, change program architecture, or design pattern)?

Requirement:-

Base,B,C,D,E can't be template class. It will cause circular dependency for me.
I must be able to call base class's function from derived class (if any) e.g. e->bFunction() and e->cFunction(), as usual.
However, it is OK if I can't call e->bField anymore.
I still want the ease of declaration.
Currently, I can declare "E inherit from C and D" as class E : public virtual C, public virtual D easily.

I am thinking about CRTP e.g. class E: public SomeTool<E,C,D>{}, but not sure how to make it works.

To make things easier :

In my case, every class is used like it is monolith, i.e. I will never cast object between types like static_cast<C*>(E*) or vise versa.
Macro is allowed, but discouraged.
Pimpl idiom is allowed. Actually, below is what I daydream.
Perhaps, I may be able to remove all virtual-inheritance.
However, with all the requirements, I can't find a way to code it.
In pimpl, if I make E virtual inherit from C & D, etc, all above requirement will be met, but I will still waste a lot of memory. :-

I am using C++17.

Edit

Here is a more correct description of my real-life problem.
I create a game that has many components e.g. B C D E.
All of them are created via pool. Thus, it enables fast iterating.
Currently, if I query every E from a game engine, I will be able to call e->bFunction().
In my most severe case, I waste 104 bytes per object in E-like class. (real hierarchy is more complex)

Edit 3

Let me try again. Here is a more meaningful class diagram.
I have a central system to assign hpPtr,flyPtr,entityId,componentId,typeId automatically already.
i.e. Don't worry how they are initialized.

In real case, dread diamond happen in many classes, this is the most simple case.

Currently, I call like :-

 auto hps = getAllComponent<HpOO>();
 for(auto ele: hps){ ele->damage(); }
 auto birds = getAllComponent<BirdOO>();
 for(auto ele: birds ){ 
     if(ele->someFunction()){
          ele->suicidalFly();
          //.... some heavy AI algorithm, etc
     }
 }

With this approach, I can enjoy cache coherence as in Entity Component System, and the cool ctrl+space intellisense of HpOO,FlyableOO and BirdOO like in Object-Oriented style.

Everything works fine - it just uses too much memory.

This doesn't sound an actual problem yet, more a programming challenge. Why 40 bytes per instance is considered a large portion of memory? How many of these objects do you need to allocate? Virtual inheritance requires more memory to store pointers to the vtables itself so that's why. — Jack, Jun 10 '19 at 13:07
Virtual inheritance just adds one pointer size to the class, plus alignment of the pointer (if needed). You can't get around that if you actually need a virtual class, however if you want just a function pointer you could implement one or more of the functions in the base class (and make them protected) and declare a member function pointer and initialize that based on the constructor of your derived class.. — xception, Jun 10 '19 at 13:07
Do you need to use `PIMPL` if all your classes store is a single integer? — drescherjm, Jun 10 '19 at 13:10
@Jack Yes, in this example, it sounds too subtle. In real case, I allocate 2000 of it and access them 60 fps. I suspect it causes a lot of cache miss lately. — cppBeginner, Jun 10 '19 at 13:10
Could you provide an example that's a more accurate representation of how you are using your class? in the example provided you don't need virtual at all. — xception, Jun 10 '19 at 13:11
@drescherjm Oh, it is just an example. In real case it is more than that. — cppBeginner, Jun 10 '19 at 13:11
@cppBeginner if you're trying to model a game and have such stringent requirements for your data acesses maybe you could look into data-driven programming instead of using object-oriented programming for your application. find more here https://stackoverflow.com/questions/1065584/what-is-data-driven-programming — xception, Jun 10 '19 at 13:14
@xception Agree. I am trying to model it with strange pool memory management. The most data-driven architecture that I have read fail my 2nd requirements. I love `e->(ctrl+space)` so much. .... I am trying to create the more accurate representation. Thank. — cppBeginner, Jun 10 '19 at 13:17
@xception I have added "the more accurate representation". Hope it clarifies things. XD — cppBeginner, Jun 10 '19 at 13:35
Could you please define what "large" means for you? Because "virtual", of course, does not come for free: a virtual object brings(and not waste) an overhead of an additional, single, pointer to the vtable (usually 4/8 bytes), and it's quite neglectable from my point of view. Here a really nice explanation about virtual classes [link](https://stackoverflow.com/questions/1626290/c-virtual-function-table-memory-cost) edit: after reading all the comments, I think user xception gave you a really good input :) — Mauro Dorni, Jun 10 '19 at 13:05
There was another good solution that show how I lose 32 bytes, but it is (voluntarily) removed. I can't remember its author, so I can't ping him. (I am talking to myself.) — cppBeginner, Jun 11 '19 at 01:54

xception · Accepted Answer · 2019-06-11T10:43:23.083

EDIT: based on latest update to the question and some chatting

Here's the most compact maintaining the virtual in all your classes.

#include <iostream>
#include <vector>

using namespace std;

struct BaseFields {
    int entityId{};
    int16_t componentId{};
    int8_t typeId{};
    int16_t hpIdx;
    int16_t flyPowerIdx;
};

vector<int> hp; // this will contain all the hit points, dynamically resizable, logic up to you
vector<float> flyPower; // this will contain all the fly powers, dynamically resizable, logic up to you

class BaseComponent {
public: // or protected
    BaseFields data;
};
class HpOO : public virtual BaseComponent {
public:
    void damage() {
        hp[data.hpIdx] -= 1;
    }
};
class FlyableOO : public virtual BaseComponent {
public:
    void addFlyPower(float power) {
        flyPower[data.hpIdx] += power;
    }
};
class BirdOO : public virtual HpOO, public virtual FlyableOO {
public:
    void suicidalFly() {
        damage();
        addFlyPower(5);
    }
};

int main (){
    std::cout<<"Base="<<sizeof(BaseComponent)<<std::endl; // 12
    std::cout<<"C="<<sizeof(HpOO)<<std::endl; // 24
    std::cout<<"D="<<sizeof(FlyableOO)<<std::endl; // 24
    std::cout<<"E="<<sizeof(BirdOO)<<std::endl; // 32
}

much smaller class size version dropping all the virtual class stuff:

#include <iostream>
#include <vector>

using namespace std;

struct BaseFields {
};

vector<int> hp; // this will contain all the hit points, dynamically resizable, logic up to you
vector<float> flyPower; // this will contain all the fly powers, dynamically resizable, logic up to you

class BaseComponent {
public: // or protected
    int entityId{};
    int16_t componentId{};
    int8_t typeId{};
    int16_t hpIdx;
    int16_t flyPowerIdx;
protected:
    void damage() {
        hp[hpIdx] -= 1;
    };
    void addFlyPower(float power) {
        flyPower[hpIdx] += power;
    }
    void suicidalFly() {
        damage();
        addFlyPower(5);
    };
};
class HpOO : public BaseComponent {
public:
    using BaseComponent::damage;
};
class FlyableOO : public BaseComponent {
public:
    using BaseComponent::addFlyPower;
};
class BirdOO : public BaseComponent {
public:
    using BaseComponent::damage;
    using BaseComponent::addFlyPower;
    using BaseComponent::suicidalFly;
};

int main (){
    std::cout<<"Base="<<sizeof(BaseComponent)<<std::endl; // 12
    std::cout<<"C="<<sizeof(HpOO)<<std::endl; // 12
    std::cout<<"D="<<sizeof(FlyableOO)<<std::endl; // 12
    std::cout<<"E="<<sizeof(BirdOO)<<std::endl; // 12
    // accessing example
    constexpr int8_t BirdTypeId = 5;
    BaseComponent x;
    if( x.typeId == BirdTypeId ) {
        auto y = reinterpret_cast<BirdOO *>(&x);
        y->suicidalFly();
    }
}

this example assumes your derived classes do not have overlapping functionalities with diverging effects, if you have those you have to add virtual functions to your base class for an extra overhead of 12 bytes (or 8 if you pack the class).

and quite possibly the smallest version still maintaining the virtuals

#include <iostream>
#include <vector>

using namespace std;

struct BaseFields {
    int entityId{};
    int16_t componentId{};
    int8_t typeId{};
    int16_t hpIdx;
    int16_t flyPowerIdx;
};

#define PACKED [[gnu::packed]]

vector<int> hp; // this will contain all the hit points, dynamically resizable, logic up to you
vector<float> flyPower; // this will contain all the fly powers, dynamically resizable, logic up to you

vector<BaseFields> baseFields;

class PACKED BaseComponent {
public: // or protected
    int16_t baseFieldIdx{};
};
class PACKED HpOO : public virtual BaseComponent {
public:
    void damage() {
        hp[baseFields[baseFieldIdx].hpIdx] -= 1;
    }
};
class PACKED FlyableOO : public virtual BaseComponent {
public:
    void addFlyPower(float power) {
        flyPower[baseFields[baseFieldIdx].hpIdx] += power;
    }
};
class PACKED BirdOO : public virtual HpOO, public virtual FlyableOO {
public:
    void suicidalFly() {
        damage();
        addFlyPower(5);
    }
};

int main (){
    std::cout<<"Base="<<sizeof(BaseComponent)<<std::endl; // 2
    std::cout<<"C="<<sizeof(HpOO)<<std::endl; // 16 or 10
    std::cout<<"D="<<sizeof(FlyableOO)<<std::endl; // 16 or 10
    std::cout<<"E="<<sizeof(BirdOO)<<std::endl; // 24 or 18
}

the first number is for unpacked structure, second packed

You can also pack the hpIdx and flyPowerIdx into the entityId using the union trick:

union {
    int32_t entityId{};
    struct {
    int16_t hpIdx;
    int16_t flyPowerIdx;
    };
};

in the above example if not using packing and moving the whole BaseFields structure into the BaseComponent class the sizes remain the same.

END EDIT

Virtual inheritance just adds one pointer size to the class, plus alignment of the pointer (if needed). You can't get around that if you actually need a virtual class.

The question you should be asking yourself is whether you actually need it. Depending on your access methods to this data that might not be the case.

Considering you need virtual inheritance but all common methods that need to be callable from all your classes you can have a virtual base class and use a bit less space than your original design in the following way:

class Base{
    public: int id=0;
    virtual ~Base();
    // virtual void Function();

};
class B : public  Base{
    public: int fieldB=0;
    // void Function() override;
};
class C : public  B{
    public: int fieldC=0;
};
class D : public  B{
    public: int fieldD=0;
};
class E : public  C, public  D{

};

int main (){
    std::cout<<"Base="<<sizeof(Base)<<std::endl; //16
    std::cout<<"B="<<sizeof(B)<<std::endl; // 16
    std::cout<<"C="<<sizeof(C)<<std::endl; // 24
    std::cout<<"D="<<sizeof(D)<<std::endl; // 24
    std::cout<<"E="<<sizeof(E)<<std::endl; // 48
}

In the case that there are cache misses but the CPU still has power to process the results you can furter decrease the size by using compiler-specific instructions to make the data structure as small as possible (next example works in gcc):

#include<iostream>

class [[gnu::packed]] Base {
    public:
    int id=0;
    virtual ~Base();
    virtual void bFunction() { /* do nothing */ };
    virtual void cFunction() { /* do nothing */ }
};
class [[gnu::packed]] B : public Base{
    public: int fieldB=0;
    void bFunction() override { /* implementation */ }
};
class [[gnu::packed]] C : public B{
    public: int fieldC=0;
    void cFunction() override { /* implementation */ }
};
class [[gnu::packed]] D : public B{
    public: int fieldD=0;
};
class [[gnu::packed]] E : public C, public D{

};


int main (){
    std::cout<<"Base="<<sizeof(Base)<<std::endl; // 12
    std::cout<<"B="<<sizeof(B)<<std::endl; // 16
    std::cout<<"C="<<sizeof(C)<<std::endl; // 20
    std::cout<<"D="<<sizeof(D)<<std::endl; // 20
    std::cout<<"E="<<sizeof(E)<<std::endl; //40
}

saving an additional 8 bytes at the price of possibly some CPU overhead (but if memory is the issue might help).

Additionally if there is really a single function you are calling for each of your classes you should only have that as a single function which you override whenever necessary.

#include<iostream>

class [[gnu::packed]] Base {
public:
    virtual ~Base();
    virtual void specificFunction() { /* implementation for Base class */ };
    int id=0;
};

class [[gnu::packed]] B : public Base{
public:
    void specificFunction() override { /* implementation for B class */ }
    int fieldB=0;
};

class [[gnu::packed]] C : public B{
public:
    void specificFunction() override { /* implementation for C class */ }
    int fieldC=0;
};

class [[gnu::packed]] D : public B{
public:
    void specificFunction() override { /* implementation for D class */ }
    int fieldD=0;
};

class [[gnu::packed]] E : public C, public D{
    void specificFunction() override {
        // implementation for E class, example:
        C::specificFunction();
        D::specificFunction();
    }
};

This would also allow you to avoid having to figure out what class which object is before calling the appropriate function.

Furthermore, assuming your original virtual class inheritance idea is what works best for your application you could restructure your data so that it's more easily accessible for caching purposes while also decreasing the size of your classes and having your functions accessible at the same time:

#include <iostream>
#include <array>

using namespace std;

struct BaseFields {
    int id{0};
};

struct BFields {
    int fieldB;
};

struct CFields {
    int fieldB;
};

struct DFields {
    int fieldB;
};

array<BaseFields, 1024> baseData;
array<BaseFields, 1024> bData;
array<BaseFields, 1024> cData;
array<BaseFields, 1024> dData;

struct indexes {
    uint16_t baseIndex; // index where data for Base class is stored in baseData array
    uint16_t bIndex; // index where data for B class is stored in bData array
    uint16_t cIndex;
    uint16_t dIndex;
};

class Base{
    indexes data;
};
class B : public virtual Base{
    public: void bFunction(){
        //do something about "fieldB"
    }
};
class C : public virtual B{
    public: void cFunction(){
        //do something about "fieldC"
    }
};
class D : public virtual B{
};
class E : public virtual C, public virtual D{};

int main (){
    std::cout<<"Base="<<sizeof(Base)<<std::endl; // 8
    std::cout<<"B="<<sizeof(B)<<std::endl; // 16
    std::cout<<"C="<<sizeof(C)<<std::endl; // 16
    std::cout<<"D="<<sizeof(D)<<std::endl; // 16
    std::cout<<"E="<<sizeof(E)<<std::endl; // 24
}

Obviously this is just an example and it assumes you don't have more than 1024 objects at a point, you can increase that number but above 65536 you'd have to use a bigger int to store them, also below 256 you can use uint8_t to store the indexes.

Furthermore if one of the structures above adds very little overhead to it's parent you could reduce the number of arrays you use to store the data, if there's very little difference in the size of objects you can just store all the data in a single structure and have more localized memory accesses. That all depends on your application so I can't give more advice here other than to benchmark what works best for your case.

Have fun and enjoy C++.

Thank. I would not be able to call `e->Function()` though. Plainly remove `virtual` keyword would make be my class represent a wrong meaning. (`E` should has only 1 `B`, sorry if it is not clear in question.) — cppBeginner, Jun 10 '19 at 13:41
I also added an example preserving the virtual keyword to your classes but re-arranging the storage for the data. — xception, Jun 10 '19 at 14:24
I use 56 bytes per `E`, but yours = 24 (inside class) + 12 (outside class) = 48 per `E`. Why your solution happen to save some memory? Special optimization that compiler does on a class that declare no fields? Magic? — cppBeginner, Jun 11 '19 at 03:15
@cppBeginner because in your version each class has a member which uses up space, but the int you use (4 bytes) is smaller than a pointer on 64-bit platform (which is 8 bytes)... in most cases on a 64-bit platform every pointer is ideally aligned to 8 bytes, so the structures containing pointers (due to virtual) will also be aligned to 8 bytes ... so the aligment requirements would extend your structure from 8+4=12 bytes to 16... which leads to a cascading effect on your E due to having 2 structures inside. Hope that was clear enough — xception, Jun 11 '19 at 08:01
addendum: because my class includes all the members in the base class the extra alignment is no longer necessary for derived classes making the whole structure smaller, however this increases the base class... that's why I'm saying you should choose what best fits your usage and your data patterns, if most of your objects are E it might make sense to use a single index and leave gaps in your arrays or even declare everything of the datatype of e and leave gaps where the data is unnecessary — xception, Jun 11 '19 at 08:05
if I didn't answer your question I believe I could make it even smaller if I know the actual data layout of your structures, you can probably do part of it yourself, with the knowledge that alignment will be chosen as 8bytes on 64-bit platforms for anything that's greater than 4 bytes, 4 bytes for anything bigger than 2, 2 bytes for what uses 2 bytes, and 1 for smaller. You can also use bitfields in some scenarios but they might hurt performance rather than improve it! Benchmark your changes to see which works best. — xception, Jun 11 '19 at 08:20
You answered the question in my comment clearly, thank. I am still not satisfied with your main solution though. I still waste a lot memory from virtual inheritance. (I have a complex deep tree of inheritance) Although your packing could help, the underlying problem still severe. I just started to seek a way to not use virtual inheritance instead. But, hey thank! — cppBeginner, Jun 11 '19 at 08:27
Thanks for the clarification, I can also make an example that avoids virtuals completely if that would help, but the benefits depend greatly on what you want to do. — xception, Jun 11 '19 at 08:40
I should really provide greater detail before you invest more time. I will ping you after I finish it. XD — cppBeginner, Jun 11 '19 at 08:48
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/194741/discussion-between-xception-and-cppbeginner). — xception, Jun 11 '19 at 09:12

score 1 · Answer 2 · edited Jun 11 '19 at 08:38

You can avoid virtual inheritance by using the following technique: make all classes except leaf classes fully abstract (no data members). All data access is through virtual getters.

class A {
 virtual int & a() = 0; // private!
 // methods that access a
};

class B : public A {
 virtual int & c() = 0; // private!
 // methods that access b
};

class C: public A {
 virtual int & c() = 0; // private!
 // methods that access c
};

class D: public B, public C {
 int & a() override { return a_; }
 int & b() override { return b_; } 
 int & c() override { return c_; }
 int a_, b_, c_; 
};

This way, you can non-vir inherit a class several times without duplicating any data members (because there are none in the first place).

In the example D has A twice, but this is not important since A is virtually empty.

With a typical implementation you should get a vptr per most derived class plus one vptr for each base class except the first at each level in your hierarchy.

Of course you now have a virtual call overhead for each member access, but nothing comes for free.

If this overhead is too much for you, and you still need polymorphism, you will probably need to implement it in a way that doesn't involve the C++ mechanism of virtual functions at all. There are quite a few ways of doing so but of course each comes with its own special drawbacks so it is hard to recommend one.

Thank. It is a painful answer. It looks like a language limitation. I never use `E` polymorphism-ly (i.e. no cast), so I don't expect to pay for it. About the drawbacks, I don't mind the mess if it appears only in the internal library but not on user's class. Ex. I already do "assemblage" like https://gamedev.stackexchange.com/questions/58693/grouping-entities-of-the-same-component-set-into-linear-memory. My internal library is totally a mess, but 1. interface is clean and 2. user's program is faster. Worth the effort. — cppBeginner, Jun 11 '19 at 04:43
@cppBeginner it's not a language limitation. It's a nature limitation. Nothing comes for free. Write it in assembly, you will encounter the same problem. If you don't use a class pilymorphically, just don't use virtual functions. Use CRTP or a similar idiom. — n. m. could be an AI, Jun 11 '19 at 05:48

reduce size of object (wasted) in Multi virtual inheritance

Edit

Edit 3

2 Answers2