Best way to store std::vector of derived class in a host parent class

Question

I want to store a std::vector<> containing objects which have a common base class, within a host class. The host class should remain copyable since it is stored inside a std::vector<> of its owner class.

C++ offers multiple ways of doing that, but I want to know the best practice.

Here is an example using std::shared_ptr<>:

#include <memory>
#include <vector>

class Base{};
class Derivative1: public Base{};
class Derivative2: public Base{};

class Host{
public: std::vector<std::shared_ptr<Base>> _derivativeList_{};
};

class Owner{
public: std::vector<Host> _hostList_;
};

int main(int argc, char** argv){
  Owner o;
  o._hostList_.resize(10);
  
  Host& h = o._hostList_[0];
  h._derivativeList_.emplace_back(std::make_shared<Derivative1>());
  // h._derivativeList_.resize(10, std::make_shared<Derivative1>()); // all elements share the same pointer, but I don't want that. 
}

Here the main drawback for me is that, to fill _derivativeList_ with many elements, I need to call emplace_back() for every single element. This takes a lot more time than a simple resize(N), which I can't use with std::shared_ptr<> since resize(N, ptr) would put the same pointer instance in every slot.

I thought about using std::unique_ptr<> instead, but this is not viable since it makes the Host class non-copyable (copyability is required by std::vector here).

Otherwise, I could use std::variant<Derivative1, Derivative2>, which can do what I want. However, I would need to list every possible derived class...

Any thoughts or advice on this?

`I need to perform emplace_back() for every single element. This takes a lot more time than a simple resize(N)` Why do you think it takes a lot more time? Don't you think `resize` is just a loop with copy operator? — KamilCuk, May 16 '22 at 22:10
You can use a `shared_ptr` default constructed or initialized with the `nullptr`. For moving a `vector` see https://stackoverflow.com/questions/18282204/proper-way-of-transferring-ownership-of-a-stdvector-stdunique-ptr-int-t — Sebastian, May 16 '22 at 22:16
Should the host class, when being copied, do a deep or shallow copy of the derivatives? — Sebastian, May 16 '22 at 22:26
`emplace_back` is *not* slower than `resize` because the latter is just copying elements and inserting at end. `vector` is optimized to do frequent `emplace_back`s, so using it should be fine. Also, have you considered using its `reserve(size_t capacity)`? — Oasin, May 17 '22 at 01:23
Thanks a lot for your answers! `emplace_back()` is slower than using a simple `resize()`. Maybe this could be fixed by reserving the slots in advance and still using `emplace_back()`. I don't explicitly want to make a copy of the Host class, but the vector it contains needs to be copyable. — nadrino, May 17 '22 at 07:45
*I don't explicitly want to make a copy of the Host class, but the vector it contains needs to be copyable.* Why? And what should happen to the pointed-to Derivatives in the vector member of the Host class when you are copying? Should each be copied individually, or should the vector be copied and point to the old Derivatives? Or should both host copies point to the same vector? — Sebastian, May 17 '22 at 07:49
If I'd like to make a copy of Owner, I'd expect the Derived objects to be copied as well, i.e. with another stored pointer. — nadrino, May 17 '22 at 08:54
"If I'd like to make a copy of Owner, I'd expect the Derived objects to be copied as well" Not gonna happen. If you have a pointer (of any kind) to `Base`, you cannot copy whatever object it points to, so `vector` cannot either. — n. m. could be an AI, May 19 '22 at 18:05

score 4 · Accepted Answer · answered May 19 '22 at 17:46

tldr: Use a variant or type erasure, depending on context.

What you are asking for in C++ would be described roughly as a value type or a type with value semantics. You want a type that is copyable, and copying just "does the right thing" (copies do not share ownership). But at the same time you want polymorphism. You want to hold a variety of types that satisfy the same interface. So... a polymorphic value type.

Value types are easier to work with, so they will make a more pleasant interface. But, they may actually perform worse, and they are more complex to implement. Therefore, as with everything, discretion and judgment come into play. But we can still talk about the "best practice" for implementing them.

Let's add an interface method so we can illustrate some of the relative merits below:

struct Base {
  virtual ~Base() = default;
  virtual auto name() const -> std::string = 0;
};

struct Derivative1: Base {
  auto name() const -> std::string override {
    return "Derivative1";
  }
};

struct Derivative2: Base {
  auto name() const -> std::string override {
    return "Derivative2";
  }
};

There are two common approaches: variants and type erasure. These are the best options we have in C++.

Variants

As you imply, variants are the best option when the set of types is finite and closed, meaning other developers are not expected to extend it with their own types.

using BaseLike = std::variant<Derivative1, Derivative2>;

struct Host {
  std::vector<BaseLike> derivativeList;
};

There's a downside to using the variant directly: BaseLike doesn't act like a Base. You can copy it, but it doesn't implement the interface. Any use of it requires visitation.
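To make the visitation requirement concrete, here is a minimal sketch using the Base, Derivative1 and Derivative2 types from above. The helper names collectNames and namesAfterCopy are hypothetical and exist only for illustration:

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

struct Base {
  virtual ~Base() = default;
  virtual auto name() const -> std::string = 0;
};

struct Derivative1 : Base {
  auto name() const -> std::string override { return "Derivative1"; }
};

struct Derivative2 : Base {
  auto name() const -> std::string override { return "Derivative2"; }
};

using BaseLike = std::variant<Derivative1, Derivative2>;

struct Host {
  std::vector<BaseLike> derivativeList;
};

// Hypothetical helper: every access to an element goes through std::visit,
// because the variant itself has no name() member.
inline auto collectNames(const Host& host) -> std::vector<std::string> {
  std::vector<std::string> names;
  names.reserve(host.derivativeList.size());
  for (const BaseLike& element : host.derivativeList) {
    names.push_back(
        std::visit([](const auto& d) { return d.name(); }, element));
  }
  return names;
}

// Hypothetical demo: fills a host, copies it, and returns the copy's names.
inline auto namesAfterCopy() -> std::vector<std::string> {
  Host original;
  original.derivativeList.emplace_back(Derivative1{});
  original.derivativeList.emplace_back(Derivative2{});
  const Host copy = original;  // plain memberwise copy, no shared ownership
  assert(copy.derivativeList.size() == original.derivativeList.size());
  return collectNames(copy);
}
```

Host is copyable without any hand-written copy operations, but each call site has to spell out the std::visit.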

So you would wrap it with a small wrapper:

class BaseLike: public Base {
public:
  BaseLike(Derivative1&& d1) : data(std::move(d1)) {}
  BaseLike(Derivative2&& d2) : data(std::move(d2)) {}

  auto name() const -> std::string override {
    return std::visit([](auto&& d) { return d.name(); }, data);
  }

private:
  std::variant<Derivative1, Derivative2> data;
};

struct Host {
  std::vector<BaseLike> derivativeList;
};

Now you have a list in which you can put both Derivative1 and Derivative2 and treat a reference to an element as you would any Base&.
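As a rough sketch of how that looks in use, the following fills a Host with resize(N, prototype) and passes each element to a function that only knows about Base. Because the elements are values, resize() copies the prototype into each slot instead of sharing one object. The functions describe and describeAll are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>
#include <vector>

struct Base {
  virtual ~Base() = default;
  virtual auto name() const -> std::string = 0;
};

struct Derivative1 : Base {
  auto name() const -> std::string override { return "Derivative1"; }
};

struct Derivative2 : Base {
  auto name() const -> std::string override { return "Derivative2"; }
};

class BaseLike : public Base {
public:
  BaseLike(Derivative1&& d1) : data(std::move(d1)) {}
  BaseLike(Derivative2&& d2) : data(std::move(d2)) {}

  auto name() const -> std::string override {
    return std::visit([](const auto& d) { return d.name(); }, data);
  }

private:
  std::variant<Derivative1, Derivative2> data;
};

struct Host {
  std::vector<BaseLike> derivativeList;
};

// Hypothetical consumer that only knows about the Base interface.
inline auto describe(const Base& b) -> std::string {
  return "element: " + b.name();
}

// Hypothetical demo: three copies of a prototype plus one Derivative2.
inline auto describeAll() -> std::vector<std::string> {
  Host h;
  h.derivativeList.resize(3, BaseLike(Derivative1{}));  // three separate copies
  h.derivativeList.emplace_back(Derivative2{});
  assert(h.derivativeList.size() == 4);

  std::vector<std::string> lines;
  for (const BaseLike& element : h.derivativeList) {
    lines.push_back(describe(element));  // binds to const Base&
  }
  return lines;
}
```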

What's interesting now is that Base no longer provides much value. Normally, the abstract method guarantees that every derived class implements it. In this scenario, however, we know all the derived classes, and if one of them fails to implement the method, the visitation fails to compile. So Base is not actually needed, and the derived classes can drop it:

struct Derivative1 {
  auto name() const -> std::string {
    return "Derivative1";
  }
};

struct Derivative2 {
  auto name() const -> std::string {
    return "Derivative2";
  }
};

If we need to talk about the interface we can do so by defining a concept:

template <typename T>
concept base_like = std::copyable<T> && requires(const T& t) {
  { t.name() } -> std::same_as<std::string>;
};

static_assert(base_like<Derivative1>);
static_assert(base_like<Derivative2>);
static_assert(base_like<BaseLike>);

In the end, this option looks like: https://godbolt.org/z/7YW9fPv6Y

Type Erasure

Suppose instead we have an open set of types.

The classical and simplest approach is to traffic in pointers or references to a common base class. If you also want ownership, hold the object in a unique_ptr. (shared_ptr is not a good fit, because copies would share ownership instead of being independent.) A unique_ptr is not copyable, so wrap it in a small type and define the copy operations yourself. The classical way to do that is to add a clone() method to the base class interface, which every derived class overrides to copy itself. The wrapper calls clone() whenever it needs to copy.

That's a valid approach, although it has some tradeoffs. Requiring a base class is intrusive, and may be painful if you simultaneously want to satisfy multiple interfaces. std::vector<T> and std::set<T> do not share a common base class but both are iterable. Additionally, the clone() method is pure boilerplate.
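For reference, the classical clone() approach described above could be sketched roughly as follows. The wrapper name BaseHandle and the function copyIsDeep are hypothetical, and the wrapper exposes only what the demo needs:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Base {
  virtual ~Base() = default;
  virtual auto clone() const -> std::unique_ptr<Base> = 0;
  virtual auto name() const -> std::string = 0;
};

struct Derivative1 : Base {
  // The clone() boilerplate every derived class has to repeat.
  auto clone() const -> std::unique_ptr<Base> override {
    return std::make_unique<Derivative1>(*this);
  }
  auto name() const -> std::string override { return "Derivative1"; }
};

struct Derivative2 : Base {
  auto clone() const -> std::unique_ptr<Base> override {
    return std::make_unique<Derivative2>(*this);
  }
  auto name() const -> std::string override { return "Derivative2"; }
};

// Hypothetical wrapper: owns a Base through unique_ptr and copies via clone().
class BaseHandle {
public:
  explicit BaseHandle(std::unique_ptr<Base> p) : ptr(std::move(p)) {}

  BaseHandle(const BaseHandle& other)
      : ptr(other.ptr ? other.ptr->clone() : nullptr) {}
  BaseHandle& operator=(const BaseHandle& other) {
    if (this != &other) {
      ptr = other.ptr ? other.ptr->clone() : nullptr;
    }
    return *this;
  }
  BaseHandle(BaseHandle&&) noexcept = default;
  BaseHandle& operator=(BaseHandle&&) noexcept = default;

  const Base* operator->() const { return ptr.get(); }
  const Base* get() const { return ptr.get(); }

private:
  std::unique_ptr<Base> ptr;
};

struct Host {
  std::vector<BaseHandle> derivativeList;
};

// Hypothetical check: a copied Host owns distinct objects of the same types.
inline bool copyIsDeep() {
  Host original;
  original.derivativeList.emplace_back(std::make_unique<Derivative1>());
  original.derivativeList.emplace_back(std::make_unique<Derivative2>());

  const Host copy = original;  // each element is cloned
  assert(copy.derivativeList.size() == original.derivativeList.size());

  for (std::size_t i = 0; i < original.derivativeList.size(); ++i) {
    const BaseHandle& a = original.derivativeList[i];
    const BaseHandle& b = copy.derivativeList[i];
    if (a.get() == b.get() || a->name() != b->name()) {
      return false;
    }
  }
  return copy.derivativeList.size() == 2;
}
```

Host stays copyable with compiler-generated copy operations, but every derived class must inherit from Base and repeat the clone() override.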

Type erasure takes this one step further and removes the need for a common base class.

In this approach, you still define a base class, but for you, not your user:

struct Base {
  virtual ~Base() = default;
  virtual auto clone() const -> std::unique_ptr<Base> = 0;
  virtual auto name() const -> std::string = 0;
};

And you define an implementation that acts as a type-specific delegator. Again, this is for you, not your user:

template <typename T>
struct Impl: Base {
  T t;
  Impl(T &&t) : t(std::move(t)) {}
  auto clone() const -> std::unique_ptr<Base> override {
    return std::make_unique<Impl>(*this);
  }
  auto name() const -> std::string override {
    return t.name();
  }
};

And then you can define the type-erased type that the user interacts with:

class BaseLike
{
public:
  template <typename B>
  BaseLike(B &&b)
    requires((!std::is_same_v<std::decay_t<B>, BaseLike>) &&
             base_like<std::decay_t<B>>)
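  // Copy lvalue arguments and move rvalue arguments; a plain std::move here
  // would silently steal from an lvalue passed by the caller.
  : base(std::make_unique<Impl<std::decay_t<B>>>(
        std::decay_t<B>(std::forward<B>(b)))) {}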

  BaseLike(const BaseLike& other) : base(other.base->clone()) {}

  BaseLike& operator=(const BaseLike& other) {
    if (this != &other) {
      base = other.base->clone();
    }
    return *this;
  }

  BaseLike(BaseLike&&) = default;

  BaseLike& operator=(BaseLike&&) = default;

  auto name() const -> std::string {
    return base->name();
  }

private:
  std::unique_ptr<Base> base;
};

In the end, this option looks like: https://godbolt.org/z/P3zT9nb5o
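As a rough, self-contained sketch of how the type-erased BaseLike behaves when a Host is copied, the version below makes a few labeled simplifications so that it also builds as C++17: the requires clause is replaced by std::enable_if_t, the base_like concept check is dropped, Impl takes its argument by value, and Base and Impl are placed in a namespace named detail. Tracked and copiesMadeByCopyingHost are hypothetical names used only to count copies:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

namespace detail {
struct Base {
  virtual ~Base() = default;
  virtual auto clone() const -> std::unique_ptr<Base> = 0;
  virtual auto name() const -> std::string = 0;
};

template <typename T>
struct Impl : Base {
  T t;
  explicit Impl(T value) : t(std::move(value)) {}  // by value: a simplification
  auto clone() const -> std::unique_ptr<Base> override {
    return std::make_unique<Impl>(*this);
  }
  auto name() const -> std::string override { return t.name(); }
};
}  // namespace detail

class BaseLike {
public:
  // enable_if stands in for the requires clause (assumption: C++17 build).
  template <typename B, typename = std::enable_if_t<
                            !std::is_same_v<std::decay_t<B>, BaseLike>>>
  BaseLike(B&& b)
      : base(std::make_unique<detail::Impl<std::decay_t<B>>>(
            std::forward<B>(b))) {}

  // Note: a moved-from BaseLike is empty; copying it is not supported here.
  BaseLike(const BaseLike& other) : base(other.base->clone()) {}
  BaseLike& operator=(const BaseLike& other) {
    if (this != &other) {
      base = other.base->clone();
    }
    return *this;
  }
  BaseLike(BaseLike&&) = default;
  BaseLike& operator=(BaseLike&&) = default;

  auto name() const -> std::string { return base->name(); }

private:
  std::unique_ptr<detail::Base> base;
};

// Hypothetical user type: no base class, counts how often it is copied.
struct Tracked {
  static inline int copies = 0;
  Tracked() = default;
  Tracked(const Tracked&) { ++copies; }
  Tracked(Tracked&&) noexcept = default;
  auto name() const -> std::string { return "Tracked"; }
};

struct Host {
  std::vector<BaseLike> derivativeList;
};

// Hypothetical demo: returns how many Tracked copies one Host copy triggers.
inline int copiesMadeByCopyingHost() {
  Host original;
  original.derivativeList.emplace_back(Tracked{});  // moved in, not copied
  const int before = Tracked::copies;
  const Host copy = original;  // clones the stored element
  assert(copy.derivativeList.front().name() == "Tracked");
  return Tracked::copies - before;
}
```

Copying the Host copies the stored object itself, so the two hosts do not share state, and Tracked never had to inherit from anything.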
