abstract classes in std containers

Question

Very often when I program I use polymorphism, because it naturally models the objects I need. I also very often use standard containers to store these objects, and I tend to avoid pointers: they either require me to free the objects myself instead of letting them pop off the stack, or require me to know for sure that the objects stay on the stack for as long as I use the pointer. Of course there are all kinds of pointer-container classes that sort of do this for you, but in my experience they are not ideal either, or even annoying. That is, if such a simple solution existed, it would have been in the C++ language, right? ;)

So let's take a classic example:

#include <iostream>
#include <vector>

struct foo {};
struct goo : public foo {};
struct moo : public foo {};

int main() {
    std::vector<foo> foos;
    foos.push_back(moo());
    foos.push_back(goo());
    foos.push_back(goo());
    foos.push_back(moo());

    return 0;
}

See: http://ideone.com/aEVoSi . This works fine, and if the objects have different sizeof's the compiler may apply slicing. However, due to the fact that C++ knows no instanceof like Java, and to the best of my knowledge no adequate alternative exists, one cannot access the properties of the inherited classes after fetching them as a foo from the vector.

Hence one would use a virtual function. However, a pure virtual function makes it impossible to allocate a foo, and hence one is not permitted to use foo in a vector. See: Why can't we declare a std::vector<AbstractClass>?

For example I may want to be able to print both subclasses, simple feature, right?

#include <iostream>
#include <vector>

struct foo {
    virtual void print() =0;
    virtual ~foo() {}
};

struct goo : public foo {
    int a;
    void print() { std::cout << "goo"; }
};

struct moo : public foo {
    int a,b;
    void print() { std::cout << "moo"; }
};

int main() {
    std::vector<foo> foos;
    foos.push_back(moo());
    foos.push_back(goo());
    foos.push_back(goo());
    foos.push_back(moo());

    for(foo& f : foos) {
        f.print();
    }
    return 0;
}

Source: http://ideone.com/I4rYn9

This is a simple addition, and as a designer I would never have foreseen needing it. I would already be so thrilled by the fact that c++ was able to slice my objects and hence store objects of different sizes in one vector. Unfortunately it can no longer do so once the base class is abstract, as stated here: Why can't we declare a std::vector<AbstractClass>?

The general good solution seems to be to use pointers. But this (1) forces me to do memory management and (2) I'd need to change interfaces and recode a lot of things. For instance, consider that I first had some class interface returning a std::vector<foo>, now it returns a std::vector<foo *>, so I need to check and change all the calls of foo; which is annoying, or even impossible if I am writing a library.

So basically, imho, that is a small feature addition with big code consequences.

My question is about coding standards. How can I prevent these annoyances from occurring? Should I always use pointers and do all my memory management myself? Should I always assume a class might become abstract along the way?

EDIT, ANSWER: Based on the answer of 40two I made this snippet:

#include <iostream>
#include <vector>
#include <memory>

struct foo {
    virtual void print() =0;
    virtual ~foo() {}  // needed: unique_ptr<foo> deletes derived objects through a foo*
};

struct goo : public foo {
    int a;
    void print() { std::cout << "goo"; }
};

struct moo : public foo {
    int a,b;
    void print() { std::cout << "moo"; }
};
typedef std::unique_ptr<foo> foo_ptr;
int main() {
    std::vector<std::unique_ptr<foo> > foos;
    foos.push_back(foo_ptr(new moo));
    foos.push_back(foo_ptr(new goo));
    foos.push_back(foo_ptr(new goo));
    foos.push_back(foo_ptr(new moo));

    for(auto it = foos.begin(); it!=foos.end(); ++it) {
        it->get()->print();
    }
    return 0;
}

Source: http://ideone.com/ym4SY2

your second example yields undefined behavior since abstract class has no virtual destructor — 4pie0, Jun 09 '14 at 17:39
If possible, could you fork-amend the example with the least amount of extra code such that the behavior is defined? Because I think that is not the main point of my code :) **I see you did so, thanks!** — Herbert, Jun 09 '14 at 17:42
You could make the inheritance be private to the `foo` (that is, have a (smart) pointer in `foo`, pointing to a `foo_impl` object, that may be subclassed). Now externally, `foo` appears to be a value object, but internally it may forward to different implementations at runtime with different run-time data. — Mankarse, Jun 09 '14 at 17:49
"I would already be so thrilled by the fact that c++ was able to slice my objects and hence store objects of different sizes in one vector." You seem to have completely misunderstood what slicing is. Slicing is a (bad) name for a conversion. It took your `moo`, and made `foo` from it _that is no longer a `moo`_. "Slicing" is almost always a _bad_ thing. The vector is _not_ storing objects of different sizes, it only stores `foo` objects, which are all the same size. — Mooing Duck, Jun 09 '14 at 17:51
@MooingDuck If I have **class foo : class bar**, where foo has 2 ints and bar has 1, then one would need some technique to store both foo's and bar's in the same container or array, right? How is that called? — Herbert, Jun 09 '14 at 17:55
@bits_international There is no undefined behavior in either example: the vector stores `foo` objects, not pointers to objects derived from `foo`. — Casey, Jun 09 '14 at 17:57
@Herbert: There is no way to store them in the same container or array (assuming you don't know the most derived type at compile time), other than pointers. There could be a container that managed the pointers for you however, making them invisible... — Mooing Duck, Jun 09 '14 at 17:57
@Casey: It doesn't even compile, since `vector` can't instantiate with an abstract value type. The linked code says "error: cannot allocate an object of abstract type ‘foo’" — Mooing Duck, Jun 09 '14 at 17:58
Actually, now that I think about it, it's very easy to pass pointer or reference to a base type to the container to copy from, which means it would require type-trait specializations in order for a generic container to determine the size of the most-derived type in order to know how much memory to allocate. And nobody wants to have to "register" all their derived types with a container, that's inconvenient, error-prone, and has significant performance impacts at runtime. — Mooing Duck, Jun 09 '14 at 18:06
@MooingDuck It does not compile due to the virtual method, not due to slicing. regarding your comment before, does that mean that fields from a derived class are just "not in memory anymore" after a cast? That would explain why abstract classes may not be allocated; since the virtual methods may access "sliced bytes". — Herbert, Jun 09 '14 at 18:50
@Herbert: No, it's worse than that. The one in the vector is a `foo` whose members are a copy of the `foo` members of the `moo` used to construct it, but the `foo` in the vector does not refer to the extra members in any way, shape, or form, because `vector` contains `foo` objects, not `moo` objects. If you call a virtual function, the `foo::` versions are used. If there is no `foo::` version, it can't even be compiled. — Mooing Duck, Jun 09 '14 at 22:38
"Slicing" works like this: `std::vector<short> v; v.push_back(1000000000);`. What happens? The vector holds one short, not one integer. — Mooing Duck, Jun 09 '14 at 22:40
An alternative to a vector of pointers might be [Sean Parent's concept-based polymorphism](http://sean-parent.stlab.cc/papers-and-presentations/#value-semantics-and-concept-based-polymorphism). — Chris Drew, Aug 02 '18 at 21:54

101010 · Accepted Answer · 2014-06-09T18:23:22.830

12

One solution, if your compiler supports C++11 features, would be to use std::vector<std::shared_ptr<foo>> or std::vector<std::unique_ptr<foo>> instead of raw pointers, as in the example below:

#include <iostream>
#include <memory>
#include <vector>

struct foo {
    virtual void print() = 0;
    virtual ~foo() {}  // required so that deleting through a foo pointer is well defined
};

struct goo : public foo {
    int a;
    void print() { std::cout << "goo"; }
};

struct moo : public foo {
    int a,b;
    void print() { std::cout << "moo"; }
};

auto main() -> int {
    std::vector<std::shared_ptr<foo>> v{std::make_shared<goo>(), std::make_shared<moo>()};
    for(const auto& i : v) {  // by reference: avoids copying each shared_ptr
        i->print();
        std::cout << std::endl;
    }
    return 0;
}

or with std::vector<std::unique_ptr<foo>>:

auto main() -> int {
    std::vector<std::unique_ptr<foo>> v;
    v.push_back(std::unique_ptr<goo>(new goo));  // a temporary is already an rvalue, no std::move needed
    v.push_back(std::unique_ptr<moo>(new moo));
    for(auto it(v.begin()), ite(v.end()); it != ite; ++it) {
        (*it)->print();
        std::cout << std::endl;
    }
    return 0;
}

Thus, you wouldn't have to worry about memory deallocation.

edited Jun 09 '14 at 18:23

answered Jun 09 '14 at 17:48


Fair enough, but still it changes the interface :) Although I'm afraid I'm just going to have to live with that, I'd still like an answer which helps me **prevent** the problem. – Herbert Jun 09 '14 at 17:50
@MooingDuck You mean, you cannot prevent this problem? How do library developers do it? – Herbert Jun 09 '14 at 17:56
@Herbert: They use `std::vector<std::unique_ptr<foo>>` – Mooing Duck Jun 09 '14 at 18:00
@MooingDuck So they do that in any case, just in case the class becomes abstract? – Herbert Jun 09 '14 at 18:54
@Herbert: We do that for any container that might need to store a derived class: virtual, abstract, or neither. If you don't do that, and someone tries to store a derived class, that's a bug. – Mooing Duck Jun 09 '14 at 19:07
@MooingDuck But my point was that you can never know for sure: simple feature addition (print) -> huge impact (class becomes abstract). Unless you're extremely intelligent and foreseeing or know some decision criterion for 'might become virtual' that I am not aware of, you'll always need to assume it might happen. I mean, am I plain *inexperienced* if I would be clueless about whether a class might become abstract or not when designing it? – Herbert Jun 10 '14 at 08:43
@Herbert: Technically, you're 100% right. In practice, making a class that derives from a non-virtual base has problems, and is actually quite rare. If it makes sense to publicly inherit from it, it's probably a good idea for the base class to have a virtual destructor. If it's reasonable to assume it will probably be inherited from, use `std::vector<std::unique_ptr<foo>>`. Yes, sometimes people guess wrong and conversions have to happen, but it's rare. (Also, I think you're slightly misusing the word "abstract", but you're close enough that it's not causing a problem here.) – Mooing Duck Jun 10 '14 at 16:28
`v.push_back(std::move(std::unique_ptr<goo>(new goo)));` -> `v.emplace_back(std::make_unique<goo>());` if you have C++14 – Caleth Aug 03 '18 at 14:32

4pie0 · Answer 2 · 2014-06-09T18:11:55.283

5

You can use raw pointers and handle memory correctly

std::vector< AbstractBase*>

or you can use smart pointers, i.e. std::shared_ptr (a smart pointer that retains shared ownership of an object through a pointer) or std::unique_ptr (a smart pointer that retains sole ownership of an object through a pointer and destroys that object when the unique_ptr goes out of scope), and let the library do the memory management for you. You then end up with something like

std::vector< std::shared_ptr<AbstractBase>>

or

std::vector< std::unique_ptr<AbstractBase>>

http://en.cppreference.com/w/cpp/memory/shared_ptr http://en.cppreference.com/w/cpp/memory/unique_ptr

edited Jun 09 '14 at 18:11

answered Jun 09 '14 at 17:49


I'd even go as far as to say "prefer [value_ptr](https://bitbucket.org/martinhofernandes/wheels/src/a3365a24524e4e7c05754689bf44fae160e5ed83/include/wheels/smart_ptr/value_ptr.h%2B%2B?at=default) over both `unique_ptr` and `shared_ptr`", though this is a point of contention in the C++ community. – Mankarse Jun 09 '14 at 18:03
@Mankarse: Personally, I'd agree, but that's much more subjective and nonstandard. However, we all agree that `shared_ptr` is the wrong tool 99% of the time. – Mooing Duck Jun 09 '14 at 18:08
@MooingDuck `shared_ptr` only gets so much attention because it tries to be the "do everything" smart pointer. I think it is mostly useful as an evangelical tool to convince people it is easy to not use `new` and `delete`. – Tim Seguine Jun 09 '14 at 18:38

score 2 · Answer 3 · answered Jun 09 '14 at 18:06

I would recommend the use of a shared_ptr, i.e.:

vector<shared_ptr<foo> >

instead of the raw pointer. That will take care of the vast majority of your memory management problems.

The second issue will still remain as you would need to redesign your interface in some areas. But there is nothing you can do about that as you need pointers when working with abstract base classes. You can't just access foo as a direct reference if foo is abstract. If you can, design your interface such that it hides these details.

Sorry this is probably not the answer you are looking for but this is my best recommendation.

score 2 · Answer 4 · answered Jun 09 '14 at 18:11

You might wrap the polymorphic relationship of your classes and use a smart pointer:

#include <iostream>
#include <memory>
#include <vector>

class Base
{
    protected:
    struct Implementation
    {
        virtual ~Implementation() {}
        virtual void print() const = 0;
    };

    Implementation& self() const { return *m_self; }

    protected:
    Base(std::shared_ptr<Implementation> self)
    :   m_self(self)
    {}

    public:
    void print() const { self().print(); }

    private:
    std::shared_ptr<Implementation> m_self;
};

class Foo : public Base
{
    protected:
    struct Implementation : Base::Implementation
    {
        virtual void print() const { std::cout << "Foo\n"; }
    };

    Implementation& self() const { return static_cast<Implementation&>(Base::self()); }

    public:
    Foo() : Base(std::make_shared<Implementation>()) {}
};

class Goo : public Base
{
    protected:
    struct Implementation : Base::Implementation
    {
        virtual void print() const { std::cout << "Goo\n"; }
    };

    Implementation& self() const { return static_cast<Implementation&>(Base::self()); }

    public:
    Goo() : Base(std::make_shared<Implementation>()) {}
};

int main() {
    std::vector<Base> v = { Foo(), Goo() };
    for(const auto& x: v)
        x.print();
}

I like the idea, but I don't think it is appropriate for my goals, since like I said it is verbose and contains a lot of boilerplate code. Thank you though! — Herbert, Jun 10 '14 at 09:40

Wormer · Answer 5 · 2018-08-03T16:44:12.523

How about writing a wrapper around foo that encapsulates a foo* and implicitly converts to foo&?

It uses copy semantics that call the underlying clone on the stored object to make a deep copy. This is at least no worse than the original intent to store everything by value. If you end up storing everything as a pointer to the abstract base anyway, then this has the same level of indirection as a unique_ptr but is copyable (whereas unique_ptr is not). On the other hand, it has less overhead than shared_ptr.

Add clone() to the abstract hierarchy:

struct foo {
    virtual void print() const = 0;

    virtual ~foo() {}
    virtual foo* clone() = 0;
};

struct goo : public foo {
    int a;
    void print() const { std::cout << "goo" << std::endl; }
    foo* clone() { return new goo(*this); }
};

struct moo : public foo {
    int a,b;
    void print() const { std::cout << "moo" << std::endl; }
    foo* clone() { return new moo(*this); }
};

Define a foo_w wrapper around foo (see the copy-and-swap idiom):

struct foo_w {
    foo_w(foo *f = nullptr) : fp(f) {}
    ~foo_w() { delete fp; }

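    // copying an empty (default-constructed or moved-from) wrapper must not dereference nullptr
    foo_w(const foo_w& that) : fp(that.fp ? that.fp->clone() : nullptr) {}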
    foo_w(foo_w&& that) : foo_w() { swap(*this, that); }

    foo_w& operator=(foo_w rhs) {
       swap(*this, rhs);
       return *this;
    }

    friend void swap(foo_w& f, foo_w& s) {
       using std::swap;
       swap(f.fp, s.fp);
    }

    operator foo&() { return *fp; } 
    operator const foo&() const { return *fp; } 

    foo& get() { return *fp; }
    const foo& get() const { return *fp; }

    // if we rewrite interface functions here
    // calls to get() could be eliminated (see below)
    // void print() { fp->print(); };

private:
    foo *fp;
};

The usage is as follows:

#include <iostream>
#include <memory>
#include <vector>

// class definitions here...

int main() {
    std::vector<foo_w> foos;
    foos.emplace_back(new moo);
    foos.emplace_back(new goo);
    foos.emplace_back(new goo);
    foos.emplace_back(new moo);

    // variant 1: do it through a getter:
    for(auto it = foos.begin(); it!=foos.end(); ++it) {
        it->get().print();
        // the presence of a proxy is almost hidden
        // if we redefine interface in foo_w
        // it->print();
    }

    // variant 2: use it through reference to foo
    for(auto it = foos.begin(); it!=foos.end(); ++it) {
        foo& fr = *it;
        fr.print();
    }

    // variant 3: looks really nice with range-for
    for(foo& fr : foos)
        fr.print();

    return 0;
}

The wrapper's behavior is really up to what suits your needs. If you are OK with unique_ptr not being copyable, that is probably the better way to go; for me copyability was critical, so I ended up with this. Also have a look at std::reference_wrapper for storing reference-like objects in a container.
