Efficient container for Object and Subclass

Question

What I need

I am in need of a container that stores instances of objects A and derived subclasses B. Specifically, I need to implement the following tasks:

Efficient addition of new instances (append is sufficient).
Efficient removal of instances at random places in the container (no lookup by index is required; removal of objects will happen while iterating over the container).
Efficient iteration over elements in the container. This is the most important part, as iteration is needed more often than manipulation.

Example

The header file could look like something along the lines below:

int globalB = 5;

// Base class
class A {
public:
    A(a) : a(a);
    ~A();

    int a;

    virtual int get_b() {
        return globalB;
    }
};

// Derived class
class B : public A {
public:
    B(a, b) : A(a), b(b);
    ~B();

    int a;
    int b;

    int get_b() {
        return b;
    }
};

// Container
class Container {
public:
    Container();
    ~Container();

    // adds an A element
    void add_element(a);

    // adds a B element
    void add_element(a, b);

    // removes all elements with (elem.a == 0)
    void remove_a0_elements();

    // Iterator (I will still have to figure out how this is done properly)
    struct Iterator { /* ... */ };
};


static int example_usage() {

    auto container = Container();
    for (int a=1; i<=100; i++) {
        container.add_element(a);
        container.add_element(a, a);
    }

    int sum = 0;
    for (auto &elem : container) {
        sum += elem.get_b();
    }

    return sum;
}

Note that unlike suggested by the example, the elements will not be added in consecutive operations but rather at random times in the program. Of course any structure of the container with which I achieve the tasks in the example is fine, too (e.g. adding an elment by handing it over rather than constructing it in-place). If there is some memory overhead, that would not be of major concern, as all objects together are not very large.

My thoughts so far

I have thought of using a vector of std::unique_ptr for the task, as suggested here. However, I am afraid that the memory would be scattered that way, considerably reducing the performance of the iterations (see here). Another thought was to let Container wrap two vectors - of A and B, respectively - but then I would not know how to build the iterator. Furthermore, this would make it difficult to use further subclasses (I will need it to work for at least two pairs of base class and subclass).

Questions

Is there any standard container that can do what I need?
If not, what would be an elegant way to implement a container as needed?
Is there a way to 'reserve' a chunk of memory to construct the elements of Container in without knowing their size? Then I could wrap a vector of pointers and circumvent the issue of scattered memory.

@RichardCritten Would that not cause problems due to scattered memory? (See question) — Samufi, May 15 '21 at 09:40
I'm not 100% sure on this, but I think I got an answer for the part about reserving memory. You could reserve a rather large chunk of memory, and then construct an element in-place using `placement-new`: https://en.cppreference.com/w/cpp/language/new — lionkor, May 15 '21 at 09:43
Considering the requirements I would rather recommend `std::deque` instead of `std::vector`. Also use `emplace` functions instead of `push`. — Some programmer dude, May 15 '21 at 09:43
@RichardCritten That would cause quite the horrible cache-locality, would it not? — lionkor, May 15 '21 at 09:43
Regarding performance - get it working - measure - optimize. — Richard Critten, May 15 '21 at 09:46

score 1 · Answer 1 · answered May 16 '21 at 19:43

The choices of container type and of element storage are actually orthogonal here: even if the elements are allocated separately, std::vector<std::unique_ptr<A>> can’t support efficient deletion and iteration simultaneously. You can of course reset an element, but then you must spend time skipping it every iteration.

It’s possible to augment the data structure with a free list to allow empty slots to be reused, and with counts of empty slots to allow them to be skipped efficiently during iteration. The result is the “colony” data structure (likely to be included in C++23 as std::hive). Note that it doesn’t append: new elements reuse the space of deleted ones in an unspecified order.

The element type remains to be chosen: you can use std::variant<A,B>, a similar custom type that knows that B derives from A (so as to provide A& for clients more easily), or you can use the separate containers approach (with its own effect on ordering).

Efficient container for Object and Subclass

1 Answers1