2

I am building modelling software and I have a few questions about how to get the best performance.

1) Should I use std::vector<class> or std::vector<class*>? My class is quite complicated and big, and I think the second option is better: since std::vector tries to allocate memory contiguously, there might not be a contiguous block of memory to store a million class instances, but when I just store pointers, the instances do not have to be stored contiguously, only the pointers do, and the computer might have space to do this. Is this reasoning correct?

2) As I said, I will have millions of instances (for a proper simulation I will need more than a billion). Is inheritance a smart thing to use here? For my simulation, there are multiple different types which inherit from the same base class:

class A <- class B
        <- class C
        <- class D

Should I avoid inheritance, as I keep hearing that there is a performance penalty for using it?

3) Also, how do I store all these different classes in a std::vector? Can a std::vector<base_class*> or std::vector<base_class> store class B, class C, and class D objects, which all inherit from the base class?

4) In the previous version of the program, I used multithreading by making different threads handle different sections of the std::vector. Is there a better way to do the threading?

5) Should I use smart pointers? Since I have so many objects, will they degrade performance?

I am in the planning stage and any help is greatly appreciated.

Mr Lister
nnrales

  • Don't avoid abstraction. Measure. – Cheers and hth. - Alf Feb 26 '16 at 02:03
  • You may consider [std::deque](http://en.cppreference.com/w/cpp/container/deque), which also has constant access time but allocates data in blocks which are then never reallocated. – user3159253 Feb 26 '16 at 02:04
  • @user3159253 I will look into std::deque, but quick question: does it allow me to iterate through the items as quickly as std::vector does? – nnrales Feb 26 '16 at 02:06
  • If you wish to have a container of mixed types, then pointers are unavoidable, probably unique_ptrs. Handling different blocks of data completely independently in parallel (think of sharding) is a smart idea. Regarding performance of deque: yes, access time is O(1) and only slightly worse than that of vector (it has to calculate the position of the required block first). And yes, do performance tests. – user3159253 Feb 26 '16 at 02:12
  • Performance penalties for inheritance can be completely different. If you use `virtual` methods a lot, then yes, it can degrade performance compared to the non-virtual case. On the other hand, inlining can help a lot at higher optimization levels. But, again, check every idea with a test; depending on the compiler and target platform, results can be completely different. – user3159253 Feb 26 '16 at 02:24

3 Answers

7

I deal with problems like this every day in a professional setting (I'm a C++ programmer by trade, dealing with big-data sets). As such, what I'm about to say here is as much personal advice as it is an answer. I won't go all out on the simple parts:

1 - Yes, store pointers; reallocations and moves will be much faster with pointers than with the full class objects.

2 - Yes, use inheritance if the objects have information in common; I imagine in this case they most likely do, as you're considering it. If they don't, why would you store them together?

3 - Store them all using smart pointers to the base class (the parent object). You can then add a single virtual "get_type" function that returns an enumeration, and convert to a child when you need to. This will save the overhead of providing multiple virtual methods if you don't need child-specific data often.

4 - Arguable, but threading separate parts of a larger array is the simpler approach (and when you're dealing with data of huge complexity, simpler is better).

Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it? ~ Brian Kernighan

5 - There will be some small penalty for using smart pointers (as explained in this question). However, in my opinion that penalty (especially with unique_ptr) is so small compared to the ease of use and loss of complexity that it's definitely worth it.

And putting it all together:

#include <memory>
#include <vector>

enum ChildType { Child_1 = 0, Child_2 = 1 };

class Abstract_Parent
{
    public:
    virtual ~Abstract_Parent() = default; // so deleting through a base pointer is safe
    virtual ChildType GetType() = 0;
};

class Child_One : public Abstract_Parent
{
    public:
    ChildType GetType() override { return Child_1; }
    void Do_Something_Specific() { /* child-specific work */ }
};

class Child_Two : public Abstract_Parent
{
    public:
    ChildType GetType() override { return Child_2; }
};

std::vector<std::unique_ptr<Abstract_Parent>> Data;

void Some_Function()
{
    //this is how to insert a child-object
    std::unique_ptr<Abstract_Parent> Push_me_Back(new Child_One());
    Data.push_back(std::move(Push_me_Back));

    if(Data[0]->GetType() == Child_1)
    {
        // GetType() already told us the dynamic type, so a static_cast suffices
        Child_One *Temp_Ptr = static_cast<Child_One*>(Data[0].get());
        Temp_Ptr->Do_Something_Specific();
    }
}
John Bargman
  • I guess the real reason to use pointers instead of objects would be polymorphism. Moving and reallocation would not be the problem if you insert/remove objects only in the back of vector. – Nikolay K Feb 26 '16 at 02:52
  • @NikolayKondratyev, I must disagree. Even inserting objects only at the back of the vector, you will eventually reach its reserved memory size. When you do, memory is allocated for a larger block and the data is moved. If you're inserting millions of elements this becomes very costly. Secondly, many complex algorithms require the data to be sorted, and during std::sort swap operations must take place. All of these are going to cost more if the data is larger than a pointer. – John Bargman Feb 26 '16 at 02:55
  • Of course, if you know the maximum number of objects before you begin, this can be mitigated - to a point. – John Bargman Feb 26 '16 at 02:59
  • @JohnBargman Thank you for your answer. Are there any books/resources which you found helpful while writing high-performance, multithreaded code dealing with many, many objects? – nnrales Feb 26 '16 at 17:05
  • @nnrales, I wish I could - however, I've learned most of my design patterns from working in the industry. I have heard really good things about "Effective Modern C++" by Scott Meyers. My personal recommendation would be to watch some of the CppCon videos by the creators of C++, as well as reading some of the articles on "modern C++" such as this - https://msdn.microsoft.com/en-GB/library/hh279654.aspx – John Bargman Feb 27 '16 at 01:21
3

1.) That depends on your use case. You will use a pointer if you want to access objects through a base class pointer. On the other hand, you lose the advantage of contiguous memory and cache locality of code and data.

2.) If you need 1 billion instances, then every additional byte per object will increase your memory footprint. For example, an additional pointer to your virtual function table (vptr) of 8 bytes will increase your memory requirements by 8 GBytes. Storing every type in a different vector, without a virtual base class, does not have this overhead.

2b) Yes, you should avoid inheritance with virtual functions if you aim for performance. The instruction cache will be thrashed if virtual functions are called with different implementations. At the least, you can sort your big vector by type to minimize this problem.

3.) You must use the pointer option to prevent slicing if you go for a base class with virtual functions.

4.) More information is needed; this should be asked as a separate question.

5.) Every indirection will degrade performance.

knivil
2

1) Should I use std::vector<class> or std::vector<class*>?

False dichotomy. There are a couple of other options:

  • boost::ptr_vector<class>
  • std::vector<std::unique_ptr<class>>
  • Probably even more.

Personally, I like boost::ptr_vector<class>, as it stores an owned pointer (thus memory deallocation is done automatically), but when accessing members they are returned as a reference to the object (not a pointer). Thus using them with standard algorithms is vastly simplified over the other techniques.

My class is quite complicated and big, and I think the second option is better: since std::vector tries to allocate memory contiguously, there might not be a contiguous block of memory to store a million class instances,

The real question here is if you can pre-calculate the maximum size of your vector and reserve() the required amount of space. If you can do this (and thus avoid any cost of copying) std::vector<class> would be the best solution.

This is because having the objects in contiguous storage is usually a significant advantage in terms of speed (especially when scanning a vector). The ability to do this should not be underestimated when you have huge datasets (especially in the billion range).

but when I just store pointers, the instances do not have to be stored contiguously, only the pointers do, and the computer might have space to do this. Is this reasoning correct?

By using pointers, you are also significantly increasing the amount of memory required by the application as you need to store the object and the pointer to the object. Over billions of objects this can be a significant cost.

2) As I said, I will have millions of instances (for a proper simulation I will need more than a billion). Is inheritance a smart thing to use here?

Impossible to say without much more information.

3) Also, how do I store all these different classes in a std::vector? Can a std::vector<base_class*> or std::vector<base_class> store class B, class C, and class D, which all inherit from the base class?

But if you do use inheritance, you will not be able to use std::vector<class> directly. You will need to store a pointer to the base class. But that does not preclude the other three techniques.

4) In the previous version of the program, I used multithreading by making different threads handle different sections of the std::vector. Is there a better way to do the threading?

This seems a reasonable approach (assuming that the ranges don't overlap and are contiguous). Don't create more threads than you have available cores.

Should I use smart pointers? Since I have so many objects, will they degrade performance?

Use of unique_ptr over a normal pointer has zero overhead (assuming you don't use a custom deleter). The actual generated code will be basically equivalent.

Martin York