push_back objects into vector memory issue C++

Question

Compare these two ways of initializing a vector of objects.

1.
    vector<Obj> someVector;
    Obj new_obj;
    someVector.push_back(new_obj);

2.
    vector<Obj*> ptrVector;
    Obj* objptr = new Obj();
    ptrVector.push_back(objptr);

The first one pushes the actual object rather than a pointer to the object. Does vector's push_back copy the value being pushed? My problem is that I have huge objects and very long vectors, so I need to find the best way to save memory.

Is the second way better?
Are there other ways to keep a vector of objects or pointers that let me find each object later while using the least memory?

It moves it in C++11 if it can, and the first one is trying to push a function on. — chris, Sep 25 '13 at 02:49
@chris What do I need to write explicitly in the code to let it move but not copy? Or specify compilation using C++11? Or I don't have to write anything it just does it for me? The first one there I meant a constructor. — Logan Yang, Sep 25 '13 at 02:54
http://stackoverflow.com/questions/3106110/what-is-move-semantics — Ed S., Sep 25 '13 at 02:55
@LoganYang, `-std=c++11` is all you need. `push_back` has an overload taking an rvalue reference (which won't be used if you make the object separately like that and don't use something like `std::move` to pass it in before not using the object again), and there's also `emplace_back`. — chris, Sep 25 '13 at 03:02
@EdS. Thanks for the link, I'll just learn about the move semantics here — Logan Yang, Sep 25 '13 at 03:31

score 2 · Accepted Answer · answered Sep 25 '13 at 03:12

More efficient than either of the two options above is a third one that the question does not list:

    std::vector<Obj> someVector;
    someVector.reserve(preCalculatedSize);
    for (int i = 0; i < preCalculatedSize; ++i)
      someVector.emplace_back();

emplace_back directly constructs the object into the memory that the vector arranges for it. If you reserve prior to use, you can avoid reallocation and moving.
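
To make the difference between the insertion styles concrete, here is a minimal sketch that counts copy and move constructions. `Counted`, `Tally` and `insert_once` are hypothetical names invented for this illustration; they stand in for the question's `Obj`. The vector is reserved first so that no reallocation interferes with the count.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for Obj: it only counts how instances are created.
struct Counted {
    static int copies;
    static int moves;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

struct Tally { int copies; int moves; };

// Performs a single insertion into a pre-reserved vector and reports the tally.
// mode 0: push_back(lvalue)
// mode 1: push_back(std::move(lvalue))
// mode 2: emplace_back()
Tally insert_once(int mode) {
    Counted::copies = 0;
    Counted::moves = 0;
    std::vector<Counted> v;
    v.reserve(4);                    // capacity is sufficient, so no reallocation
    Counted obj;
    if (mode == 0)      v.push_back(obj);             // copies obj
    else if (mode == 1) v.push_back(std::move(obj));  // moves from obj
    else                v.emplace_back();             // constructs in place
    return Tally{Counted::copies, Counted::moves};
}
```

Under these assumptions, `insert_once(0)` reports one copy, `insert_once(1)` reports one move, and `insert_once(2)` reports neither.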

However, if the objects truly are large, the cache-locality advantage of contiguous storage matters less, so a vector of smart pointers makes sense. Thus the fourth option:

    std::vector< std::unique_ptr<Obj> > someVector;
    std::unique_ptr<Obj> element( new Obj );
    someVector.push_back( std::move(element) );

is probably best. Here, we represent the lifetime of the data and how it is accessed in the same structure with nearly zero overhead, preventing it from getting out of sync.

You have to explicitly std::move the std::unique_ptr around when you want to move it. If you need a raw pointer for whatever reason, .get() is how to access it. -> and * and explicit operator bool are all overloaded, so you only really need to call .get() when you have an interface that expects an Obj*.
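
A short sketch of those access patterns follows. The `Obj` definition with a `value` member and the `read_value` function are assumptions made for the example; `read_value` plays the role of an interface that expects a raw `Obj*`.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Obj { int value = 0; };   // hypothetical payload type

// Hypothetical legacy interface that takes a raw, non-owning pointer.
int read_value(const Obj* p) { return p->value; }

int demo() {
    std::vector<std::unique_ptr<Obj>> v;
    std::unique_ptr<Obj> element(new Obj);
    element->value = 42;               // operator-> works like a raw pointer
    v.push_back(std::move(element));   // ownership transfers to the vector
    assert(!element);                  // a moved-from unique_ptr is null
    assert(v.front());                 // explicit operator bool: non-null
    assert((*v.front()).value == 42);  // operator* also works
    return read_value(v.front().get()); // .get() only at the raw-pointer boundary
}
```

The vector remains the sole owner here; the pointer returned by `.get()` must not be deleted or kept beyond the element's lifetime.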

Both of these solutions require C++11. If you lack C++11, and the objects truly are large, then the "vector of pointers to data" is acceptable.

In any case, what you really should do is determine which option matches your model best, measure performance, and optimize only if there is an actual performance problem.

I would suggest "vector of smart pointers to data", not "vector of pointers to data" — Ed S., Sep 25 '13 at 04:03

score 1 · Answer 2 · edited May 23 '17 at 10:27

If your Obj class doesn't require polymorphic behavior, then it is better to simply store the Obj objects directly in a vector<Obj>.

If you store objects in vector<Obj*>, then you are assuming the responsibility of manually deallocating those objects when they are no longer needed. Better, in this case, to use vector<std::unique_ptr<Obj>> if possible, but again, only if polymorphic behavior is required.
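
The ownership difference can be sketched as follows. The instance-counting `Obj` and both function names are hypothetical, written only to show who is responsible for cleanup when the vector goes out of scope.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical Obj that tracks how many instances are currently alive.
struct Obj {
    static int alive;
    Obj() { ++alive; }
    ~Obj() { --alive; }
};
int Obj::alive = 0;

// With raw pointers, destroying the vector frees only the pointer array.
// Returns how many objects are still alive after the vector is gone.
int alive_after_raw_vector_dies() {
    std::vector<Obj*> keep;            // second copy of the pointers for cleanup
    {
        std::vector<Obj*> v;
        for (int i = 0; i < 3; ++i) v.push_back(new Obj());
        keep = v;
    }                                  // v is destroyed here, the objects are not
    int still_alive = Obj::alive;
    for (Obj* p : keep) delete p;      // manual deallocation is our job
    return still_alive;
}

// With unique_ptr, destroying the vector destroys every object it owns.
int alive_after_unique_vector_dies() {
    {
        std::vector<std::unique_ptr<Obj>> v;
        for (int i = 0; i < 3; ++i) v.push_back(std::unique_ptr<Obj>(new Obj()));
    }
    return Obj::alive;
}
```

In this sketch the raw-pointer version still has three live objects after its vector is destroyed, while the `unique_ptr` version has none.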

The vector will store the Obj objects on the heap (by default, unless you override the allocator in the vector template). These objects will be stored in contiguous memory, which can also give you better cache locality, depending upon your use case.

The drawback to using vector<Obj> is that frequent insertion into or removal from the vector may cause reallocation and copying of your Obj objects. However, that usually will not be the bottleneck in your application, and you should profile before concluding that it is.

With C++11 move semantics, the implications of copying can be much reduced.
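
As a rough illustration of what reallocation costs, the sketch below counts how many times elements are copied or moved while a vector grows. `Tracked`, `Tally` and `fill` are names made up for this example. It assumes the element type has a `noexcept` move constructor, which is what lets the vector move rather than copy when it relocates elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts copy and move constructions.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }   // noexcept lets vector prefer moves
};
int Tracked::copies = 0;
int Tracked::moves = 0;

struct Tally { int copies; int moves; };

// Appends n elements in place, optionally reserving first, and reports
// how many element copies and moves the growth caused.
Tally fill(std::size_t n, bool reserve_first) {
    Tracked::copies = 0;
    Tracked::moves = 0;
    std::vector<Tracked> v;
    if (reserve_first) v.reserve(n);
    for (std::size_t i = 0; i < n; ++i) v.emplace_back();
    return Tally{Tracked::copies, Tracked::moves};
}
```

Without `reserve`, the moves come entirely from reallocation; the exact number depends on the library's growth factor. With `reserve`, both counters stay at zero.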

I think the OP is worried about the copy itself though. Any answer here should mention move semantics and the fact that, if using pointers, you now have to deallocate each element yourself. — Ed S., Sep 25 '13 at 02:54

Vaughn Cato · Answer 3 · 2013-09-25T03:03:30.903

Using a vector<Obj> will take less memory to store if you can reserve the size ahead of time. vector<Obj *> will necessarily use more memory than vector<Obj> if the vector doesn't have to be reallocated, since you have the overhead of the pointers and the overhead of dynamic memory allocation. This overhead may be relatively small though if you only have a few large objects.
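
A back-of-the-envelope version of that comparison is sketched below. The 1 KiB `Obj` and the `per_alloc_overhead` parameter are assumptions for illustration; real allocator bookkeeping costs vary by platform and are not specified by the standard.

```cpp
#include <cassert>
#include <cstddef>

struct Obj { char payload[1024]; };   // hypothetical large object

// Element storage for a vector<Obj> whose capacity is exactly n.
constexpr std::size_t bytes_by_value(std::size_t n) {
    return n * sizeof(Obj);
}

// Storage for a vector<Obj*> of capacity n plus n separately allocated objects.
// per_alloc_overhead is an assumed per-allocation bookkeeping cost in bytes.
constexpr std::size_t bytes_by_pointer(std::size_t n, std::size_t per_alloc_overhead) {
    return n * (sizeof(Obj*) + sizeof(Obj) + per_alloc_overhead);
}
```

For example, `bytes_by_pointer(1000, 16) - bytes_by_value(1000)` is `1000 * (sizeof(Obj*) + 16)` bytes, which is small next to the roughly 1 MB of object data, consistent with the remark that the overhead is relatively minor for large objects.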

However, if you are very close to running out of memory, using vector<Obj> may cause a problem if you can't reserve the correct size ahead of time because you'll temporarily need extra storage when reallocating the vector.
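
The temporary extra storage can be observed with a small counting allocator. This is only a sketch: `CountingAlloc`, the global counters and `peak_bytes` are hypothetical helpers, and the 256-byte `Obj` is an arbitrary size chosen for the example. During a reallocation the old and new buffers exist at the same time, so the peak exceeds the final size.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Illustrative global byte counters updated by the allocator below.
static std::size_t g_live = 0;
static std::size_t g_peak = 0;

// Minimal C++11 allocator that records live and peak allocated bytes.
template <class T>
struct CountingAlloc {
    using value_type = T;
    CountingAlloc() = default;
    template <class U> CountingAlloc(const CountingAlloc<U>&) {}
    T* allocate(std::size_t n) {
        g_live += n * sizeof(T);
        if (g_live > g_peak) g_peak = g_live;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t n) {
        g_live -= n * sizeof(T);
        ::operator delete(p);
    }
};
template <class T, class U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

struct Obj { char payload[256]; };   // hypothetical object type

// Fills a vector with n elements and returns the peak bytes held at any moment.
std::size_t peak_bytes(std::size_t n, bool reserve_first) {
    g_live = 0;
    g_peak = 0;
    std::vector<Obj, CountingAlloc<Obj>> v;
    if (reserve_first) v.reserve(n);
    for (std::size_t i = 0; i < n; ++i) v.emplace_back();
    return g_peak;
}
```

Comparing `peak_bytes(1000, false)` with `peak_bytes(1000, true)` shows the higher peak of the unreserved case; how much higher depends on the implementation's growth policy.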

Having a large vector of large objects may also cause an issue with memory fragmentation. If you can create the vector early in the execution of your program and reserve the size, this may not be an issue, but if the vector is created later, you might run into a problem due to memory holes on the heap.

I'll simply stick to the first choice then. Thanks! — Logan Yang, Sep 25 '13 at 03:38

score 0 · Answer 4 · edited May 23 '17 at 12:23

Under the circumstances, I'd consider a third possibility: use std::deque instead of std::vector.

This is kind of a halfway point between the two you've given. A vector<obj> allocates one huge block to hold all the instances of the objects in the vector. A vector<obj *> allocates one block of pointers, but puts each instance of the object in a block of its own, so you get N objects plus N pointers.

A deque will create a block of pointers and a number of blocks of objects -- but (at least normally) it'll put a number of objects (call it M) together into a single block, so you get one block of N/M pointers and N/M blocks of objects.

This avoids many of the shortcomings of either a vector of objects or a vector of pointers. Once you allocate a block of objects, you never have to reallocate or copy them. You do (or may) eventually have to reallocate the block of pointers, but it'll be smaller (by a factor of M) than the vector of pointers you would manage by hand.
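
The "never reallocate or copy" property can be checked with a small sketch. The `Obj` with an `id` member and both function names are assumptions for this example. The standard guarantees that `push_back` on a `std::deque` leaves references and pointers to existing elements valid, whereas a growing `std::vector` has to change capacity along the way.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct Obj { int id; };   // hypothetical element type

// Returns true if the first element keeps its address across n push_backs.
bool deque_keeps_addresses(int n) {
    std::deque<Obj> d;
    d.push_back(Obj{0});
    const Obj* first = &d.front();
    for (int i = 1; i < n; ++i) d.push_back(Obj{i});
    return first == &d.front() && first->id == 0;
}

// Counts how often a vector's capacity changes while n elements are appended.
int vector_reallocations(int n) {
    std::vector<Obj> v;
    int count = 0;
    for (int i = 0; i < n; ++i) {
        std::size_t before = v.capacity();
        v.push_back(Obj{i});
        if (v.capacity() != before) ++count;   // a new, larger buffer was allocated
    }
    return count;
}
```

Note that deque iterators are still invalidated by `push_back`; only references and pointers to the elements stay valid.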

One caveat: if you're using Microsoft's compiler/standard library, this may not work very well -- they have some strange logic (still present up through VS 2013 RC) that means if your object size is larger than 16, you'll get only one object per block -- i.e., equivalent to your vector<obj *> idea.
