Should my std::vector contain pointers or structs?

Question

I know that holding pointers incurs the overhead of an extra dereference operation but it saves me including the (potentially large) header file that contains the definition of my struct.

However my preference is to be determined by the advantage of having a std::vector<myStruct> *ptr2Vect member. Namely, not having to call delete on each element. How big a performance advantage is this? Can vector really allocate objects on the stack? I am fairly new to template classes and wonder if it could be possible for a dynamic array to expand on the stack and at what price?

_EDIT_

I fail in understanding default copy constructor and operator= members and am trying to keep things as simplistic structs. I have neither implementation defined explicitly so fear that making the vector element an object instead of pointer will create temporary object at assignment time that will be destructed and so ruin its copy.

_EDIT_

Sorry for the delay in delivering pertinent information (I am shy with my code).

I want to call push_back(newObj). Now if I don't use pointers I have a big problem in that I don't want to perform a deep copy but my dtor will free up the memory shared by the LHS and RHS of this invocation of the copy constructor.

You might also be interested in [boost pointer containers](http://www.boost.org/doc/libs/release/libs/ptr_container/doc/tutorial.html) — Flexo, Aug 27 '11 at 21:59
Yes, the whole boost project sounds fantastic, but I've enough to learn and I'd prefer to get a thorough grounding in the basics. — John, Aug 27 '11 at 22:00
By your wording, I think you may be confusing a pointer to a vector and a vector of pointers. `std::::vector Vect` would be a vector of pointers. — JohnPS, Aug 27 '11 at 22:05
Another option would be a vector of shared_ptr< T >. Originally also in Boost, but these days also available as part of the new C++0x/C++11 standard. — TeaWolf, Aug 27 '11 at 22:08
Avoiding an include isn't a very good basis for a decision. Whether you want a value or pointer is completely independent of a simple include. Maybe it's the include that needs to change, who knows. But if you want some structures in a vector, include the definition of the structure and use the vector. That's just how it is. — GManNickG, Aug 27 '11 at 22:17
Thanks GMan, I will put some more coins in the meter and move to a faster computer :) — John, Aug 27 '11 at 22:34
@JohnPS, as would std::::vector Vect. I use pointers to vectors however. — John, Aug 27 '11 at 22:55
In general, imagine you wanted to make an array. If you decided to use structs or pointers for any reasons, you can do the EXACT same thing with `vector` and the exact same reasons hold. — Shahbaz, Aug 27 '11 at 23:03
@GManNickG I wouldn't say that avoiding an include is a bad basis for a decision; it all depends on _what_ you are including, especially as some Boost libraries are very complex and can drastically increase your compiles times. — Skrylar, Jul 28 '12 at 09:19
Please refer to this https://stackoverflow.com/questions/2275076/is-stdvector-copying-the-objects-with-a-push-back — M.Hefny, Oct 27 '19 at 08:05

Flexo · Accepted Answer · 2011-08-28T00:08:12.500

8

As a general rule of thumb I'd say you probably don't want to put pointers in your containers, unless there's a good reason.

Possible reasons to consider pointers:

You have virtual functions
You have a class hierarchy
You don't know the size of the objects where you're using them this. (You can only use pointers or references in that case and you can't have a vector of references)
Your objects are exceedingly large (probably benchmark this)

The biggest reason not to put pointers in containers would be that it makes it much easier not to make a mistake and accidentally leak memory. This is especially true when you start to consider exceptions.

Not having pointers in your containers makes it much easier to use STL <algorithms>, consider:

#include <vector>
#include <string>
#include <iostream>
#include <iterator>
#include <algorithm>

int main() {
  std::vector<std::string> test;
  test.push_back("hello world");
  std::copy(test.begin(), test.end(), 
            std::ostream_iterator<std::string>(std::cout, "\n"));
}

Versus:

#include <vector>
#include <string>
#include <iostream>
#include <iterator>
#include <algorithm>

int main() {
  std::vector<std::string*> test;
  // if push_back throws then this will leak:
  test.push_back(new std::string("hello world"));
  // Can't do:
  std::copy(test.begin(), test.end(), 
            std::ostream_iterator<std::string>(std::cout, "\n"));
  // Will now leak too
}

(which I would never do)

Or possibly:

#include <vector>
#include <string>
#include <iostream>
#include <iterator>
#include <algorithm>

int main() {
  std::vector<std::string*> test;
  std::string str("hello world");
  test.push_back(&str);
  // Can't do:
  std::copy(test.begin(), test.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
}

But the semantics of this one make me feel uncomfortable - it's not clear at all that delete elsewhere in the code would be a very bad thing and you still can't use STL algorithms very comfortably even if there is no leak issue.

edited Aug 28 '11 at 00:08

answered Aug 27 '11 at 22:01

Flexo

87,323
22
191
272

Sorry to be pedantic, but was that double negative intentional, "don't want to put pointers in your containers, unless there's a good reason not to."? – John Aug 27 '11 at 22:04
@John - corrected, I re-phrased that sentence as I was writing it with the net result that it made it very hard to parse. – Flexo Aug 27 '11 at 22:05
I am interested in your endorsement of exceptions. I don't use them so imagine I am home dry programming in my potentially unsafe way. I guess it is a throwback to my C days, but higher level languages should reduce code duplication so I will avoid try...catch blocks where possible and let the exception kill the process safely. – John Aug 27 '11 at 22:25
Good answer, what about I have really simple structs with only dtor and ctor and I don't trust default copy ctors to assign my initial value without creating a tempoary copy and then dtor'ing it? – John Aug 27 '11 at 23:07
@John - I avoid try and catch blocks, precisely by avoiding putting pointers in vectors. I can't quite follow your question in the next comment though. As a general comment if you don't trust the copy constructor (and it's accessible) I'd be more worried about that than containers specifically. (Also note that if it's an optimisation thing compilers are completely free to elide the copy constructor) – Flexo Aug 28 '11 at 00:02
1

I'm missing a great deal of STL documentation. By avoiding placing pointers in the vectors do you mean to say that the vector contracts not to throw an exception? – John Aug 28 '11 at 00:29
@John No, in both cases `push_back` can legitimately throw an exception (e.g. `std::bad_alloc`). The difference is though that if you're not using pointers in the container it's impossible for that to cause it to leak, whereas with pointers in the container you're going to have to do quite a lot of extra work to be sure it won't leak. – Flexo Aug 28 '11 at 09:17
An additional reason to use pointers would be if your struct has const data members. Or if you don't want the struct copyable, because the std::vector requires copy constructable and assignable elements. In a situation I had, there were const data members in the struct, so I put each struct in a shared_ptr. A safe solution. – rem45acp Oct 10 '12 at 14:05

score 3 · Answer 2 · answered Aug 27 '11 at 22:04

3

The "overhead" of a pointer dereference is essentially zero. That is, you would have great difficulty of measuring that versus referencing an object in its place. Beware of early optimization and over-optimization, the root of all programming evil.

You should do whichever (pointer or object) makes the most application sense.

answered Aug 27 '11 at 22:04

wallyk

56,922
16
83
148

My last question addressed the overhead of this form of addressing, which is another access to RAM, 10-15ns. That much I know, but the point of this question is where is the vector of values holding its dynamic array? – John Aug 27 '11 at 22:07
then the answer is surely that each block of memory, be it pointer or object is allocated on the heap? I will be x-compiling. – John Aug 27 '11 at 22:14
@wallky - the standard guarantees that `vector`s are contiguous – Flexo Aug 27 '11 at 22:15
@John Silver: Yes: Dynamically allocated objects are on the heap. – wallyk Aug 27 '11 at 22:16
1

I agree with "beware of early optimization", but your first sentence is really not true at all. Accessing random memory on a modern machine is 100-1000 times slower than accessing something in a register (or L1 cache). If your objects are small, the benefits of locality can make marching through a vector of objects _orders of magnitude_ faster than marching through a vector of pointers. (And you are completely wrong about how `vector` is implemented.) – Nemo Aug 27 '11 at 22:18
4

The cost of a pointer derference is certainly not zero. Storing pointers means you lose locality, which means cache misses which is one of the biggest performance killers on a modern CPU. You're right, of course, that one should do what best solves the problem at hand, but you're kidding yourself if you think there's no measurable cost to storing pointers vs the objects themselves. – jalf Aug 27 '11 at 22:20

score 3 · Answer 3 · answered Aug 27 '11 at 22:31

First, I agree with those who say to write your code however it makes the most sense, and do not worry about micro-optimizations like this until your profiling tool tells you to.

That said, you should be worried more about accessing your data, not allocating and freeing it. If your access patterns to vector elements have good locality -- e.g., looping through them all, or accessing nearby elements together -- then a vector of pointers is likely to destroy that locality and cause a major hit to performance.

The #1 concern for speed is using good algorithms, of course. But the #2 concern is having good locality, because memory is slow... And relative to the CPU, it gets slower every year.

So, for small, simple objects, vector<Obj> is almost certainly going to be faster than vector<Obj *>, and possibly much faster.

As for "can a vector really allocate objects on the stack", the answer is yes in terms of semantics, but no in terms of implementation (most likely). A typical vector implementation consists of three pointers internally: Base, Current, and End. All three point into a contiguous block on the heap, and the vector's destructor will deallocate that block. (Again, this is a typical implementation; in theory, your compiler and/or runtime might do something else. But I bet it doesn't.)

Such an implementation supports dynamic expansion by re-allocating that block and copying data. This is not as slow as it sounds for two reasons: (1) Linear memory access (e.g. copying) is pretty fast; and (2) each reallocation increases the size of the block by a factor, which means push_back is still O(1) amortized.

score 2 · Answer 4 · answered Aug 28 '11 at 00:51

The one thing not mentioned against pointers vs structs is continuity of memory (matters more on embedded). Basically, a vector of struct will be allocated in 1 block of memory while a vector of pointers to struct will (probably) be allocated all over the place. Fragmentation of memory and Data Cache will thus seriously suffer.

score 1 · Answer 5 · edited Aug 27 '11 at 22:28

Your question is not very clear. First you talk of a vector of pointers, and then you write something like: std::vector<myStruct> *ptr2Vect.

std::vector<myStruct> *ptr2Vect is a pointer to a vector of myStruct objects. The vector is not storing pointers, so you don't need to worry about memory management of the obects held - just need to ensure that myStruct is copy constructable. You do need to manually manage the clean up of the pointer to the vector though (ptr2Vect)
Most modern systems work very efficiently with pointers, if you are asking this kind of question, you're following the route for premature optimization, stop, take a step back.
vector relies on dynamic allocation, but how it expands, it manages (you can control it to an extent, for example, if you know the size before hand, you can reserve)

From what I gather from the question, you're really not at the point where you need to worry about the overhead of automatic/dynamic allocation and pointer de-referencing. These are the least of your concerns, just learn to write good code - all that other stuff will come later (if at all necessary)

Sorry for the confusion, I always use pointers to vectors for ease of passing and fear of invoking std::vector's copy constructor. — John, Aug 27 '11 at 22:16
@John Silver, "fear of invoking std::vector's copy constructor?" well that's a first for me! :) — Nim, Aug 27 '11 at 22:19
I need to read up on the vector copy constructor, as to who does the alloc'ing and freeing. — John, Aug 27 '11 at 22:33

score 0 · Answer 6 · edited May 23 '17 at 11:59

However my preference is to be determined by the advantage of having a std::vector *ptr2Vect member. Namey, not having to call delete on each element. How big a performance advantage is this?

it depends on the number of elements, but it can save you a ton of memory and time. see:

What is the cost of inheritance?

Can vector really allocate objects on the stack?

yes. a vector could reserve an internal allocation for this purpose, or the compiler could optimize this in some cases. that's not a feature/optimization you should rely on. you could create your own allocator or pod-array container tailored for your needs.

I am fairly new to template classes and wonder if it could be possible for a dynamic array to expand on the stack and at what price?

if you have a constant size, then a specific implementation (such as boost::array) can save a ton of runtime overhead. i've written several types for different contexts.

score 0 · Answer 7 · answered Aug 27 '11 at 22:21

My first suggestion to you would , 'you dont have to have pointer-to-vector' as member. You want a simple vector myVector; OR vector< myVector;

Secondly, you will make your decision based on following questions

What is the size of vector ? (how many elements max) say n
What is the size of struct ? (sizeof(T)) say s
What is the cost of copying the struct ? say c
Is your struct holding some resource ? (e.g. some file handle or semaphore et cetra) ? If it is holding some resource, then vector can complicate your life much more.

Now n,s,c are going to determine your runtime overhead of vector For vector, cost due to n,s,c are zero. For vector, cost due to n,s,c are n*s sizeUnits + n*c executionUnits.

My own rule of thumb : No rule of thumb exists. First code it with vector, if it is not good enough, then go with vector

If your program, is a small program which is going to exit the process after you have used your vector, then I wouldnt even bother to free them. IF NOT, then just run a
for(auto it=v.begin();it!=v.end();++it) delete *it;

Should my std::vector contain pointers or structs?

7 Answers7

Linked