1

I'm aware of the many articles on avoiding heap fragmentation. My question has to do with specifically what happens when we use a vector to store data:

class foo{
public:
std::vector<bar>; // Or I can have std::vector<bar*>
};

I have no problem using new or delete (I don't have memory leaks, and it is very clear to me when a bar is not used and I can call delete when necessary). But with regarding to heap fragmentation, which is better ? Does this make a stack overflow more likely ?

EDIT: I don't actually have any new information to add. I just wanted to thank everyone, and say that I've found that any question that is tagged with C++ seems to attract so many knowledgeable helpful people. Its really quite nice. Thank you.

Rahul Iyer
  • 19,924
  • 21
  • 96
  • 190
  • 1
    If you don't need pointers, don't use pointers. – Baum mit Augen Jul 10 '14 at 07:55
  • I'm trying to avoid heap fragmentation. But I also don't wan't to run the risk of stack overflow. While in general it may be preferable to avoid pointers, I could have many bars. A detailed answer with an explanation could help me understand my options better. – Rahul Iyer Jul 10 '14 at 07:57
  • 1
    I'm pretty sure, that vector will allocate the space for "bars" on the heap anyway. So if you have std::vector that doesn't mean, that the space for "bars" will be allocated on the stack. – JustAnotherCurious Jul 10 '14 at 08:00
  • What does the stack have to do with any of this? I don't see how the danger of a stack overflow arises. –  Jul 10 '14 at 08:00
  • If I create a bar using new, then it will be on the heap. But if I do bar = bar() then it will be on the stack. – Rahul Iyer Jul 10 '14 at 08:01
  • Yes, a single `bar` will temporarily live on the stack (or maybe a register?), but before long you'll put it into the vector. –  Jul 10 '14 at 08:03
  • @John, when you do bar = bar(), vec.push_back(bar), then the vector allocate it's own memory for bar and copy the bar that you created on the stack to it's own memory in the heap. – JustAnotherCurious Jul 10 '14 at 08:03
  • So I should just use pointers ? – Rahul Iyer Jul 10 '14 at 08:13
  • @John They're saying the opposite. Values! Almost always values! – Joseph Mansfield Jul 10 '14 at 08:26
  • Got it! I will use values. – Rahul Iyer Jul 10 '14 at 08:31

4 Answers4

2

TL;DR, unless you have a good reason, with sound measurements, always favour using a vector with value types.

I would argue for the std::vector<bar> form. Unless you have other (measured) reasons why, the guaranteed contiguous memory of vector with a value type is a better idea. When the vector allocates a contiguous block of memory, the objects will be laid out in memory next to each other (with some reserve). This allows the host system and runtime a better chance to avoid fragmentation. Allocating the objects individually may result in fragmentation since you do not control where they are allocated.

I note your concerns around the stack; vector has a small stack allocation so this shouldn't be a problem, the "usual" allocator std::allocator uses new to allocate memory for the vector on the heap.

Why should a vector always be favoured? Herb Sutter goes into some detail about this; http://channel9.msdn.com/Events/Build/2014/2-661 with supporting graphs, diagrams, explanations etc. from around the 23:30 mark and he picks up Bjarne's material at around the 46:00 mark.

Niall
  • 30,036
  • 10
  • 99
  • 142
  • If you have more sources on stl containers performance, please share! – Zsolt Jul 10 '14 at 08:05
  • @Zsolt, I'd like to say I have a list, but I don't really keep one. The other link I have is http://bulldozer00.com/2012/02/09/vectors-and-lists/ – Niall Jul 10 '14 at 08:07
  • I'm sorry but I don't understand - What does this have to do with heap fragmentation ? – Rahul Iyer Jul 10 '14 at 08:11
  • @John, allocating the objects individually may result in fragmentation since you do not control where they are allocated. When the `vector` allocates a contiguous block of memory, the objects will be laid out in memory next to each other (with some reserve). This allows the host and runtime a better chance to avoid fragmentation. – Niall Jul 10 '14 at 08:15
  • Ok. I think I understand now - so I would just let vector create a copy of my object ? – Rahul Iyer Jul 10 '14 at 08:17
  • 1
    @John, correct; in addition if you are able to use C++11 language features, and enable moving `bar`, it would do one better and just move it into the container. – Niall Jul 10 '14 at 08:19
  • How would I enable that ? I don't know what I should be googling! – Rahul Iyer Jul 10 '14 at 08:28
  • Also a lot of compilers use NRVO - Named return value optimization, so it can avoid a couple of copies for you. Have some reading: http://stackoverflow.com/questions/6233879/move-or-named-return-value-optimization-nrvo http://en.wikipedia.org/wiki/Return_value_optimization – Zsolt Jul 10 '14 at 08:29
  • @John, what you are looking for are "move semantics". http://stackoverflow.com/q/3106110/3747990 – Niall Jul 10 '14 at 08:30
  • @John I would say move semantics are a bit complicated for you at the moment. Come back to it later. But certainly, if you have C++11 features, try using `vector.emplace_back(...)` to put your objects in the vector efficiently. – Joseph Mansfield Jul 10 '14 at 08:41
  • @Niall Thank you for your answer. Both yours and Zsolts are useful and seem to say the same thing. Zsolt posted earlier so I marked his as the correct answer. @ Joseph Mansfield Thank you for your answer. – Rahul Iyer Jul 10 '14 at 10:12
  • @John, no problem. I thought it was a nice question and I think you have a lot of material to work with. – Niall Jul 10 '14 at 10:20
1

The std::vector guarantees, that it's contents will be stored in continuous memory. If you store objects in your vector, then each object will be next to each other in a big chunk of memory, but if you put pointers to the vector, which point to objects created with new, then each object will be in random places in memory.

So the first version (storing objects, not pointers) is better for both avoiding heap fragmentation and utilizing the cache.

Zsolt
  • 582
  • 2
  • 5
  • what about stack overflow ? How will I know if I'm running out of space on the stack ? – Rahul Iyer Jul 10 '14 at 08:03
  • 2
    `std::vector` uses `new` to allocate space, so apart from the couple dozen of bytes the container takes, it won't use the stack, so you are fine. – Zsolt Jul 10 '14 at 08:04
  • @John: Why would stack overflow come into this? You only have a single `vector` either way, and `vectors` have a fixed size, regardless of what they contain. Vectors are implemented as a set of pointers to a region of free-store memory. – Mankarse Jul 10 '14 at 08:06
  • I guess I got confused because I was thinking of storing "bar" instead of "bar*" in the vector. – Rahul Iyer Jul 10 '14 at 08:15
  • By the way, what cache are you talking about ? – Rahul Iyer Jul 10 '14 at 08:16
  • The cache between your processor and RAM. – Zsolt Jul 10 '14 at 08:27
1

OP seems to have a basic misunderstanding about what std::vector does: It allocates a contiguous block of memory on the heap, you do not need to worry about your stack-space when using vector! So whenever you do not really need to store anything but the objects them self, just put them in your vector and you are good.

So std::vector<bar> appears to be the right choice in your case. Also, don't worry about heap fragmentation prematurely, let the compiler worry about that. (But you should worry about feeding pre-fetcher, that is why std::vector<bar> should be your standard choice to store anything.)

Baum mit Augen
  • 49,044
  • 25
  • 144
  • 182
1

Use a vector of pointers in the following cases:

  • Bar is a base class and you want to store instances of derived classes in the vector.
  • Bar doesn't have an appropriate copy/move constructor or it is prohibitively expensive.
  • You want to store pointers to objects in the vector elsewhere, and you may change the capacity of the vector (for example, due to resize or push_back). If the objects are stored by value, old pointers will become invalid when the capacity changes.
  • You want some elements in the vector to be "empty", (which can be represented by null pointers), and it's not practical to store dummy, unused instances of Bar in the vector by value instead.

If you do need pointers, I strongly suggest using smart pointers.

In other cases, you should prefer to store objects by value, as it is usually more efficient.

Neil Kirk
  • 21,327
  • 9
  • 53
  • 91