0

As someone with a lot of assembler-language experience and old habits to lose, I recently did a project in C++ using a lot of the features that C++03 and C++11 have to offer (mostly the container classes, including some from Boost). It was surprisingly easy, and I tried wherever I could to favor simplicity over premature optimization. As we move into code review and performance testing, I'm sure some of the old hands will have aneurysms at not seeing exactly how every byte is manipulated, so I want to have some advance ammunition.

I defined a class whose instance members contain several vectors and maps. Not "pointers to" vectors and maps. And I realized that I haven't got the slightest idea how much contiguous space my objects take up, or what the performance implications might be for frequently clearing and re-populating these containers.

What does such an object look like, once instantiated?

Chap
  • 3,649
  • 2
  • 46
  • 84
  • With a class holding vectors and maps, nearly all of the footprint of said object will be dynamic (left to the behest of the standard library, of course). Those containers will *likely* have little static footprint in your object class, and considerable *dynamic* footprint in how they maintain their content. – WhozCraig Aug 24 '13 at 23:31
  • @whozcraig: By dynamic footprint, am I correct in thinking you mean memory that's not contiguous with the rest of my object, and that is allocated & deleted during execution without my knowledge? – Chap Aug 25 '13 at 00:08
  • 1
    Precisely. It's all managed by the standard library. Some things are pretty straightforward (a vector, for example, but even there some things may surprise you when you review some implementations), while others can be rather elaborate (maps are often implemented as RB-trees, which are anything but trivial). I've seen implementations that single-alloc entries as they go, and others that use complicated paging algorithms and placement-`new` management of instances. It all depends on the implementation (which still must conform to the standard for interfacing and behavior). – WhozCraig Aug 25 '13 at 01:05

5 Answers

4

Formally, there aren't any constraints on the implementation other than those specified in the standard, with regards to interface and complexity. Practically, most, if not all, implementations derive from the same code base, and are fairly similar.

The basic implementation of vector is three pointers. The actual memory for the objects in the vector is dynamically allocated. Depending on how the vector was "grown", the dynamic area may contain extra memory; the three pointers point to the start of the memory, the byte after the last byte currently used, and the byte after the last byte allocated. Perhaps the most significant aspect of the implementation is that it separates allocation and initialization: the vector will, in many cases, allocate more memory than is needed, without constructing objects in it, and will only construct the objects when needed. In addition, when you remove objects, or clear the vector, it will not free the memory; it will only destruct the objects, and will change the pointer to the end of the used memory to reflect this. Later, when you insert objects, no allocation will be needed.

When you add objects beyond the amount of allocated space, vector will allocate a new, larger area; copy the objects into it, then destruct the objects in the old space, and delete it. Because of the complexity constraints, vector must grow the area exponentially, by multiplying the size by some fixed constant (1.5 and 2 are the most common factors), rather than by incrementing it by some fixed amount. The result is that if you grow the vector from empty using push_back, there will not be too many reallocations and copies; another result is that if you grow the vector from empty, it can end up using almost twice as much memory as necessary. These issues can be avoided if you preallocate using std::vector<>::reserve().

As for map, the complexity constraints and the fact that it must be ordered mean that some sort of balanced tree must be used. In all of the implementations I know, this is a classical red-black tree: each entry is allocated separately, in a node which contains two or three pointers, plus maybe a boolean, in addition to the data.

I might add that the above applies to the optimized versions of the containers. The usual implementations, when not optimized, will add additional pointers to link all iterators to the container, so that they can be marked when the container does something which would invalidate them, and so that they can do bounds checking.

Finally: these classes are templates, so in practice, you have access to the sources, and can look at them. (Issues like exception safety sometimes make the implementations less straightforward than we might like, but the implementations with g++ or VC++ aren't really that hard to understand.)

James Kanze
  • 150,581
  • 18
  • 184
  • 329
  • Thank you. In short, it sounds as if these are implemented in much the same way that I would if I were using straight C (only with a *whole* lot less effort on my part!) – Chap Aug 25 '13 at 00:13
  • @Chap *exactly*. Anyone writing C-code in C++ is honestly doing themselves no favors at all. A *lot* of work has gone into designing the standard and its requirements for meeting it. You'd be *crazy* not to take advantage of the fruits of that labor. – WhozCraig Aug 25 '13 at 01:09
  • @whozcraig Yep. I am rapidly becoming completely convinced of this. There are some folks I work with, however, who are a tougher sell. From what I'm hearing, recent C++ is very smart about what it does. (For example, I have *no* idea how to write an RB tree.) – Chap Aug 25 '13 at 01:19
  • @Chap In case you're curious about the *algorithms* [see this](http://en.literateprograms.org/Red-black_tree_%28C%29) and [this](http://en.wikipedia.org/wiki/Red–black_tree), the former is C, and as you can imagine gets significantly altered when done in C++, but the general algorithms are the same. Anyway, welcome to the party. – WhozCraig Aug 25 '13 at 01:23
  • @whozcraig Interesting that the C++ implementation, because of template functions that inline the comparison code, can actually perform *better* than the C version! – Chap Aug 25 '13 at 01:32
  • 1
    @Chap FWIW: back when I was starting C++, we did have people arguing the performance issue. Until someone actually measured it; the C++ solutions were regularly faster than the C solutions, at least in the places where it counted. – James Kanze Aug 25 '13 at 21:52
3

A map is a binary tree (of some variety, I believe it's customarily a Red-Black tree), so the map itself probably only contains a pointer and some housekeeping data (such as the number of elements).

As with any other binary tree, each node will then contain two or three pointers (two for the "left" and "right" child nodes, and perhaps one back to the parent node, to avoid having to traverse the whole tree to find a node's predecessor).

In general, vector shouldn't be noticeably slower than a regular array, and certainly no worse than your own implementation of a variable size array using pointers.

Mats Petersson
  • 126,704
  • 14
  • 140
  • 227
  • It can also easily be a multiway tree such as a B-tree. – Jerry Coffin Aug 25 '13 at 04:56
  • @JerryCoffin It could be. In fact, all of the implementations I'm familiar with ultimately derive from the same code base, which uses a red-black tree. (There are other ways of balancing a binary tree as well, but the implementations I'm familiar with don't use these either.) – James Kanze Aug 25 '13 at 21:58
  • @JamesKanze: Though they're not included with any compiler (at least to my knowledge), I've seen implementations based on both AVL and B-trees. Haven't tested for conformance, but don't see any particular reason they couldn't. – Jerry Coffin Aug 26 '13 at 01:32
  • @JerryCoffin I'm pretty sure they'd be conformant; an AVL-tree, in particular, has exactly the same complexity as a red-black tree. All of the implementations I've seen, however, derive from Stepanov's original implementation, so there's nowhere near as much variety as one might otherwise expect. – James Kanze Aug 26 '13 at 09:25
1

A vector is a wrapper for an array. The vector class contains a pointer to a contiguous block of memory and knows its size somehow. When you clear a vector, it usually retains its old buffer (implementation-dependent), so that the next time you reuse it there are fewer allocations. If you resize a vector beyond its current buffer size, it will have to allocate a new one. Reusing and clearing the same vectors to store objects is efficient. (std::string is similar.) If you want to find out exactly how much a vector has allocated in its buffer, call the capacity function and multiply by the size of the element type. You can call the reserve function to manually increase the buffer size, in expectation of the vector taking more elements shortly.

Maps are more complicated so I don't know. But if you need an associative container, you would have to use something complicated in C too, right?

Neil Kirk
  • 21,327
  • 9
  • 53
  • 91
  • "*When you clear a vector, it retains its old buffer*" Actually there is no hard requirement on this, `clear()` may or may not keep the array allocated depending on your STL implementation. `resize(0)` guarantees the array isn't deleted, though. – syam Aug 24 '13 at 23:01
  • @syam Actually, there is a definite requirement with regards to `clear()`; `clear()` is not allowed to reduce the capacity, so it cannot free the memory. There is nothing implementation defined here. – James Kanze Aug 24 '13 at 23:33
  • @JamesKanze can you give some proof of your claims? – Neil Kirk Aug 25 '13 at 01:13
  • Just the standard. The standard specifies quite clearly when iterators may be invalidated, and when `capacity` may change. `clear()` is not one of the cases. – James Kanze Aug 25 '13 at 21:48
  • @NeilKirk `clear` invalidates all iterators, but it doesn't invalidate guarantees with regards to future iterators, based on `capacity`. Consider: `v.reserve(5); v.clear(); v.push_back(v1); Iter it = v.begin(); v.push_back(v2);` `it` is guaranteed to be valid after the second `push_back`. – James Kanze Aug 26 '13 at 09:31
  • @NeilKirk So the site disagrees with the standard. The standard is clear: `vector<>::clear()` is not allowed to change the capacity, and the fact that the second `push_back` in my example can't trigger an increase in capacity means that previous iterators must remain valid. – James Kanze Aug 27 '13 at 12:15
  • @JamesKanze What are you saying?? That is the website I am always told to use by members of this site and I get flamed for using cplusplus.com Are you saying the golden boy is wrong?? – Neil Kirk Aug 27 '13 at 13:25
  • @NeilKirk It's clearly wrong in this case. And I've only seen it recommended instead of cplusplus.com; it's not wrong that often, whereas it's hard to find a page without an error on cplusplus.com. But anyway, the only reference _is_ the standard; the rules of the game say that the standard is right, even when it is wrong. (In this case, I would _expect_ a function called `clear()` to put the object in the state it would have if it had just been default constructed. The standard says otherwise, however.) – James Kanze Aug 27 '13 at 13:53
  • @JamesKanze http://stackoverflow.com/questions/18467624/what-does-the-standard-say-about-how-calling-clear-on-a-vector-changes-the-capac – Neil Kirk Aug 27 '13 at 14:02
1

Just wanted to add a few things to the other answers that I think are important.

Firstly, in the implementations I've seen, sizeof(std::vector<T>) is constant and made up of three pointers. Below is an excerpt of the relevant parts from a GCC 4.7.2 STL header:

template<typename _Tp, typename _Alloc>
struct _Vector_base
{
 ...
 struct _Vector_impl : public _Tp_alloc_type
 {
  pointer _M_start;
  pointer _M_finish;
  pointer _M_end_of_storage;
  ...
 };
 ...
 _Vector_impl _M_impl;
 ...
};

template<typename _Tp, typename _Alloc = std::allocator<_Tp> >
class vector : protected _Vector_base<_Tp, _Alloc>
{
 ...
};

That's where the three pointers come from. Their names are self-explanatory, I think. But there is also a base class: the allocator, which brings me to my second point.

Secondly, std::vector<T, Allocator = std::allocator<T>> takes a second template parameter: a class that handles memory operations. It is through the functions of this class that vector does its memory management. The default STL allocator, std::allocator<T>, has no data members, only functions such as allocate, deallocate, construct and destroy, and it bases its memory handling on new/delete. But you can write your own allocator and supply it to std::vector as the second template parameter. It has to conform to certain rules (the functions it provides, etc.), but how the memory management is done internally is up to you, as long as it does not violate the logic that std::vector relies on. It might introduce data members that add to sizeof(std::vector) through the inheritance above. It also gives you that "control over each bit".

lapk
  • 3,838
  • 1
  • 23
  • 28
  • 1
    Most implementations of `std::vector` (including the one in g++) add extra members in order to track iterators, unless you're optimizing. (g++ optimizes by default; normally, you'll use the options `-D_GLIBCXX_CONCEPT_CHECKS -D_GLIBCXX_DEBUG -D_GLIBCXX_DEBUG_PEDANTIC` when compiling with g++.) – James Kanze Aug 24 '13 at 23:55
  • @JamesKanze +1 Good to know, thanks for the info. This seems to be, let's say, "debug mode". I haven't noticed something like that in "release mode". – lapk Aug 25 '13 at 00:00
  • I'm not sure what "debug mode" and "release mode" really mean; none of the compilers I use have "modes". Visual Studio does use the terms (although no one would really use either of the "modes" with their default settings), but in that case, there is at least some iterator tracking in the default configurations for both modes. – James Kanze Aug 25 '13 at 21:40
0

Basically, a vector is just a pointer to an array, along with its capacity (total allocated memory) and size (actually used elements):

struct vector {
    Item* elements;
    size_t capacity;
    size_t size;
};

Of course, thanks to encapsulation, all of this is well hidden, and users never get to handle the gory details (reallocation, calling constructors/destructors when needed, etc.) directly.

As to your performance questions regarding clearing, it depends how you clear the vector:

  • Swapping it with a temporary empty vector (the usual idiom) will delete the old array: std::vector<int>().swap(myVector);
  • Using clear() or resize(0) will erase all the items and keep the allocated memory and capacity unchanged.

If you are concerned about efficiency, IMHO the main point to consider is to call reserve() in advance (if you can) in order to pre-allocate the array and avoid useless reallocations and copies (or moves with C++11). When adding a lot of items to a vector, this can make a big difference (as we all know, dynamic allocation is very costly so reducing it can give a big performance boost).

There is a lot more to say about this, but I believe I covered the essential details. Don't hesitate to ask if you need more information on a particular point.


Concerning maps, they are usually implemented using red-black trees. But the standard doesn't mandate this, it only gives functional and complexity requirements so any other data structure that fits the bill is good to go. I have to admit, I don't know how RB-trees are implemented but I guess that, again, a map contains at least a pointer and a size.

And of course, each and every container type is different (eg. unordered maps are usually hash tables).

syam
  • 14,701
  • 3
  • 41
  • 65
  • There are a number of errors with your description: swapping with an empty vector will never delete anything; it just changes ownership. `delete[]` is never used. And `clear()` will never free memory; there is nothing which depends on the implementation. And the standard requires an exponential growth strategy, so not calling `reserve()` isn't normally that bad. – James Kanze Aug 24 '13 at 23:30
  • @JamesKanze Concerning the "swapping with an empty vector" you are right of course. I was thinking about the usual idiom *swapping with a temporary empty vector* `std::vector().swap(myVector)` but I didn't express myself correctly. Fixing it... – syam Aug 24 '13 at 23:38
  • @JamesKanze Concerning `clear()`, do you know the proper paragraph/verse? All I can find is 23.2.3, table 100 (sequence containers) where there is no mention of the capacity at all, effectively leaving the implementation free to do what it wants (provided there is no other clause somewhere mandating to explicitly keep the capacity intact). – syam Aug 24 '13 at 23:42
  • `clear()` is defined as `erase(begin(), end())`. There's no text in the standard which allows either `clear` or `erase` to change the capacity. And functions can't have observable side effects other than those documented. – James Kanze Aug 24 '13 at 23:45
  • @JamesKanze `clear()` seems to be defined in terms of `erase()` only for `basic_string` and associative containers (I just searched through the whole standard for all instances of `clear()`). Concerning vectors and sequence containers, again, all I can find is 23.2.3, table 100 which definitely doesn't define `clear()` in terms of `erase`, it just says it destroys all elements. BTW [it seems I'm not the only one to interpret it that way](http://en.cppreference.com/w/cpp/container/vector/clear). (continued...) – syam Aug 25 '13 at 00:07
  • (...) However, I have to admit that usually the standard explicitly states when a behaviour is implementation-dependent so I guess you are right in the end. – syam Aug 25 '13 at 00:09
  • @JamesKanze Concerning your last point (`reserve()`) I have been bitten by it sometimes (or rather by its absence) and adding it gave me a big performance boost, so I still stand by what I said. – syam Aug 25 '13 at 00:12
  • If the profiler says that reallocation is expensive, then you can consider `reserve`. In practice, I'm currently working on some very performance critical code, and we almost never use `reserve`, on one hand, because we have no idea as to what the maximum number of elements might be before actually initializing, and on the other, because the profiler has never show reallocations to be a significant problem. Using `reserve` before the profiler says you have to is premature optimization. – James Kanze Aug 25 '13 at 21:47