213

I've been reading up on STL containers in my book on C++, specifically the section on the STL and its containers. Now I do understand each and every one of them have their own specific properties, and I'm close to memorizing all of them... But what I do not yet grasp is in which scenario each of them is used.

What is the explanation? Example code is much prefered.

Jonas
  • 121,568
  • 97
  • 310
  • 388
Daniel Sloof
  • 12,568
  • 14
  • 72
  • 106
  • do you mean map, vectot, set etc? – Thomas Tempelmann Jan 23 '09 at 00:35
  • Even looking at this diagram I can't say what would be the best one to use in my quastion http://stackoverflow.com/questions/9329011/mfc-dictionary-collection-with-key-unicity-and-ordering-by-position – sergiol Feb 24 '12 at 19:32
  • I've created an [Editable Version of Mikael's C++ Container Cheat Sheet](https://github.com/parkershamblin/container-flow-chart). This is @MikaelPersson's cheat sheet. Sadly, I can't comment under his answer because I don't have 50 reputation yet. – Parker Shamblin Aug 14 '20 at 19:53

10 Answers10

383

This cheat sheet provides a pretty good summary of the different containers.

See the flowchart at the bottom as a guide on which to use in different usage scenarios:

http://linuxsoftware.co.nz/containerchoice.png

Created by David Moore and licensed CC BY-SA 3.0

zdan
  • 28,667
  • 7
  • 60
  • 71
  • 17
    This flowchart is golden, I wish I had something like that in c# – Bruno Feb 01 '13 at 15:35
  • 2
    Updated link: [C++ Containers Cheat Sheet](http://homepages.e3.net.nz/~djm/cppcontainers.html). – Bill Door Jan 02 '14 at 22:56
  • 3
    Starting point must be `vector` rather then empty. http://stackoverflow.com/questions/10699265/how-can-i-efficiently-select-a-standard-library-container-in-c11 – eonil Feb 18 '14 at 19:34
  • 8
    You now have `unordered_map` and `unordered_set` (and their multi variants) which are not in the flow chart but are good picks when you don't care about order but need to find elements by key. Their lookup is usually O(1) instead of O(log n). – Aidiakapi May 12 '14 at 18:38
  • 1
    This could do with a question about whether you intend to iterate over the container. – Tim MB Jun 23 '14 at 16:47
  • 1
    Where would `std::array` go in this picture? – Ryan Dougherty Jul 29 '14 at 20:47
  • 1
    @Ryan, where it says "size will vary widely" if there was an option that size will never under any circumstances vary then `std::array` might go there. – shuttle87 Jun 18 '15 at 14:03
  • 2
    @shuttle87 not just that size will never vary, but more importantly that size is determined at compile time and will never vary. – YoungJohn Jul 24 '15 at 20:51
  • 2
    I don't think "need to insert/erase in middle" automatically means `list`. In most cases it will *still* be faster to use `vector`. – Timmmm Feb 23 '18 at 13:05
  • The links to the references seems to be broken. Does anyone have the updated sources? – AmeyaVS Dec 18 '18 at 06:41
  • The choice branching to `stack` etc should be "(`const`) iterate elements" (and flip yes/no). The only way to read elements of the container adaptors other than the "next" is to mutate them. – Caleth Dec 19 '18 at 14:40
  • @Timmmm: Perhaps, but the potential cost of middle insertion are much greater for `vector` than `list`. Assuming you don't already have an iterator at the insertion point (worst case for `list`), you must traverse `O(n)` times to find the insertion point, then allocate and insert once. For `vector`, you may be able to find the location in `O(1)` or `O(log n)`, but you'll perform `O(n)` moves (invalidating many iterators), and possibly reallocate underlying storage (which invalidates *all* outstanding iterators, and can be very expensive for types with expensive moves, e.g. `std::array`). – ShadowRanger Jan 29 '19 at 13:11
  • Point is, yes, for something like a sequence of 100 `int`s or the like, `vector` will probably be faster for middle insertion than `list` in many circumstances. But for more expensive types, `vector` will be slower, and if you need to be able to store off iterators that persist after insertion, `vector` is completely unsuitable. – ShadowRanger Jan 29 '19 at 13:12
  • Yes of course if your elements are large enough `list` will become quicker. But that size is [something like 128 bytes](https://baptiste-wicht.com/posts/2012/12/cpp-benchmark-vector-list-deque.html). But I would say it is pretty rare to have elements that large. – Timmmm Jan 29 '19 at 18:19
  • Even though it's off-topic a bit, [Abseil containers](https://abseil.io/docs/cpp/guides/container) should also be considered. Especially the Swiss [Hash] Tables are very efficient for small items. – Kuba hasn't forgotten Monica Sep 05 '19 at 13:20
  • what does "size vary widely" mean that decides between deque and vector? – WriteBackCMO Mar 12 '21 at 01:07
214

Here is a flowchart inspired by David Moore's version (see above) that I created, which is up-to-date (mostly) with the new standard (C++11). This is only my personal take on it, it's not indisputable, but I figured it could be valuable to this discussion:

enter image description here

Mikael Persson
  • 18,174
  • 6
  • 36
  • 52
  • Can you share a link to the original source (some kind of flowchart file)? – kevinarpe May 02 '15 at 09:56
  • @kevinarpe I drew it by hand (svg). I did not use any flowchart generation software. – Mikael Persson May 02 '15 at 12:59
  • 4
    Can you make the original available? It is an excellent chart. Maybe stick on a blog or GitHub? – kevinarpe May 02 '15 at 13:28
  • I think that the arrow from "In-order Traversals" shouldn not only lead to `vector`. It is not suitable container when elements are often added or removed. – Piotr Siupa Jul 26 '15 at 13:37
  • Should the use of associative containers to model *sparse arrays* be included in this flowchart? – Emile Cormier Dec 27 '15 at 16:58
  • 1
    This is an excellent chart. Though can someone explain to me what is meant by 'persistent positions'? – IDDQD Apr 26 '16 at 15:52
  • 4
    @S.T.A.L.K.E.R. Persistent positions means that if you have a pointer or iterator to an element in the container, that pointer or iterator will remain valid (and pointing to the same element) regardless of what you add or remove from the container (as long as it is not the element in question). – Mikael Persson Apr 29 '16 at 04:22
  • 1
    This is truly a great chart, however I think `vector (sorted)` is a bit inconsistent with the rest. It is not a different type of container, just the same `std::vector` but sorted. Even more important, I don't see why one couldn't use a `std::set` for ordered iteration if that is the standard behavior of iterating trough a set. Sure, if the answer is talking about orderly accessing the values of the container trough `[]`, then ok you can only do that with a soted `std::vector`. But in either case, the decision should be made just after the "order is needed" question – RAs Oct 14 '16 at 20:27
  • 1
    @user2019840 I wanted to restrict the chart to standard containers. What should appear in place of "sorted vector" is "flat_set" (from [Boost.Container](http://www.boost.org/doc/libs/1_62_0/doc/html/boost/container/flat_set.html)), or equivalent (every major library or code-base has a flat_set equivalent, AFAIK). But these are non-standard, and quite a glaring omission from the STL. And the reason why you don't want to iterate through std::set or std::map (at least not frequently) is that it is [very inefficient to do so](http://lafstern.org/matt/col1.pdf). – Mikael Persson Nov 27 '16 at 07:38
  • Couold someone please define "Look-up key"? – Post Self Jan 21 '17 at 21:00
  • @PostSelf: They just mean "looking up values using keys" (in maps) and "membership testing" (in sets). The point is, if you're using `count`/`find`/`operator[]` and/or regularly doing things that implicitly need to perform a lookup (like insertion, which has to figure out where to put the item), then you will prefer `set`/`map` and their `unordered_` equivalents, because `O(log n)`/`O(1)` work beats `O(n)` work once the sizes become meaningful. – ShadowRanger Jan 29 '19 at 15:00
  • why is `std::array` not included in this chart? – rehctawrats Oct 20 '20 at 15:03
46

Simple answer: use std::vector for everything unless you have a real reason to do otherwise.

When you find a case where you're thinking, "Gee, std::vector doesn't work well here because of X", go on the basis of X.

YSC
  • 38,212
  • 9
  • 96
  • 149
David Thornley
  • 56,304
  • 9
  • 91
  • 158
  • 1
    However .. be careful not to delete / insert items when iterating ... use const_iterator as far as possible to avoid this .. – vrdhn Mar 20 '12 at 16:13
  • 12
    Hmm... I think people are over-using vector. The reason is, that the "doesn't work"-case won't happen easily - so people stick to the most often used container and misuse it for storing lists, queues, ... In my oppinion - which matches the flowchart - one should choose the container based on the intended use instead of applying the "one seems to fit all". – Black Apr 13 '12 at 13:50
  • 15
    @Black Point is, vector is usually faster even on operations that in theory should work slower. – Bartek Banachewicz Feb 07 '13 at 00:30
  • @Black I believe that the *doesn't work cases* comes up easily only when we consider *optimization*. – eonil Feb 18 '14 at 19:01
  • 2
    @Vardhan `std::remove_if` is almost always superior to the "delete during iteration" approach. – fredoverflow Apr 29 '14 at 06:35
  • Yes! Good answer. The flowcharts are misleading. – towi Oct 18 '16 at 17:58
  • 1
    Some benchmarks would really help this discussion to be less subjective. – Felix D. Dec 08 '19 at 13:02
  • Use `std::vector` for any list-type-thing, and `std::unordered_map` for any map-or-set-type-thing, unless you have a real reason to do otherwise. – Mooing Duck Aug 09 '20 at 20:25
  • @FelixD.: I think this isn't a benchmark-type thing so much as "In my 10+ years of programming experience, 90% of the time the right containers are `std::vector`, and 7% of the time the right container is `std::unordered_map`, and the remaining 3% is all the others. – Mooing Duck Aug 09 '20 at 20:28
  • https://dzone.com/articles/c-benchmark-%E2%80%93-stdvector-vs#:~:text=Whatever%20the%20data%20size%20is,allocate%20memory%20for%20each%20element. is one of many many benchmarks showing `std::vector` is far faster than `std::list` for random inserts/removals (unless it's used as a secondary data structure). The _only_ benchmarks that standalone `std::list` won were push_front, and sorting of 1KB objects. – Mooing Duck Aug 09 '20 at 20:30
11

Look at Effective STL by Scott Meyers. It's good at explaining how to use the STL.

If you want to store a determined/undetermined number of objects and you're never going to delete any, then a vector is what you want. It's the default replacement for a C array, and it works like one, but doesn't overflow. You can set its size beforehand as well with reserve().

If you want to store an undetermined number of objects, but you'll be adding them and deleting them, then you probably want a list...because you can delete an element without moving any following elements - unlike vector. It takes more memory than a vector, though, and you can't sequentially access an element.

If you want to take a bunch of elements and find only the unique values of those elements, reading them all into a set will do it, and it will sort them for you as well.

If you have a lot of key-value pairs, and you want to sort them by key, then a map is useful...but it will only hold one value per key. If you need more than one value per key, you could have a vector/list as your value in the map, or use a multimap.

It's not in the STL, but it is in the TR1 update to the STL: if you have a lot of key-value pairs that you're going to look up by key, and you don't care about their order, you might want to use a hash - which is tr1::unordered_map. I've used it with Visual C++ 7.1, where it was called stdext::hash_map. It has a lookup of O(1) instead of a lookup of O(log n) for map.

Mark Krenitsky
  • 347
  • 2
  • 7
  • I've heard a couple of anecdotes now suggesting that Microsoft's `hash_map` isn't a very good implementation. I hope they did better on `unordered_map`. – Mark Ransom Jun 20 '11 at 19:01
  • 3
    Of lists - "you can't sequentially access an element." - I think you mean you can't random-access or index directly to an element.... – Tony Delroy Jul 08 '12 at 04:12
  • ^ Yes, because sequential access is precisely what a `list` does. Rather glaring error there. – underscore_d Nov 28 '15 at 13:01
7

I redesigned the flowchart to have 3 properties:

  1. I think STL containers are devided to 2 main classes. The basic containers and those leverages the basic containers to implement a policy.
  2. At first the flowchart should divide the decision process to the main situations we should decide on and then elaborate on each case.
  3. Some extended containers have the possibility of choosing different basic container as their inner container. The Flowchart should consider the situations in which each of the basic containers can be used.

The flowchart: enter image description here

More info provided in this link.

Ebrahim
  • 109
  • 1
  • 3
  • Hmmm, I think your `std::array` should be `std::unique_ptr`. Quick summary: `vector` has variable size, `unique_ptr` has size determined at time of creation, `array` requires its size to be a compile-time constant. – Ben Voigt Aug 14 '20 at 21:04
5

An important point only briefly mentioned so far, is that if you require contiguous memory (like a C array gives), then you can only use vector, array, or string.

Use array if the size is known at compile time.

Use string if you only need to work with character types and need a string, not just a general-purpose container.

Use vector in all other cases (vector should be the default choice of container in most cases anyway).

With all three of these you can use the data() member function to get a pointer to the first element of the container.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Jonathan Wakely
  • 166,810
  • 27
  • 341
  • 521
3

It all depends on what you want to store and what you want to do with the container. Here are some (very non-exhaustive) examples for the container classes that I tend to use most:

vector: Compact layout with little or no memory overhead per contained object. Efficient to iterate over. Append, insert and erase can be expensive, particularly for complex objects. Cheap to find a contained object by index, e.g. myVector[10]. Use where you would have used an array in C. Good where you have a lot of simple objects (e.g. int). Don't forget to use reserve() before adding a lot of objects to the container.

list: Small memory overhead per contained object. Efficient to iterate over. Append, insert and erase are cheap. Use where you would have used a linked list in C.

set (and multiset): Significant memory overhead per contained object. Use where you need to find out quickly if that container contains a given object, or merge containers efficiently.

map (and multimap): Significant memory overhead per contained object. Use where you want to store key-value pairs and look up values by key quickly.

The flow chart on the cheat sheet suggested by zdan provides a more exhaustive guide.

Bids
  • 2,422
  • 18
  • 26
  • "Small memory overhead per contained object" is not true for list. std::list is implemented as doubly linked list and so it maintains 2 pointer per stored object which is not to neglect. – Hanna Khalil Nov 16 '16 at 11:33
  • I would still count two pointers per stored object as "small". – Bids Nov 16 '16 at 16:18
  • compared to what? std::forward_list is a container that was mainly suggested to have less meta-data stored per object (only one pointer). While std::vector holds 0 meta data per object. So 2 pointers is not negotiable compared to other containers – Hanna Khalil Nov 17 '16 at 20:38
  • It all depends on the size of your objects. I've already said that vector has a "Compact layout with little or no memory overhead per contained object". I would still say list has a small memory overhead compared to set and map, and a slightly larger memory overhead than vector. I'm not really sure what point you are trying to make TBH! – Bids Nov 18 '16 at 12:44
  • All the mode based containers tend to have significant overhead due to dynamic allocation, which rarely comes for free. Unless of course you are using a custom allocator. – MikeMB Aug 27 '17 at 05:52
  • Again, it all depends what you are storing. In my experience (which is mostly 3D scientific imaging applications where we are dealing with multi-GB datasets), it is very rare that dynamic memory allocation speed becomes the bottleneck. Choosing an STL container based on an assumption of dynamic memory performance without profiling is almost certainly going to be premature optimisation. If and when you have a performance issue, you need to profile to find out what your bottleneck is, and only then start optimising. – Bids Aug 29 '17 at 08:31
2

One lesson I've learned is: Try to wrap it in a class, since changing the container type one fine day can yield big surprises.

class CollectionOfFoo {
    Collection<Foo*> foos;
    .. delegate methods specifically 
}

It doesn't cost much up front, and saves time in debugging when you want to break whenever somebody does operation x on this structure.

Coming to selecting the perfect data structure for a job:

Each data structure provides some operations, which can be varying time complexity:

O(1), O(lg N), O (N), etc.

You essentially have to take a best guess, on which operations will be done most, and use a data structure which has that operation as O(1).

Simple, isn't it (-:

Dominic Hofer
  • 5,751
  • 4
  • 18
  • 23
vrdhn
  • 4,024
  • 3
  • 31
  • 39
  • 5
    Isn't this why we use iterators? – Platinum Azure Mar 20 '12 at 14:20
  • @PlatinumAzure Even iterators should be member typedef .. If you change the container type you also have to go and change all the iterator definitions ... this did got fixed in c++1x though ! – vrdhn Mar 20 '12 at 16:13
  • That may be true, but the `.begin()` and `.end()` methods (and their ilk) should be reasonably immune to that change. – Platinum Azure Mar 20 '12 at 17:18
  • 4
    For the curious one, this is the fix in C++11: `auto myIterator = whateverCollection.begin(); // <-- immune to changes of container type` – Black Apr 13 '12 at 13:55
  • 1
    Would a `typedef Collection CollectionOfFoo;` be sufficient? – Craig McQueen Apr 12 '13 at 01:13
  • 5
    It's quite unlikely that you can just change your mind later and simply delegate to a different container: [Beware the illusion of container-independent code](http://ptgmedia.pearsoncmg.com/images/0201749629/items/item2-2.pdf) – fredoverflow Apr 29 '14 at 06:38
  • @Vardhan Member `using`/`typedef` is preferable to the entirely new class you proposed here. This is precisely why there's a defined spec for methods in STL containers. typedefs and `auto` are here to help. – underscore_d Nov 28 '15 at 13:05
2

I answered this in another question which is marked as dup of this one. But I feel that it is nice to refer to some good articles regarding the decision to choose a standard container.

As @David Thornley answered, std::vector is the way to go if there are no other special needs. This is the advice given by the creator of C++, Bjarne Stroustrup in a 2014 blog.

Here is the link for the article https://isocpp.org/blog/2014/06/stroustrup-lists

and quote from that one,

And, yes, my recommendation is to use std::vector by default.

In the comments, user @NathanOliver also provides another good blog, which has more concrete measurements. https://baptiste-wicht.com/posts/2012/12/cpp-benchmark-vector-list-deque.html .

CS Pei
  • 10,869
  • 1
  • 27
  • 46
1

I expanded on Mikael Persson's fantastic flowchart. I added some container categories, the array container, and some notes. If you'd like your own copy, here is the Google Drawing. Thanks, Mikael for doing the groundwork! C++ Container Picker

John DiFini
  • 96
  • 2
  • 6