16

After having read this question and looking at some results here, it seems like one should altogether completely avoid lists in C++. I always expected that linked lists would be the containers of choice for cases where I only need to iterate over all the contents because insertion is a matter of pointer manipulation and there is never a need to reallocate.

Apparently, because of "cache locality," lists are iterated over very slowly, so any benefit from having to use less reserve memory or faster addition (which is not that much faster, it seems, from the second link) doesn't seem worth it.

Having said that, when should I, from a performance standpoint, use std::list over std::deque or, if possible, std::vector?

On a side note, will std::forward_list also have lots of cache misses?

Community
  • 1
  • 1
VF1
  • 1,594
  • 2
  • 11
  • 31
  • 7
    You can use a list even with uncopyable, unmovable types (and still be able to insert in the middle). – Kerrek SB Aug 26 '13 at 16:51
  • Take a look here http://stackoverflow.com/questions/471432/in-which-scenario-do-i-use-a-particular-stl-container – olevegard Aug 26 '13 at 16:52
  • 3
    @Borgleader: That is just a theoretical question, which need not be appropriate for practical implementations. That is, on paper insertion in the middle of a list is `O(1)` and insertion in the middle of a vector is `O(n)`, so the first should be preferred... except that the hidden constant in `O(1)` might be larger than `O(n)`. Quoting Stepanov `O(log N)` for a couple hundred million elements is a small number, so you can call it constant time with a somehow large factor. The same applies here... `N*very_small` might be smaller than `1*very_large` – David Rodríguez - dribeas Aug 26 '13 at 16:58
  • I'm using `std::list` in a situation where the allocated objects need to be pinned to their memory location with a custom allocator that adds metadata to each link. It's probably the only time I don't think I could have used something else instead. – zneak Aug 26 '13 at 17:03
  • @DavidRodrigues given your presence [here](http://stackoverflow.com/questions/322715/when-to-use-linkedlist-over-arraylist), do you mind answering if you think this applies to Java? At that high of a level, does `LinkedList<>` introduce additional iteration cost? – VF1 Aug 26 '13 at 17:07
  • @DavidRodríguez-dribeas Some of the answers, in particular the accepted one about concurrent data structures, is quite practical. It's just not really applicable to the specific linked list variant that is `std::list`, because some of its assumptions (including being C++, if you want to call that an assumption) and abstractions clash with the requirements for concurrent modifications. –  Aug 26 '13 at 17:08
  • 1
    VF1, In Java you have to opt between the bad and the worse, since none of those containers provide locality of data (Java using references for everything). If you care about locality of data, choose a different language. – David Rodríguez - dribeas Aug 26 '13 at 17:11
  • @DavidRodriguez I did, hehe... – VF1 Aug 26 '13 at 17:13
  • @delnan: Well, concurrency is out of the question with `std::list`, so it does not weight in the decision of `std::list` vs. `std::vector`, so again it is a theoretical question vs. a practical concrete one. – David Rodríguez - dribeas Aug 26 '13 at 17:15
  • 1
    @DavidRodríguez-dribeas You may have a different definition of theoretical. I'd say it is just not applicable here. I would not say it is a theoretical question, perhaps more *general* because it's about all kinds of linked lists in any kind of language, not about `std::list` in C++ specifically. As a consequence, some answers to *that* question don't apply to this question because they don't meet the criteria, but that doesn't make them theoretical, it just makes them not relevant. Other answers are pretty much like the answers we're seeing here, only less C++ specific in their terminology. –  Aug 26 '13 at 17:21
  • 1
    To add my two cents, I would also point out that one of the viable alternatives is also a vector of pointers (e.g., `std::vector>` or `boost::stable_vector`). This solves most of the expensive-to-copy issues while retaining random-access, and getting persistent positions (in actual memory). But you get locality of reference issues comparable to linked-lists, but less memory overhead. Also, most things that are expensive to copy and are cheap to move (C++11). I don't find `std::list` very useful at all in actual implementations, can't recall the last time I chose to use it. – Mikael Persson Aug 26 '13 at 17:23
  • @delnan: The tags in the other question are [language-agnostic] [data-structure] [linked-list]. That makes it theoretical. The first answer is (in a generic way) that there can be an impact on concurrent implementations, relevant in a theoretical way, not in a practical way for `std::list` (*If `std::list` were concurrent, that would be an advantage* it isn't). But I think we are just arguing the terms, if you prefer not applicable here, I am fine by that. – David Rodríguez - dribeas Aug 26 '13 at 17:46
  • 5
    From a performance standpoint, the best reason to use std::list is so that when later on your app is suffering badly for performance and needs "an optimziation pass", you can replace it with another data structure and look like a hero. – Crashworks Aug 26 '13 at 18:47
  • [Number crunching: Why you should never, ever, EVER use linked-list in your code again](https://kjellkod.wordpress.com/2012/02/25/why-you-should-never-ever-ever-use-linked-list-in-your-code-again/) – phuclv Sep 10 '15 at 08:15
  • i imagine that if you are doing active rendering where race to sleep is impossible, and doing a calculation which is done either once or on user demand, then using linked lists greatly simplify your code. that said they are one of the worst thing you can do inside loops unless they are stack linked lists that are linked with scopes in depth. When designing algorithms that insert values between elements, especially many at once, for maintainability, such as tessellation of triangles, lists get proofs of concepts done significantly faster. – Dmytro Nov 29 '17 at 16:48

4 Answers4

23

std::list is useful in a few corner cases.

However, the general rule of C++ sequential containers is "if your algorithm are compatible, use std::vector. If your algorithms are not compatible, modify your algorithms so you can use std::vector."

Exceptions exist, and here is an attempt to exhaustively list reasons that std::list is a better choice:

  1. When you need to (A) insert into the middle of the container and (B) you need each objects location in memory to be stable. Requirement (B) can usually be removed by having a non-stable container consisting of pointers to elements, so this isn't a strong reason to use std::list.

  2. When you need to (A) insert or delete in the middle of the container (B) orders of magnitude more often than you need to iterate over the container. This is also an extreme corner case: in order to find the element to delete from a list, you usually need to iterate!

Which leads to

  1. you need (A) insert or delete in the middle of the container and (B) have iterators to all other elements remain valid. This ends up being a hidden requirement of case 1 and case 2: it is hard to delete or insert more often than you iterate when you don't have persistant iterators, and the stability of iterators and of objects is highly related.

  2. the final case, is the case of splicing was once a reason to use std::list.

Back in C++03, all versions of std::list::splice could (in theory) be done in O(1) time. However, the extremely efficient forms of splice required that size be an O(n) operation. C++11 has required that size on a list be O(1), so splice's extreme efficiency is limited to the "splicing an entire other list" and "self-splicing a sublist" case. In the case of single element splice, this is just an insert and delete. In the case of sub range splice, the code now has to visit every node in the splice just to count them in order to maintain size as O(1) (except in self-splicing).

So, if you are doing only whole-list splices, or self-list subrange splices, list can do these operations much faster than other non-list containers.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
  • 3
    +1 for mentioning stable containers. And case 2-3 is quite a corner case indeed because for the most part this case describes a situation where you need individual dynamically allocated objects (i.e., "persistent iterators" become "(smart-)pointers"), with the only difference being the need for *occasional* traversals. These *occasional* traversals are rarely worth the overhead (previous / next pointers) as opposed to context-specific alternatives (e.g., traversing the objects that contain the persistent "pointers"). – Mikael Persson Aug 26 '13 at 17:38
19

When should I, from a performance standpoint, use std::list

From a performance standpoint, rarely. The only situation that comes to mind is if you have many lists that you need to split and join to form other lists; a linked list can do this without allocating memory or moving objects.

The real benefit of list is stability: elements don't need to be movable, and iterators and references are never invalidated unless they refer to an element that's been erased.

On a side note, will std::forward_list also have lots of cache misses?

Yes; it's still a linked list.

Mike Seymour
  • 249,747
  • 28
  • 448
  • 644
  • Like Yakk mentioned in his answer, if my design requires a container that keeps its objects in place, I should just be using pointers anyway (from both a practical implementation and design perspective). – VF1 Aug 26 '13 at 17:25
  • @VF1: Yes, there are other ways to get a container of stable objects (or pointers thereto). `list` has the slight advantage of managing the objects' lifetime, so you don't need to mess around with smart pointers or extra containers for the objects. – Mike Seymour Aug 26 '13 at 17:35
9

The biggest improvement that std::list can provide is when you're moving one or more elements from the middle of one list into another list. This splice operation is extremely efficient on list while it may involve allocation and movement of items in random access containers such as vector. That said, this comes of very rarely and much of the time vector is your best container in terms of performance an simplicity.

Always profile your code if you suspect performance problems with a container choice.

Mark B
  • 95,107
  • 10
  • 109
  • 188
  • Forgot about `splice`! Yes, I suppose for certain purposes, where an operation like that is needed, it would be better to use `list`. – VF1 Aug 26 '13 at 17:00
  • 2
    Splice is much slower than it could be in C++03, because `list` requires an O(1) `size` in C++11. Now, I'm not sure if anyone actually wrote an O(n) `size` list, but ... – Yakk - Adam Nevraumont Aug 26 '13 at 17:06
  • 2
    @Yakk: yes, such a shame too. `splice` was one of the best reason to use lists and it got crippled in C++11 *just* so that `size` could be O(1)... I would have much preferred removing `size` altogether or changing its name to avoid inadvertent use. – Matthieu M. Aug 26 '13 at 17:38
  • @MatthiewM. While it would have made the standard more verbose, a kind of `size` being dirty would solve the problem. That way, `splice` could be O(1), and `size` could be O(1), but `splice` *then* `size` would be O(n). The downside is that this O(n) cost would be born by both lists, and not limited to the amount of spliced data. I cannot think of a practical way to have a `size` with efficient lazy splice-accounting in it... – Yakk - Adam Nevraumont Aug 26 '13 at 19:04
7

You chose std::list over std::deque and std::vector (and other contiguous memory containers) when you frequently add/remove items in the middle of your container, see also.

Community
  • 1
  • 1
cpp
  • 3,743
  • 3
  • 24
  • 38
  • 4
    Although in that case, you should first consider whether you *need* to insert in the middle, or to preserve the order when you erase from the middle; and if you do, whether the container will be large enough that a list is actually faster. – Mike Seymour Aug 26 '13 at 17:07
  • 7
    Also, this is only true if you already have an iterator for the position where you want to add/remove, or have a cheap way to get one. –  Aug 26 '13 at 17:10