19

I'd like to use std::forward_list

Because:

Forward list is a container which supports fast insertion and removal of elements from anywhere in the container

But there's no *std::forward_list::push_back* implementation.

Is there a high-performance way to add such support, or is there a reason not to?

Griwes
  • 8,805
  • 2
  • 43
  • 70
Alexander Abashkin
  • 1,187
  • 4
  • 13
  • 18
  • 4
    Note that the bidirectional `std::list` also supports "fast insertion and removal of elements from anywhere from the container", and has `push_back`. The cost is an extra pointer per entry. Is memory so tight that you can't use that? – Mike Seymour Jan 05 '12 at 12:41
  • Why do you need it? Do you want to grow your list both ways? Can't you use `push_front()` just as easily? – Andreas Frische Jan 05 '12 at 12:37
  • I want to keep sorting the list – Alexander Abashkin Jan 05 '12 at 12:43
  • @Mike Seymour, I believed that the singly-linked list is much faster)) – Alexander Abashkin Jan 05 '12 at 12:50
  • 2
    @AlexanderGuiness: If you want it sorted, maybe `set` or `multiset` might be a better choice? That maintains a sorted order, at the cost of slower insertion and removal (although inserting a single element will be quicker than inserting it into a list then resorting). – Mike Seymour Jan 05 '12 at 12:50
  • 1
    @AlexanderGuiness: The singly-linked list will be marginally faster for the operations it supports (since each operation has to update half as many pointers), but both have the same complexity orders. It will most likely be slower if you want to support a fast `push_back`. – Mike Seymour Jan 05 '12 at 12:52
  • @Mike Seymour, Thanks, I figured, and I'll use a simple `std::list` – Alexander Abashkin Jan 05 '12 at 12:53
  • @Alexander - *"I want to keep sorting the list..."* - If you continually sort the list, then I believe you turn the list into a ***O(n^2)*** algorithm. You get ***O(1)*** or ***O(n)*** insertion, ***O(n log n)*** sort, and you do it ***n*** times. ***O(n log n)*** * ***O(n)*** is bounded by ***O(n^2)***. (And if I recall my algorithm analysis class correctly). – jww Dec 10 '18 at 06:50
  • @Alexander - *"I want to keep sorting the list..."* - If you continually sort the list, then I believe you turn the list into a ***O(n^2)*** algorithm. You get ***O(1)*** or ***O(n)*** insertion, ***O(n log n)*** sort, and you do it ***n*** times. ***O(n log n)*** * ***O(n)*** is bounded by ***O(n^2)*** (or is it ***O(n^2 log n)***). (And if I recall my algorithm analysis class correctly). – jww Dec 10 '18 at 06:55

5 Answers

30

I recommend against std::forward_list just like I recommend against std::list in almost all situations. Personally, I've never found a situation in my code where a linked list was the best data structure.

In C++, your default go-to data collection should be std::vector. It gives you efficient push_back, if that is what you really need. It technically does not give you efficient deletion and insertion from the middle if you only look at abstract big-O complexity measurements of that one operation. In the real world, however, std::vector still wins even for inserting and deleting in the middle.

As an example, Bjarne Stroustrup ran a test of a 100,000 element std::list vs. std::vector. He would search each container for an element and delete it, then find an insertion point and insert into the middle. He could have used a binary search on the std::vector, but chose not to in order to make the comparison 'more fair'.

The results show a strong win for std::vector, even in this situation where std::list is supposed to be strong. Simply traversing the std::list takes so much longer because of how far apart in memory all of the objects are. std::list is not cache-friendly, which is possibly the most important thing for modern processors.

The complete talk by Bjarne Stroustrup

Thorough explanation of the effects, with benchmarks at multiple sizes

Note that this second link gives some situations where you may want to use a std::list, such as when the elements are large. However, I've been in a situation where I had many elements in a particular order and needed to delete some.

These elements were larger than any built-in type, but not huge (perhaps 20-30 bytes each on a 32-bit computer). The number of elements was large enough that the entire data structure was a few hundred MiB. The collection was a set of values that could theoretically be valid based on currently known information. The algorithm iterated over all elements and removed those that could no longer be valid based on new information, with each pass deleting somewhere around 80% of the remaining elements.

My first implementation was a straightforward std::vector approach in which I deleted invalid elements as I traversed. This worked for small test data sets, but when I tried it on the real data it was too slow to be useful. I switched to a std::list as the container but kept the same algorithm, and I saw a significant performance improvement; it was still too slow to be useful, though. The winning change was to switch back to a std::vector, but instead of deleting bad elements in place, I created a new std::vector, put every good element I found into it, and at the end of the function discarded the old std::vector and used the new one. This gave me about as much of a speed-up over the std::list as the std::list had given me over my original std::vector implementation, and it was just fast enough to be useful.
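
A minimal sketch of that final approach, with a hypothetical Element type and validity predicate standing in for the real ones:

#include <vector>

// Sketch of the "copy the survivors" pass described above. Element and
// the still_valid predicate are illustrative stand-ins for the real ones.
struct Element {
    char data[24];   // roughly the 20-30 bytes mentioned above
};

template <typename Pred>
void prune(std::vector<Element>& elements, Pred still_valid)
{
    std::vector<Element> survivors;
    survivors.reserve(elements.size() / 5);   // each pass keeps roughly 20%

    for (const Element& e : elements)
        if (still_valid(e))
            survivors.push_back(e);           // single linear, cache-friendly pass

    elements.swap(survivors);                 // discard the old vector wholesale
}

The erase-remove idiom (std::remove_if followed by erase) would achieve a similar single-pass compaction in place.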

David Stone
  • 26,872
  • 14
  • 68
  • 84
  • 1
    Or.. maintain a "dead" flag on each element that you update instead of deleting elements, only compact the vector when the number of dead elements reaches a certain limit. Will not work with primitive types in the vector of course, but very interesting answer, ta. – gbjbaanb May 23 '14 at 07:46
  • 3
    Nice rant, but I don't see how this answers the question at all. There are many situations where a singly linked list will do fine; in fact I think any implementation behind the scenes uses essentially a singly linked list of frames to represent the runtime stack within a single thread (the frames may be consecutive in memory, but are not equal-sized so cannot be addressed directly as in a vector). Typically if you just need a simple stack or queue structure, you can implement that easily using a list (and an additional pointer to its end in case of a queue). – Marc van Leeuwen Jun 18 '14 at 09:13
  • 2
    @MarcvanLeeuwen It solves the problem rather than answering the question. The problem is that they want a high performance data structure for insertion and removal. `std::forward_list` is rarely the data structure to achieve goals this broad. As for your specific example, `std::forward_list` doesn't help there, either (unless I am misunderstanding, which I may well be). If the frames are not equal sized, that implies dynamic allocation. A linked list of pointers is almost never a better data structure than a vector of pointers. – David Stone Jun 20 '14 at 03:32
  • 1
    The very best use I've found for a single-linked list is for atomic add/removal of items. The pointers can be updated using cmpxchg. Although I don't believe the std::forward_list does so, sadly. – Zan Lynx Sep 26 '14 at 21:58
  • Yes, multi-threaded code is the one place where a good linked list data structure can be useful, but `std::forward_list` is not that data structure. – David Stone Sep 28 '14 at 15:50
  • 1
    That technique of traversing the container, copying into a new container only what is to be retained, is how the modern Java garbage collectors work. – Raedwald Dec 01 '17 at 14:59
23

std::forward_list supports fast insertion and removal, but not traversal to the end. To implement .push_back, you'd first need to reach the end of the list, which is O(N) and not fast at all; that is probably why it's not provided.

You could find an iterator to the last element by incrementing the iterator returned by .before_begin() N times:

auto before_end = slist.before_begin();   // "one before the first element"
for (auto it = slist.begin(); it != slist.end(); ++it)
    ++before_end;                         // ends at the last element (or stays at before_begin() for an empty list)

and then use .insert_after or .emplace_after to insert the element:

slist.insert_after(before_end, 1234);   // appends 1234 after the last element
Mike Kinghan
  • 55,740
  • 12
  • 153
  • 182
kennytm
  • 510,854
  • 105
  • 1,084
  • 1,005
  • 19
    Of course, a `forward_list` could be amended by storing an extra iterator pointing to its end. That way, `push_back` could be done in O(1) time. – Fred Foo Jan 05 '12 at 12:35
  • So there's no reason to implement `push_back`? – Alexander Abashkin Jan 05 '12 at 12:36
  • 8
    @KennyTM It would still lack bidirectionality. – pmr Jan 05 '12 at 12:45
  • 2
    @KennyTM: no, `std::list` is doubly-linked. What I mean is store an extra pointer to the end in the head structure, exactly what Mike Seymour suggests. – Fred Foo Jan 05 '12 at 12:46
  • 1
    Or perhaps write `auto before_end = std::next(slist.before_begin(), std::distance(slist.begin(), slist.end()));` – ecatmur Apr 02 '13 at 14:23
  • @larsmans Why not just use the `queue` class instead of `forward_list`? – Logan Kling Jan 19 '16 at 01:06
  • std::forward_list is not *just* a singly-linked list. It is intended to also occupy the absolute minimum possible memory (in particular an empty list should be only one word). It doesn't even keep its own size, so it's certainly not going to waste space on tracking the tail of the list. – Chiara Coetzee Apr 15 '17 at 23:52
6

The point of std::forward_list is to be an ultra-stripped-down version of std::list, so it doesn't store an iterator to the last element. If you need one, you'll have to maintain it yourself, like so:

std::forward_list<int> a;

// insert_after returns an iterator to the newly inserted element,
// so 'it' always refers to the current last element.
auto it = a.before_begin();

for (int i = 0; i < 10; ++i)
    it = a.insert_after(it, i);   // appends 0..9 in order, each in O(1)
Rick Yorgason
  • 1,616
  • 14
  • 22
5

There is no push_back because the list doesn't keep track of the back of the list, only the front.

You could write a wrapper around the list that maintains an iterator to the last element, and implements push_back using either insert_after or push_front depending on whether the list is empty. This will get rather complicated if you want to support the more complex operations (e.g. sort and splice_after).
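
As a rough illustration, here is a minimal sketch of such a wrapper; the name back_list is made up, copying/moving is ignored, and almost all of forward_list's interface is omitted:

#include <forward_list>

template <typename T>
class back_list {
public:
    void push_front(const T& value) {
        bool was_empty = list_.empty();
        list_.push_front(value);
        if (was_empty)
            last_ = list_.begin();                  // the new element is also the last
    }

    void push_back(const T& value) {
        last_ = list_.insert_after(last_, value);   // O(1) append via the cached iterator
    }

private:
    std::forward_list<T> list_;
    typename std::forward_list<T>::iterator last_ = list_.before_begin();
};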

Alternatively, if you don't need push_back to be fast, it's straightforward to do it in linear time.

Unless memory is extremely tight, the best solution is to use list. This has the same performance characteristics as forward_list, and is bidirectional, supporting push_back as well as push_front; the cost is an extra pointer per element.

Mike Seymour
  • 249,747
  • 28
  • 448
  • 644
  • +1, just my idea. `sort` wouldn't be so hard, btw.; just sort the underlying `forward_list` and find the new end. Since sorting is O(*n* lg *n*), it dominates the end-finding. – Fred Foo Jan 05 '12 at 12:37
  • @larsmans: That's true. It might be trickier to `splice` in constant time though (if you're splicing a normal `forward_list` that doesn't track its end). – Mike Seymour Jan 05 '12 at 12:39
  • 1
    I don't believe it's possible to have both constant-time `size` and constant-time `splice`, which is why there was variation in the complexity of the operations on a `std::list`. "One or the other, but not both". That's also why there was a bit of grumbling about `std::list` getting constant-time guarantees for `size` in C++11. – David Stone Jul 07 '12 at 15:21
4

Unfortunately I can't add a comment (low reputation), but I just wanted to mention that one advantage of forward_list and list is that insert and erase operations do not invalidate iterators to other elements. I had an application where the collection of elements grew while it was being iterated over and individual elements were processed. Because no iterators were invalidated, I could implement a segment scan (the beginning of the list marks the start of the segment, and the beginning of the last segment marks its end).
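
A small sketch of the underlying property (the list grows while it is being traversed, and saved iterators stay valid); the values are illustrative, not the actual segment-scan logic:

#include <forward_list>
#include <iostream>

int main() {
    std::forward_list<int> items = {1, 2, 3};

    auto tail = items.begin();
    ++tail; ++tail;                              // iterator to the current last element

    // Growing the list during iteration invalidates nothing, so the
    // traversal simply continues into the newly appended element.
    for (auto it = items.begin(); it != items.end(); ++it) {
        std::cout << *it << '\n';                // prints 1, 2, 3, then the appended 4
        if (*it == 2)
            tail = items.insert_after(tail, 4);
    }
}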

Oleg Karasik
  • 959
  • 6
  • 17