Modern CPUs are best at sequential access. Iterating over an array in order, in either direction, gives the fastest access times because hardware prefetchers are designed to recognize such access patterns and pull the data into the L1 cache, ideally before you start reading it.
In other words, std::vector<> and std::array<> are the fastest containers for sequential iteration.
See the "Intel® 64 and IA-32 Architectures Optimization Reference Manual", section 2.2.5.4 "Data Prefetching", for more details:
... The goal of the prefetchers is to automatically predict which data the program is about to consume. If this data is not close-by to the execution core or inner cache, the prefetchers bring it from the next levels of cache hierarchy
and memory. Prefetching has the following effects:
• Improves performance if data is arranged sequentially in the order used in the program.
• May cause slight performance degradation due to bandwidth issues, if access patterns are sparse instead of local.
• On rare occasions, if the algorithm's working set is tuned to occupy most of the cache and unneeded prefetches evict lines required by the program, hardware prefetcher may cause severe performance degradation due to cache capacity of L1.
Data Prefetch to L1 Data Cache
Data prefetching is triggered by load operations when the following conditions are met:
• Load is from writeback memory type.
• The prefetched data is within the same 4K byte page as the load instruction that triggered it.
• No fence is in progress in the pipeline.
• Not many other load misses are in progress.
• There is not a continuous stream of stores.