6

I have been running a C++ program for scientific purposes and I am now looking at optimizing it.

The bottleneck seems to be a function where I need to stack pairs of integers. Their number is impossible to know in advance, and I have been using a std::vector of a custom struct holding two ints. Is there a more efficient container for repeatedly appending elements at the end? And should I fill it with two separate ints rather than a pair or custom struct?

edit: After timing and profiling my program, I can say that, for my use, vector is slightly faster than deque (by only 3%). My layman's conclusion is that the CPU makes good use of the contiguity of the data. Optimization is more than ever a sorcery for me! And to those it may help: I significantly improved my running time by switching from the C++11 standard library random number generator to the Boost one.

Learning is a mess
  • 7,479
  • 7
  • 35
  • 71
  • 2
    Related: http://stackoverflow.com/questions/471432/in-which-scenario-do-i-use-a-particular-stl-container – EdChum Jun 09 '15 at 15:07
  • A fast one `std::list` – Bastien Jun 09 '15 at 15:07
  • 4
    How do you need to access the data? Do you need to access it from the front, back, or in between? This information is important to determining the container to use. – Joseph Hutchinson Jun 09 '15 at 15:09
  • 3
    There's `std::pair` for holding a pair of integers. You don't need to make your own struct to hold that. As for picking a data structure for fast appending, there are whole volumes on that. Ultimately, though, the theoretically fast (e.g. linked list) can be destroyed by modern CPU caching strategies, so you'll probably have to experiment with various containers and benchmark until you get a solution you're happy with. – Andrew Jun 09 '15 at 15:11
  • 2
    Do you have a good guess as to how many elements you will have? If you do you can reserve that much and then you will only get a reallocation in vector once you grow beyond that. Since most implementations grow geometrically that should cut down on the number of re-allocations. – NathanOliver Jun 09 '15 at 15:12
  • 4
    `std::vector<std::pair<int, int>>` is good, try using `reserve` strategically with known resource estimates, and avoid destroying and reconstructing the vector. – Kerrek SB Jun 09 '15 at 15:13
  • 1
    Using `emplace_back(a, b)` can be considerably faster than `push_back(mystruct(a, b))` in case you're not using it. – Galik Jun 09 '15 at 15:34
  • @KerrekSB: Is there that much overhead in destroying/constructing the vector? compared to emptying it? – Learning is a mess Jun 09 '15 at 15:59
  • 3
    @bastien Not necessarily. An `std::list` will incur dynamic memory allocations for each node, and your data may be scattered all over your memory. `std::vector` has an amortized O(1) insertion at the end of the vector, and you'll have better memory locality when accessing the elements in order. (CPUs love accessing memory in order....) – Andre Kostur Jun 09 '15 at 17:23
  • @AndreKostur /bastien Rather than "not necessarily" I'd say "almost certainly not." – gd1 Jun 09 '15 at 18:04
  • @Learningisamess : indeed. Emptying a vector usually involves no deallocation of the storage memory, and if the elements of the vector are POD types, the vector doesn't even have to call their destructors. Emptying a vector of integers is as simple as setting its element count to zero. – gd1 Jun 09 '15 at 18:07
  • 1
    @gd1: You are totally right, I had to check that the capacity of `vector` does not change after clearing to convince myself. And my program runs faster now that I am not constructing/destructing it at each call. Thank you for pointing this out! – Learning is a mess Jun 11 '15 at 14:21
  • 1
    @Learningisamess : KerrekSB deserves some credit for suggesting that idea. Anyway yes, I think that a `std::vector` of integers (or PODs) is one of the most useful data structures available. It's damn fast if you reuse it. – gd1 Jun 11 '15 at 14:35

2 Answers

10

The cost of the logarithmic number of reallocations you have to do is arguably irrelevant. However, you may consider using a std::deque that guarantees O(1) insertion at the front and at the end. You would lose contiguity, but some cache friendliness is kept. A deque is usually a decent trade-off, especially if you need to pop from the front.

Consider also estimating the number of elements the vector will store and using reserve. Be careful, however, not to waste too much memory, or you'll get the opposite effect.

gd1
  • 11,300
  • 7
  • 49
  • 88
  • 2
    A nice study on this : http://www.codeproject.com/Articles/5425/An-In-Depth-Study-of-the-STL-Deque-Container – norisknofun Jun 09 '15 at 15:14
  • 1
    Thanks for the suggestion, I will benchmark both and see if there is improvement. I had thought about calling reserve, but then I realized that I had no idea about the distribution or mean of the final size, and using a very large value was inefficient when timed. Many thanks!! – Learning is a mess Jun 09 '15 at 15:57
  • 1
    Remember to follow @KerrekSB advice as well, i.e. reuse the vector if possible. – gd1 Jun 09 '15 at 16:07
2

As already mentioned by gd1, a std::deque (double-ended queue) allows fast insertion and deletion at both ends. Additionally, insertion and deletion at either end of a deque never invalidates pointers or references.

In your case you could, for example, use a std::deque of std::pair (defined in <utility>) to suit your needs:

// Small example:
#include <deque>
#include <iostream>
#include <utility>

int main() {
    std::deque<std::pair<int, int>> deq;
    deq.push_back({1234, 4321});
    // 'at' performs bounds checking and may throw std::out_of_range,
    // so it should normally be called inside a try block.
    std::cout << deq.at(0).first << " " << deq.at(0).second << '\n';
}
Andreas DM
  • 10,685
  • 6
  • 35
  • 62