0

Say I have a situation where I'm populating a vector/list iteratively, and I don't know how many elements there will be beforehand. For example, the code might look more or less like this:

std::vector<MyClass> mythings;
for(int i = 0; i < *some number*; ++i){
    if(*some condition here*){
        mythings.push_back(MyClass());
    }
}

I know that push_back isn't very efficient when used for std::vector. So in the above chunk of code I could change mythings to a std::list, and presumably that would be faster, since std::list is implemented as a linked list. However, say that I know that I'll need to do random access on the list/vector I just created. For random access, std::vector would be much faster. So using std::vector and std::list both have their disadvantages in this scenario.

In this situation, would it be more efficient for me to use a std::list when creating the list/vector, and then convert it to a std::vector afterwards (for example, as in this question: One liner to convert from list<T> to vector<T>) so that I can take advantage of the random access of std::vector? Or would the cost of the conversion outweigh the gain?

dfriend21
  • 89
  • 7
  • 1
    If `*some number*` is known, using `std::vector` with pre-allocation via `reserve()` may be better. – MikeCAT Feb 20 '21 at 00:40
  • 2
    How about doing some experiments? – MikeCAT Feb 20 '21 at 00:41
  • 1
    Why isn't `push_back` efficient? Is this code really your program's bottleneck after profiling? If not, you may be prematurely trying to optimize it. Approximately how many elements are we talking about with `* some number *` here? – ggorlen Feb 20 '21 at 00:42
  • @ggorlen in my situation `*some number*` could be as big or bigger than 1,000,000, and the condition likely won't be true very often - which is why I'm hesitant to pre-allocate as suggested by MikeCAT – dfriend21 Feb 20 '21 at 00:49
  • @dfriend21 So, you **do** know `*some number*` beforehand? – eerorika Feb 20 '21 at 00:54
  • 2
    It depends. Possible things to try are preallocating the vector, using `emplace_back` instead of `push_back` and/or using `std::move` to add to `mythings`, use `std::deque`, or iterating thru once to count the number of things that will be added, preallocating, then iterating thru to add them. – 1201ProgramAlarm Feb 20 '21 at 00:54
  • 1
    @dfriend21 I assume `push_back` forces a non-preallocated vector to make at most log(n) reallocation calls. Is 20 reallocations really destroying your app's performance that badly? Recommended reading: [Eric Lippert: performance rant](https://ericlippert.com/2012/12/17/performance-rant/) – ggorlen Feb 20 '21 at 00:56
  • @eerorika yes, but I don't know how often the condition will be `true` – dfriend21 Feb 20 '21 at 00:57
  • @ggorlen Honestly in my situation I don't think it'll make a huge difference either way (although I'd need to run tests to be sure). It was just something that occurred to me, and it seemed like an interesting concept. Also, I'm no C++ expert so I wasn't sure if it was a stupid idea or not, so I thought I'd run it past people who know more than I do - it sounds like the answer is "it depends" - so at least I know it wasn't a totally absurd idea :) – dfriend21 Feb 20 '21 at 01:03
  • 1
    Re: "presumably [`std::list`] would be faster" for insertions -- not necessarily. Be careful about making assumptions based on simplifications. `std::list` does a fair amount of work for **every** insertion, while `std::vector` does a lot of work **once in a while**. If speed of insertions is the most important factor in your application, measure the times involved. If speed of insertions is not a limiting factor for performance, just use `std::vector` so you can get random access. – Pete Becker Feb 20 '21 at 15:57

1 Answers1

1

In this situation, would it be more efficient for me to use a std::list

Probably not. This would typically be more expensive than pushing to the vector directly.

Or would the cost of the conversion outweigh the gain?

The high cost of the list in general is mostly the problem.


To find out whether these reasonable assumptions are actually true, you should measure them. If you measure one to be significantly faster, then that may be the more efficient choice.

Bonus hint: Don't push_back, but emplace_back. That can be a bit faster depending on the class.

Here are alternatives that I recommend measuring:

  • Just a simple vector and emplace.
  • Reserve vector with *some number*, shrink_to_fit after filling the vector.
  • Maybe try deque instead.
eerorika
  • 232,697
  • 12
  • 197
  • 326