
I have the following code to profile the performance of std::vector vs. std::list for various N.

#include <chrono>
#include <cstddef>
#include <iostream>
#include <list>
#include <sstream>
#include <vector>

void vectorPerf(size_t n)
{
   std::chrono::high_resolution_clock::time_point t1 = std::chrono::high_resolution_clock::now();
   std::vector<size_t> cache;
   for (size_t i = 0; i < n; i++) {
      cache.push_back(i);
   }
   std::chrono::high_resolution_clock::time_point t2 = std::chrono::high_resolution_clock::now();

   auto duration = std::chrono::duration_cast<std::chrono::microseconds>(t2-t1).count();

   std::cout << duration << " us." << "\n";

   return;
}

void listPerf(size_t n)
{
   std::chrono::high_resolution_clock::time_point t1 = std::chrono::high_resolution_clock::now();
   std::list<size_t> cache;
   for (size_t i = 0; i < n; i++) {
      cache.push_back(i);
   }
   std::chrono::high_resolution_clock::time_point t2 = std::chrono::high_resolution_clock::now();

   auto duration = std::chrono::duration_cast<std::chrono::microseconds>(t2-t1).count();

   std::cout << duration << " us." << "\n";

   return;
}

int main(int argc, char** argv)
{

   if (argc != 2) {
      return -1;
   }
   std::stringstream ss(argv[1]);
   size_t n;
   ss >> n;
   vectorPerf(n);
   std::cout << "\n";
   listPerf(n);
   std::cout << "\n";
}

Across multiple runs with the same set of N, the timings are consistently of the same order as the numbers below (all times in microseconds):

N       std::vector  std::list
1000    46           116
2000    85           217
5000    196          575
10000   420          1215
100000  3188         10497
1000000 34489        114330

What I am wondering is why the performance of std::list is consistently worse than std::vector. For std::vector I would have expected the performance to be amortized O(N), because the internal buffer underpinning std::vector needs to be resized from time to time. Since all I am doing is inserting elements at the end of the list, I would have expected std::list to be O(1). That would suggest std::list should do better than std::vector, but the opposite seems to be true.

I would appreciate it if someone could shed some light on why this is happening. I am compiling with g++ on macOS 10.12.6 on a 2015 MBP.

Thanks.

EDIT: I now understand that std::vector::push_back() is amortized O(1). However, what is not clear to me is why does std::list::push_back() perform consistently worse than std::vector::push_back()?

Jai Prabhu
  • Thanks. The link suggests std::vector::push_back() is amortized O(1). That still does not explain to me why std::list::push_back() is so much worse than std::vector::push_back() – Jai Prabhu Dec 26 '17 at 01:34
    Because every insert to a list requires a memory allocation, while adding to a vector usually does not. – 1201ProgramAlarm Dec 26 '17 at 01:37
  • @1201ProgramAlarm An obvious optimisation for a list is to remove that requirement for a memory allocation, I would guess, but with a list there are also additional pointers to the next element that need to be set every time. Try your program with `std::vector/list cache(n)` to eliminate memory allocations for comparison. – Ken Y-N Dec 26 '17 at 01:43
  • One of the reasons to have different containers is that they don't have the same performance for different operations. So depending on the usage (and the size of the data), one has to select the appropriate container. The simple rule is to use `std::vector` in most situations. – Phil1970 Dec 26 '17 at 01:53
  • Look at the ratio between the times taken: the ratio between vector and list stays pretty much constant, around 1:3. So it is not a problem of complexity but rather of the constant factor, which Big-O notation ignores. – lamandy Dec 26 '17 at 02:03
    The effects of repeated `push_back()`s are that the code ends up writing to consecutive memory addresses. That's what a vector is all about. Modern CPUs are just very, very good at accessing consecutive memory addresses, in sequence. With a std::list, each element ends up getting allocated in random chunks of memory, hence the performance profile is worse. – Sam Varshavchik Dec 26 '17 at 02:07
    Here are some more benchmarks: https://baptiste-wicht.com/posts/2012/12/cpp-benchmark-vector-list-deque.html – drescherjm Dec 26 '17 at 02:19
  • @JaiPrabhu I'm looking, but I don't see the compiler options you used. Did you run an optimized build? When posting questions concerning performance, one of the requirements should be the compiler options used – PaulMcKenzie Dec 26 '17 at 02:41
  • @JaiPrabhu [Different results here](http://coliru.stacked-crooked.com/a/a6ecc388e979d996). Did you specify `-O2` or `-O3` on the command-line, thus turning on optimizations? – PaulMcKenzie Dec 26 '17 at 02:47
  • @PaulMcKenzie I did not use optimization. Point taken about posting compiler options. I only specified C++11 (--std=c++11) as an option. In your results too, the vector performs better than the list, right? – Jai Prabhu Dec 26 '17 at 03:59

1 Answer


Vector insertion at the end is amortized constant time (that is, amortized O(1)) rather than amortized O(n), where the list is always O(n). A vector's capacity grows faster than its actual size in this circumstance: it allocates extra space up front and then fills it as it goes, so an allocation is not needed for every push_back.

SoronelHaetir
    From cppref.com: "std::list is a container that supports constant time insertion and removal of elements from anywhere in the container." Does this contradict your "list is always O(n)" statement? – 2785528 Dec 26 '17 at 02:11