Why might std::vector be faster than a raw dynamically allocated array?

Question

The result of a discussion with a colleague I ended up writing benchmarks to test std::vector vs raw dynamically allocated arrays, and ended up with a surprise.

My tests are as follows:

#include "testconsts.h" // defines NUM_INTS across all tests
#include <vector>

int main()
{
    const int numInts = NUM_INTS;
    std::vector<int>                     intVector( numInts );
    int * const                          intArray       = new int[ numInts ];

    ++intVector[0]; // force access to affect optimization
    ++intArray[0];  // force access to affect optimization

    for( int i = 0; i < numInts; ++i )
    {
        ++intArray[i];
    }

    delete[] intArray;
    return 0;
}

and:

#include "testconsts.h" // defines NUM_INTS across all tests
#include <vector>

int main()
{
    const int numInts = NUM_INTS;
    std::vector<int>                     intVector( numInts );
    int *                                intArray       = new int[ numInts ];

    ++intArray[0];  // force access to affect optimization
    ++intVector[0]; // force access to affect optimization

    for( int i = 0; i < numInts; ++i )
    {
        ++intVector[i];
    }

    delete[] intArray;
    return 0;
}

They are compiled with g++ -O3 with gcc 4.4.3

The results of multiple runs of benchmarking using time are similar to:

Array:

real    0m0.757s
user    0m0.176s
sys     0m0.588s

Vector:

real    0m0.572s
user    0m0.268s
sys     0m0.304s

Three things are clear:

Array is faster in user time
Vector is faster less system time
Over all vector won this fight

The question is "why?".

The system time issue I'm guessing must have to do with page faults, but I can't describe for myself exactly why one would have significantly more page faults.

As for the user time issue, it's less interesting to me, but I'm still curious of opinions on that as well. I had imagined it had something to do with initialization, though I'm not passing an initialization value to the vector constructor so I don't know.

Try it with vectors of other types... It might have to do with an optimization of the vector class for 32-bit ints. They did it for bools. — Patrick87, Jul 25 '11 at 23:04
possible duplicate of [Using arrays or std::vectors in C++, what's the performance gap?](http://stackoverflow.com/questions/381621/using-arrays-or-stdvectors-in-c-whats-the-performance-gap) — Mark B, Jul 25 '11 at 23:07
@Thomas: that's just sooo wrong. vectors internally use an allocated space used as an array. vector is an entirely different animal, a design mistake — Karoly Horvath, Jul 25 '11 at 23:11
@Mark B. In what manner would this be a duplicate? That is a general question asking about the difference between the two and suggesting that practically there is none. Mine is: I expect that there is not difference, but I'm seeing one. — Catskul, Jul 25 '11 at 23:13
How about comparing the assembly? That should give you a pretty good idea. — Kerrek SB, Jul 25 '11 at 23:20
@Kerrek Not a bad suggestion to help answer the user time issue, but doesn't help answer the system time question. — Catskul, Jul 25 '11 at 23:26
Catskul: Well, if the assebly happened to be identical, then you'd know that the discrepancies are because of external factors and not because of your code. — Kerrek SB, Jul 25 '11 at 23:31
@Catskul: you already know the answer: more page faults, you just need the explanation of why so, and you don't need to go down to assembly, the interpretation of high level C++ suffices: `std::vector` default initializes the elements on construction, while the `new` in your code does not. The array test is paying the cost of both containers, while the vector test has only its own cost. — David Rodríguez - dribeas, Jul 25 '11 at 23:32

score 11 · Accepted Answer · edited Jul 26 '11 at 01:13

The difference is not in the performance of the vector compared to the dynamic array, but in the number of accesses to memory that you perform.

Effectively, in the vector test you are re-accessing cached memory, while in the array version you don't. You pay the price of caching the vector version in either case.

In the vector test, you allocate the dynamic memory for the array but leave it untouched, with the memory never being touched there are no page faults due to that operation. The vector is created, initialized and then the second pass will be accessing already cached data (if the size fits the cache, if it does not, it will not be in cache, but the same cost will be incurred in both versions).

On the other hand when testing the array, the vector constructor initializes the elements, and that means that in the case were you are trying to profile the behavior of the array, the vector contents are walked over and the array elements are walked over. Double the number of memory accesses, page faults and memory used by the application.

You can try modifying the code so that the dynamic allocation is performed like this:

int * intArray = new int[ numInts ](); // note extra ()

Which will value-initialize the whole array, or you initialize the array contents else how. The results of running that modified version of the test should be similar.

It took me a few reads to understand, but re-benchmarking shows that you are correct. I hope you don't mind if I edit your answer to help clarify. — Catskul, Jul 25 '11 at 23:39
@Catskul: I don't mind, I wrote it late at night and the problem is complex enough. The point is: in the vector benchmark, you only touch the vector contents, in the array benchmark, you touch the array and the vector contents, so there is twice the amount of memory accessed with all issues that that brings up.. — David Rodríguez - dribeas, Jul 26 '11 at 07:09

score 3 · Answer 2 · answered Jul 25 '11 at 23:00

3

Have you run the test more than once? Benchmarking is a hard process, and has to rely on averages to get any kind of meaningful result; it's possible that at the time you were running your array benchmark a few CPU cycles were dedicated to something else, slowing it down. I would expect that given enough results they would be similar, as std::vector is written with a C-style array at it's core.

answered Jul 25 '11 at 23:00

Thomas Russell

5,870
4
33
68

1

Perhaps this should be a comment – Nick Rolando Jul 25 '11 at 23:02
@Nicklamort: Ahh okay, sorry, I was unsure whether to put as a comment or answer. – Thomas Russell Jul 25 '11 at 23:03
3

seems a valid answer to me although it could be a bit more firm. yes, they should have the same performance. – Karoly Horvath Jul 25 '11 at 23:08
They have been run 10s of times with the same result. Good thought though. I'll update to mention that. – Catskul Jul 25 '11 at 23:14

Why might std::vector be faster than a raw dynamically allocated array?

2 Answers2

Related