4

I was under the impression that std::vector is just a thin wrapper around dynamic arrays, and their performance should be comparable. A search on the Internet and stackoverflow itself gives the same answer as well. However, when I test it myself, I found a huge difference. The code is below. I tried both VC++ 2012 (release build) and MinGW with optimization flag -O2.

The time for new, malloc and calloc is around 0.005 sec, while std::vector takes 0.9 sec on both compilers. Is std::vector inherently slow or my code has some serious flaws?

#define _SCL_SECURE 0
#include <stdio.h>
#include <stdlib.h>
#include <vector>
#include <time.h>

struct timer
{
    clock_t start;
    timer()
    {
        start=clock();
    }
    ~timer()
    {
        double t=static_cast<double>(clock()-start)/CLOCKS_PER_SEC;
        printf("%lf\n", t);
    }
};

int main()
{
    const size_t len=(1<<28);   
    {
        timer t;
        int *d=new int[len];
        printf("%d\n",d[len-1]);//prevent compiler from optimizing away 
                                //useless allocation and deallocation
        delete[] d;
    }
    {
        timer t;
        int *d=(int *)malloc(len*sizeof(int));
        printf("%d\n", d[len-1]);
        free(d);
    }

    {
        timer t;
        std::vector<int> d(len);
        printf("%d\n", d[len-1]);
    }
    {
        timer t;
        int *d=(int *)calloc(len, sizeof(int));
        printf("%d\n", d[len-1]);
        free(d);
    }

    return 0;
}

EDIT 1

Per suggestion, I test for additional ways of creating dynamic array

  • new: 0.005
  • malloc: 0.005
  • calloc: 0.005
  • malloc+memset: 1.244
  • vector(len): 1.231
  • vector(len, 0): 1.234
  • vector.reserve(len): 0.005

So indeed the offender is the zero initialization instead allocation or deallocation. This means that if one needs an zero-initialized array, vector is not the way to go, even if it has a constructor that default initialized all elements.

Besides, this is not just something that pop out of my head. My final project for a class is graded on the time spent, and I used a several vectors to allocate a huge memory buffer instead of new for exception safety and because our textbook encourages use of STL. Not until today did I realize that I lost some points because of this. Sad day.

EDIT 2

Today I tried Clang 3.2 on Ubuntu 13.04 x64, and std::vector no longer takes that time to initialize. In fact, vector is now the fastest! Maybe this is a compiler optimization problem after all, not inherently in design of the std::vector.

Siyuan Ren
  • 7,573
  • 6
  • 47
  • 61
  • 2
    shouldn't you be running the tests through some 1000's of iterations (if not more)? – Son-Huy Pham Jun 27 '13 at 15:30
  • 1
    It's [even worse](http://coliru.stacked-crooked.com/view?id=796b97719ad2906d4a7df8d7faa1af8b-3b440a87a52fe2ae7c853c82f4c5144f) on GCC 4.8.1 with -O3. 0.000000 for everything except the vector, which is 4.09 (and a decently lower time the second time run, which is the link). – chris Jun 27 '13 at 15:32
  • 2
    You should be comparing code that does the same (i.e. did you read vector docs?). You should be comparing code that doesn't have undefined behaviour. – R. Martinho Fernandes Jun 27 '13 at 15:38
  • 4
    The important difference between the `vector` and the other code is that the `vector` is zero-initialising the memory, where the other code isn't (apart from the `calloc`). If you replace the `new` with `new int[len]()`, it takes as long as the vector. I assume the `calloc` is optimised in some way (for example, it could request zeroed memory, rather than explicitly zeroing the memory itself). – Mankarse Jun 27 '13 at 15:41
  • The allocation of the array is generally the least recently used operation when working with it. – Chad Jun 27 '13 at 15:41
  • What do you think `#define _SCL_SECURE 0` does? – James McNellis Jun 27 '13 at 15:50
  • `calloc` is in libc and has internal knowledge about the allocation that lets it skip zeroing in some cases. `std::vector` doesn't have that knowledge. And although a compiler could possibly replace malloc+memset with calloc, it can't do the same with new+memset. – Marc Glisse Jun 27 '13 at 16:04

2 Answers2

4

I believe this is due to allocation of std::vector invoking a copy constructor on each element, where malloc just returns uninitialized memory.

std::vector<int> x(100); 

is effectively the same as:

std::vector<int> y(100, int()); 

See the documentation on the constructor for std::vector http://en.cppreference.com/w/cpp/container/vector/vector

I often will do something like this:

std::vector<int> x; 
x.reserve(100);
// Insert items into x via x.push_back()
JoeD
  • 171
  • 6
  • 7
    In C++11 (and Visual C++ 2012), `std::vector x(100);` is _not_ the same as `std::vector y(100, T());`. The former default constructs 100 elements; the latter copy constructs 100 elements from the given `T()` argument. (The _effect_ is the same for `int`, but not for all types.) – James McNellis Jun 27 '13 at 15:47
  • is `new` for primitive types the same as `malloc` or `calloc`? – Siyuan Ren Jun 29 '13 at 10:06
  • @JamesMcNellis: Now that I have more understanding of the quirks of C++, I find a mistake in your wording. In C++11, `std::vector` *value* initialize its contents, not *default* initialize. – Siyuan Ren Apr 19 '14 at 08:29
3
printf("%d\n",d[len-1]);//prevent compiler from optimizing away 

This line reads from an uninitialised object. Instead of preventing the compiler from optimising things, it gives it leeway to do whatever it wants (i.e. the behaviour of the program is undefined).

Let's pretend we fixed that somehow and the behaviour of the program is now well-defined (maybe we added a line initialising d[len-1]).

std::vector<int> d(len);

This line initialises len objects with the value 0. This other line doesn't:

int *d=new int[len];

The only other line that does results in len objects with the value 0 is this one:

int *d=(int *)calloc(len, sizeof(int));

The only conclusion you can draw from the benchmark related to allocation and deallocation performance is that the benchmark is not fit for drawing conclusions related to allocation and deallocation performance.

R. Martinho Fernandes
  • 228,013
  • 71
  • 433
  • 510
  • I know it reads from an unitialized memory. But I don't actually need the value (which may be whatever bytes left over). What `undefined` behavior could it be other than printing out a random value other than zero? – Siyuan Ren Jun 27 '13 at 16:54
  • 4
    @C.R. who the hell knows. What do you think "undefined" means? It doesn't mean "defined to be whatever bytes were left over". It means "not defined". – R. Martinho Fernandes Jun 27 '13 at 17:01
  • 3
    @R.MartinhoFernandes As far as I know, reading from uninitialized memory is _not_ undefined behaviour. Correct me if I'm wrong, but I've always assumed that reading from an uninitialized block in memory is defined to return any value (a random sequence of bits, if you will). – yyny Aug 22 '16 at 21:04
  • In this one example it's **not** UB because this variable can not be optimized to be `register`: https://stackoverflow.com/q/11962457/207791 – Victor Sergienko Apr 12 '18 at 23:08
  • @Victor It's UB because the standard says it's UB. The variable can be "optimized to be `register`" because it's UB. That's what UB means: **anything** can happen, including the variable "optimized to be `register`". – R. Martinho Fernandes Apr 14 '18 at 19:38
  • @Victor also, you're reading an answer for a different language. – R. Martinho Fernandes Apr 14 '18 at 19:44
  • @R.MartinhoFernandes Thanks, you're right, that's C. I withdraw my comment. – Victor Sergienko Apr 15 '18 at 05:05