Vector initializing slower than array...why?

Question

I tried 2 things: (pseudo code below)

int arr[10000];
for (int i = 0; i < 10000; i++)
{
   for (int j = 0; j < 10000; j++)
   {
       arr[j] = j;
   }
}

and

vector<int> arr(10000);
for (int i = 0; i < 10000; i++)
{
   for (int j = 0; j < 10000; j++)
   {
       arr[j] = j;
   }
}

I ran both the programs and timed it using the "time" shell command. Program 1 runs in 5 seconds, program 2 runs in 30 seconds. I ran both programs with compiler optimization turned on, and both programs ran in about the same time (0.38s). I am confused by these results. Can someone please explain to me why this is happening?

Thanks!

Do you mean that they took 5/30 seconds with optimizations *off*? — jalf, Oct 23 '09 at 18:03
Keep in mind they aren't exactly equivalent. Vector allocates off the heap by default, but the array is on the stack. — GManNickG, Oct 23 '09 at 18:07
Hi jalf, yes that was part of my question. I was also confused by how they executed in the same time after optimization. — Aishwar, Oct 23 '09 at 18:41
Thanks for your answers everyone. This is a great community! I can't believe how fast my question got answered :) — Aishwar, Oct 23 '09 at 18:42

score 16 · Accepted Answer · answered Oct 23 '09 at 17:53

16

For the template, subscripting is done with operator[]. With optimization turned off, that'll usually be generated as a real function call, adding a lot of overhead to something as simple as subscripting into an array. When you turn on optimization, it's generated inline, removing that overhead.

answered Oct 23 '09 at 17:53

Jerry Coffin

476,176
80
629
1,111

Hey Jerry, Thank you for your answer. It makes sense now. I guess with arrays, using [] doesn't make the call to operator[] since it a 'natural' (for a lack of a better word) type. Hence the 5/30. – Aishwar Oct 23 '09 at 18:45
Right -- for a built-in array, the subscripting code is almost certainly *always* generated inline. – Jerry Coffin Oct 23 '09 at 18:50
And this underscores how function call overhead can really add up. – Crashworks Oct 23 '09 at 20:08
1

@aip.cd.aish: The words "intrinsic" and "native" are used more commonly than "natural" to describe features built into a language. – Adisak Oct 26 '09 at 22:00

Khaled Alshaya · Answer 2 · 2009-10-23T18:01:39.970

8

In debugging mode, implementations of std::vector provide a lot of run-time checking for ease of use. This checking is not available for native arrays. For example, in VC2008, if you compile your vector example in debugging mode, there will be range-checking even in the case of operator[].

edited Oct 23 '09 at 18:01

answered Oct 23 '09 at 17:55

Khaled Alshaya

94,250
39
176
234

score 5 · Answer 3 · answered Oct 23 '09 at 17:56

5

If your non-optimized vector implementation is performing bounds checking, that would account for the discrepancy.

answered Oct 23 '09 at 17:56

Dan Blanchard

4,014
1
30
26

++ My guess is you're right, but I wanted him (or her) to find out. – Mike Dunlavey Oct 24 '09 at 13:52

score 4 · Answer 4 · edited May 23 '17 at 11:49

4

These are good answers, but there's a quick way you can find out for yourself.

You're seeing a 6-to-1 difference in performance, right? Just run the slow one and hit the "pause" button. Then look at the call stack. The probability is 5 out of 6 (83%) that you will see exactly how it is spending those 25 extra seconds. Do it several times to get as much insight as you want.

For the optimized case, do the same with program 1. Since it is 13 times slower than the optimized program, you will see the reason why on each "pause", with probability 12/13 = 92%.

That is an application of this technique.

edited May 23 '17 at 11:49

Community

1
1

answered Oct 23 '09 at 20:04

Mike Dunlavey

40,059
14
91
135

1

I've seen many developer's minds blown when first seeing the good ole' pause technique, especially when they don't buy it and spend another serveral hours settiing-up a full profiling run only to discover "pause" already had it nailed. – DanO Oct 23 '09 at 21:52
@DanO: Yeah. I'm just trying to get the word out. There's no reason for people not to know it. – Mike Dunlavey Oct 24 '09 at 00:03
Hey Mike, thanks for the reply. I would like to try it, but I am using a command line environment and gcc to compile my programs (I am new to this environment). Is there someway I can pause it and view the stack trace there? Thanks :) – Aishwar Oct 24 '09 at 04:43
@aip.cd.aish: I wish it were easy. Basically you need to get handy with gdb. Then configure it so that Control-C or Control-Break interrupts it but doesn't abort it. Frankly, under a DOS-box in Windows XP I haven't succeeded, so mostly I use VC & VS. Under Unix, I used to use adb, and send it a kill signal from outside. I've asked on SO if there is a better way, but not really gotten a good answer. – Mike Dunlavey Oct 24 '09 at 13:12
... There's a clumsy way that I've sometimes used. Write an empty function foo(){} and sprinkle calls to it throughout the code. Put a breakpoint in it, and disable that. Then, when it's running, see if you can enable the breakpoint at a random time. This method is not as precise, but better than nothing. – Mike Dunlavey Oct 24 '09 at 13:15
... There's another way that could work in your case. In gdb, just single-step at the instruction level "si". You'll see exactly what it's doing (though you may not like it :). – Mike Dunlavey Oct 24 '09 at 13:18
@DanO: I'm reminded of an Edison story, where he was interviewing some engineers. He asked them to find the volume of a light bulb globe. Most of them set to work with calculus and geometry. One just filled it with water and poured it into a measuring cup! – Mike Dunlavey Oct 24 '09 at 13:44
@aip.cd.aish: Use process explorer from sysinternals/technet, on the process properties page | threads tab, pick a thread and hit the stack button. You get a snapshot from when you pushed the button. look at what is on top closing it and doining it again, you can easily get a new sample nearly every second. Of course without setting up symbols you only get offsets. but you also get to see into the system Dlls this way. – DanO Oct 28 '09 at 06:18
@DanO: I wasn't aware of the process explorer, thanks. It's nice that you can see into system Dlls, but what I look for is mid-stack calls in code that I have control over. Example: http://stackoverflow.com/questions/406760/whats-your-most-controversial-programming-opinion/1562802#1562802 – Mike Dunlavey Oct 28 '09 at 11:36

score 0 · Answer 5 · answered Oct 23 '09 at 19:14

0

because when you write vector arr(10000); you create an object, call it's functions... when and it will be slower than whe you create int arr[10000];

answered Oct 23 '09 at 19:14

Davit Siradeghyan

6,053
6
24
29

score 0 · Answer 6 · answered Oct 23 '09 at 19:57

In your examples, the array is on the stack. Accessing data in the array involves accessing data on the stack. That's fast.

On the other hand, while the vector is on the stack, the data for a std::vector is allocated somewhere else (by default it's allocated on the heap via std::allocator). Accessing data in the vector involves accessing data on the heap. That's much slower than accessing data on the stack.

You get something for the performance penalty, though. std::vector is growable, while a regular array is not. Also, the size of a std::vector does not have to be compile time constant, while the size of an array on the stack does. A heap-allocated array (via operator new[]) does not have to be a compile-time constant. If you compare a heap allocated array with a std::vector you'll find the performance is much closer.

int* arr = new int[10000];
for (int i = 0; i < 10000; i++)
{
   for (int j = 0; j < 10000; j++)
   {
       arr[j] = j;
   }
}

delete[] arr; // std::vector does this for you

Vector initializing slower than array...why?

6 Answers6