8

I tried 2 things: (pseudo code below)

int arr[10000];
for (int i = 0; i < 10000; i++)
{
   for (int j = 0; j < 10000; j++)
   {
       arr[j] = j;
   }
}

and

vector<int> arr(10000);
for (int i = 0; i < 10000; i++)
{
   for (int j = 0; j < 10000; j++)
   {
       arr[j] = j;
   }
}

I ran both the programs and timed it using the "time" shell command. Program 1 runs in 5 seconds, program 2 runs in 30 seconds. I ran both programs with compiler optimization turned on, and both programs ran in about the same time (0.38s). I am confused by these results. Can someone please explain to me why this is happening?

Thanks!

Aishwar
  • 9,284
  • 10
  • 59
  • 80
  • 2
    Do you mean that they took 5/30 seconds with optimizations *off*? – jalf Oct 23 '09 at 18:03
  • Keep in mind they aren't exactly equivalent. Vector allocates off the heap by default, but the array is on the stack. – GManNickG Oct 23 '09 at 18:07
  • Hi jalf, yes that was part of my question. I was also confused by how they executed in the same time after optimization. – Aishwar Oct 23 '09 at 18:41
  • Thanks for your answers everyone. This is a great community! I can't believe how fast my question got answered :) – Aishwar Oct 23 '09 at 18:42

6 Answers6

16

For the template, subscripting is done with operator[]. With optimization turned off, that'll usually be generated as a real function call, adding a lot of overhead to something as simple as subscripting into an array. When you turn on optimization, it's generated inline, removing that overhead.

Jerry Coffin
  • 476,176
  • 80
  • 629
  • 1,111
  • Hey Jerry, Thank you for your answer. It makes sense now. I guess with arrays, using [] doesn't make the call to operator[] since it a 'natural' (for a lack of a better word) type. Hence the 5/30. – Aishwar Oct 23 '09 at 18:45
  • Right -- for a built-in array, the subscripting code is almost certainly *always* generated inline. – Jerry Coffin Oct 23 '09 at 18:50
  • And this underscores how function call overhead can really add up. – Crashworks Oct 23 '09 at 20:08
  • 1
    @aip.cd.aish: The words "intrinsic" and "native" are used more commonly than "natural" to describe features built into a language. – Adisak Oct 26 '09 at 22:00
8

In debugging mode, implementations of std::vector provide a lot of run-time checking for ease of use. This checking is not available for native arrays. For example, in VC2008, if you compile your vector example in debugging mode, there will be range-checking even in the case of operator[].

Khaled Alshaya
  • 94,250
  • 39
  • 176
  • 234
5

If your non-optimized vector implementation is performing bounds checking, that would account for the discrepancy.

Dan Blanchard
  • 4,014
  • 1
  • 30
  • 26
4

These are good answers, but there's a quick way you can find out for yourself.

You're seeing a 6-to-1 difference in performance, right? Just run the slow one and hit the "pause" button. Then look at the call stack. The probability is 5 out of 6 (83%) that you will see exactly how it is spending those 25 extra seconds. Do it several times to get as much insight as you want.

For the optimized case, do the same with program 1. Since it is 13 times slower than the optimized program, you will see the reason why on each "pause", with probability 12/13 = 92%.

That is an application of this technique.

Community
  • 1
  • 1
Mike Dunlavey
  • 40,059
  • 14
  • 91
  • 135
  • 1
    I've seen many developer's minds blown when first seeing the good ole' pause technique, especially when they don't buy it and spend another serveral hours settiing-up a full profiling run only to discover "pause" already had it nailed. – DanO Oct 23 '09 at 21:52
  • @DanO: Yeah. I'm just trying to get the word out. There's no reason for people not to know it. – Mike Dunlavey Oct 24 '09 at 00:03
  • Hey Mike, thanks for the reply. I would like to try it, but I am using a command line environment and gcc to compile my programs (I am new to this environment). Is there someway I can pause it and view the stack trace there? Thanks :) – Aishwar Oct 24 '09 at 04:43
  • @aip.cd.aish: I wish it were easy. Basically you need to get handy with gdb. Then configure it so that Control-C or Control-Break interrupts it but doesn't abort it. Frankly, under a DOS-box in Windows XP I haven't succeeded, so mostly I use VC & VS. Under Unix, I used to use adb, and send it a kill signal from outside. I've asked on SO if there is a better way, but not really gotten a good answer. – Mike Dunlavey Oct 24 '09 at 13:12
  • ... There's a clumsy way that I've sometimes used. Write an empty function foo(){} and sprinkle calls to it throughout the code. Put a breakpoint in it, and disable that. Then, when it's running, see if you can enable the breakpoint at a random time. This method is not as precise, but better than nothing. – Mike Dunlavey Oct 24 '09 at 13:15
  • ... There's another way that could work in your case. In gdb, just single-step at the instruction level "si". You'll see exactly what it's doing (though you may not like it :). – Mike Dunlavey Oct 24 '09 at 13:18
  • @DanO: I'm reminded of an Edison story, where he was interviewing some engineers. He asked them to find the volume of a light bulb globe. Most of them set to work with calculus and geometry. One just filled it with water and poured it into a measuring cup! – Mike Dunlavey Oct 24 '09 at 13:44
  • @aip.cd.aish: Use process explorer from sysinternals/technet, on the process properties page | threads tab, pick a thread and hit the stack button. You get a snapshot from when you pushed the button. look at what is on top closing it and doining it again, you can easily get a new sample nearly every second. Of course without setting up symbols you only get offsets. but you also get to see into the system Dlls this way. – DanO Oct 28 '09 at 06:18
  • @DanO: I wasn't aware of the process explorer, thanks. It's nice that you can see into system Dlls, but what I look for is mid-stack calls in code that I have control over. Example: http://stackoverflow.com/questions/406760/whats-your-most-controversial-programming-opinion/1562802#1562802 – Mike Dunlavey Oct 28 '09 at 11:36
0

because when you write vector arr(10000); you create an object, call it's functions... when and it will be slower than whe you create int arr[10000];

Davit Siradeghyan
  • 6,053
  • 6
  • 24
  • 29
0

In your examples, the array is on the stack. Accessing data in the array involves accessing data on the stack. That's fast.

On the other hand, while the vector is on the stack, the data for a std::vector is allocated somewhere else (by default it's allocated on the heap via std::allocator). Accessing data in the vector involves accessing data on the heap. That's much slower than accessing data on the stack.

You get something for the performance penalty, though. std::vector is growable, while a regular array is not. Also, the size of a std::vector does not have to be compile time constant, while the size of an array on the stack does. A heap-allocated array (via operator new[]) does not have to be a compile-time constant. If you compare a heap allocated array with a std::vector you'll find the performance is much closer.

int* arr = new int[10000];
for (int i = 0; i < 10000; i++)
{
   for (int j = 0; j < 10000; j++)
   {
       arr[j] = j;
   }
}

delete[] arr; // std::vector does this for you
Max Lybbert
  • 19,717
  • 4
  • 46
  • 69