
I have a nested for-loop structure and right now I am re-declaring the vector at the start of each iteration:

void function(int n1, int n2, int bound /*, etc. */) {

    for (int i = 0; i < bound; i++) {
        vector< vector<long long> > vec(n1, vector<long long>(n2));
        // about three more for-loops here
    }
}

This allows me to "start fresh" each iteration, which works great because my internal operations are largely of the form vec[a][b] += some value. But I worry that it's slow for large n1 or large n2. I don't know the underlying architecture of vectors/arrays/etc., so I am not sure what the fastest way is to handle this situation. Should I use an array instead? Should I clear it differently? Should I handle the logic differently altogether?

EDIT: The vector's size technically does not change each iteration (but it may change based on function parameters). I'm simply trying to clear it/etc so the program is as fast as humanly possible given all other circumstances.

EDIT:

My results of different methods:

Timings (for a sample set of data):
redeclaring vector method: 111623 ms
clearing/resizing method: 126451 ms
looping/setting to 0 method: 88686 ms
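
The "looping/setting to 0" method is just a manual re-zeroing pass over a vector declared once outside the loop; roughly:

// Declared once, outside the loop:
vector< vector<long long> > vec(n1, vector<long long>(n2));

for (int i = 0; i < bound; i++) {
    // zero out in place instead of redeclaring or resizing
    for (int a = 0; a < n1; a++)
        for (int b = 0; b < n2; b++)
            vec[a][b] = 0;

    // ... the other for-loops here ...
}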
John Smith
  • post the 'three more for loops', the type of data structure you should use depends on what you plan on doing with it –  Apr 13 '12 at 13:22
  • AFAIK the fastest way to clear a vector is to std::swap it (`std::swap(vec, vector())`). Could be wrong though. IMHO, without knowing more about your algorithm it's hard to say, as Justin says. – Robinson Apr 13 '12 at 13:22
  • How large are `bound`, `n1` and `n2` in relation to each other? – leftaroundabout Apr 13 '12 at 13:23
  • There's another i-to-bound for loop and then a few other smaller ones that are trivial (in other words, it's O(n^2)). n1 is in the millions and bound is around there, too. – John Smith Apr 13 '12 at 13:25

8 Answers


I have a clear preference for small scopes (i.e. declaring the variable in the innermost loop if it’s only used there), but for large sizes this could cause a lot of allocations.

So if this loop is a performance problem, try declaring the variable outside the loop and merely clearing it inside the loop – however, this is only advantageous if the (reserved) size of the vector stays identical. If you are resizing the vector, then you get reallocations anyway.

Don’t use a raw array – it gives you no advantage, only trouble.
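
A minimal sketch of the hoisted version (assuming, as in the question, that the sizes come in as function parameters):

#include <vector>

void function(int n1, int n2, int bound)
{
    // Allocate once, outside the loop.
    std::vector<std::vector<long long>> vec(n1, std::vector<long long>(n2));

    for (int i = 0; i < bound; ++i)
    {
        // Reset each inner vector without freeing its buffer:
        // clear() keeps the capacity, and resize() value-initialises
        // the elements back to zero.
        for (int j = 0; j < n1; ++j)
        {
            vec[j].clear();
            vec[j].resize(n2);
        }

        // ... the inner loops doing vec[a][b] += ... go here ...
    }
}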

Konrad Rudolph
  • A raw array is indeed bad, but it _does_ give the advantage of no heap allocations. `std::array` is often a great compromise. – leftaroundabout Apr 13 '12 at 13:25
  • @leftaroundabout You are talking about *statically* allocated arrays, that’s completely different. And yes, using those would have an advantage. I’m assuming that the OP is talking about dynamic sizes and there a raw array also uses heap allocations. – Konrad Rudolph Apr 13 '12 at 13:26
  • Actually the vector's size doesn't change each iteration, editing OP – John Smith Apr 13 '12 at 13:30
  • Would it be better to use an array since I know what n1 and n2 will be from the outset of the function? – John Smith Apr 13 '12 at 14:01
  • @John No. C arrays are *not* more efficient than vectors. The only difference is when the size is known at compile time. In that case, a `std::array` is more efficient than a `std::vector`. – Konrad Rudolph Apr 13 '12 at 14:05
  • Maybe I should mess around with this then (I only call my function a few times)... is there a good way to clear it? – John Smith Apr 13 '12 at 14:07
  • Also, I don't understand what you mean by "clearing it inside the loop" being an advantage if the size stays identical. Clearing the vector seems to kill off the size altogether. – John Smith Apr 13 '12 at 14:11
  • @John `clear` kills the size but doesn’t actually free up any memory. Hence, reallocation doesn’t happen. `clear` followed by `resize` reinitialises the vector’s elements without costly heap allocations. – Konrad Rudolph Apr 13 '12 at 14:19
  • So when I redeclare my vector each iteration, is this costly? Or is the clear/resize approach better? How do I properly resize? – John Smith Apr 13 '12 at 14:22
  • Just timed both approaches (redeclare vs. clear/resize) and it seems like redeclaring wins in terms of speed – John Smith Apr 13 '12 at 14:39
  • @John I have to confess that this surprises me somewhat but it’s not inconceivable. – Konrad Rudolph Apr 13 '12 at 14:44
  • @JohnSmith did you `clear` the _whole_ outer `std::vector`? That calls all the destructors of the inner `std::vector`s, so most of the reallocations still take place with this approach. Only by calling `clear` on each of the _elements_ do you get the no-reallocation behaviour. – leftaroundabout Apr 13 '12 at 15:34

Here is some code that tests a few different methods.

#include <chrono>
#include <iostream>
#include <vector>

int main()
{
  typedef std::chrono::high_resolution_clock clock;

  unsigned n1 = 1000;
  unsigned n2 = 1000;

  // Original method
  {
    auto start = clock::now();
    for (unsigned i = 0; i < 10000; ++i)
    {
      std::vector<std::vector<long long>> vec(n1, std::vector<long long>(n2));
      // vec is initialized to zero already

      // do stuff
    }
    auto elapsed_time = clock::now() - start;

    std::cout << elapsed_time.count() << std::endl;
  }


  // reinitialize values to zero at every pass in the loop
  {
    auto start = clock::now();
    std::vector<std::vector<long long>> vec(n1, std::vector<long long>(n2));
    for (unsigned i = 0; i < 10000; ++i)
    {
      // initialize vec to zero at the start of every loop
      for (unsigned j = 0; j < n1; ++j)
        for (unsigned k = 0; k < n2; ++k)
            vec[j][k] = 0;

      // do stuff
    }
    auto elapsed_time = clock::now() - start;

    std::cout << elapsed_time.count() << std::endl;
  }

  // clearing the vector this way is not optimal since it will destruct the
  // inner vectors
  {
    auto start = clock::now();
    std::vector<std::vector<long long>> vec(n1, std::vector<long long>(n2));
    for (unsigned i = 0; i < 10000; ++i)
    {
      vec.clear();
      vec.resize(n1, std::vector<long long>(n2));

      // do stuff
    }
    auto elapsed_time = clock::now() - start;

    std::cout << elapsed_time.count() << std::endl;
  }

  // equivalent to the second method from above
  // no performance penalty
  {
    auto start = clock::now();
    std::vector<std::vector<long long>> vec(n1, std::vector<long long>(n2));
    for (unsigned i = 0; i < 10000; ++i)
    {
      for (unsigned j = 0; j < n1; ++j)
      {
        vec[j].clear();
        vec[j].resize(n2);
      }

      // do stuff
    }
    auto elapsed_time = clock::now() - start;

    std::cout << elapsed_time.count() << std::endl;
  }
}

Edit: I've updated the code to make a fairer comparison between the methods. Edit 2: Cleaned up the code a bit; methods 2 or 4 are the way to go.

Here are the timings of the above four methods on my computer:

16327389
15216024
16371469
15279471

The point is that you should try out different methods and profile your code.

James Custer
  • I need to reset the values each iteration though somehow, and quickly, since my bound is large – John Smith Apr 13 '12 at 13:51
  • Given the code you posted originally, you have to be doing that somewhere inside the loops anyway, correct? I edited my answer to show how vec might be initialized somewhere in your inner loop. – James Custer Apr 13 '12 at 13:53
  • @JohnSmith I've added some timings to the various methods in my answer. – James Custer Apr 13 '12 at 14:24
  • I do a lot of adding-to-element operations in my inner loops and so clearing is a necessity in this case -- I can't use the second method. – John Smith Apr 13 '12 at 14:31
  • @JohnSmith Do you mean you use `vec.push_back(...)`? I don't think we'll be able to fully answer your question without knowing what you do in your inner loops. – James Custer Apr 13 '12 at 14:34
  • I do a lot of vec[a][b] += vec[a-some val][b/2]*...etc type operations (dynamic programming). In other words, processes that only work assuming that the vector's elements are 0'd out at the start. I'm not changing the vector's size at any point; just the values. – John Smith Apr 13 '12 at 14:38
  • I mean my code *works* -- redeclaring it, for whatever purposes it's doing, is acting as if it's setting everything to 0 off the bat. Let me try your timing approach by manually setting everything to 0 without any redeclarations/resizings, one sec. – John Smith Apr 13 '12 at 14:44
  • Ooohhh, this is faster. Posting timings above, one moment – John Smith Apr 13 '12 at 14:48
  • @James Custer `std::vector<std::vector<long long>> vec(n1, std::vector<long long>(n2));` *DOES* in fact initialize the values to zero, using value initialization. – Mark B Apr 13 '12 at 15:11
  • @MarkB Yes, you are correct. My mistake. I'm used to dealing with another type which doesn't do initialization on its values. – James Custer Apr 13 '12 at 15:20
  • Aren’t all your `resize` and `clear` operations in the wrong order? *First* `clear`, *then* `resize`. I don’t think this will affect performance but it’s quite crucial if you actually need to work with the data. ;-) – Also, your fourth code only needs one loop, and this is indeed what I would suggest using. I find it slightly clearer than manually zeroing the elements. – Konrad Rudolph Apr 13 '12 at 17:44
  • @KonradRudolph I was mimicking someone else's code where they were clearing at the bottom of the loop. You're right that the method is now a little cleaner than initializing each element to zero, and there is no performance penalty for clearing/resizing compared to simply reinitializing. I've edited my code to reflect these changes. – James Custer Apr 13 '12 at 18:17
  • Ah, that made sense. Anyway, now it’s awesome. – Konrad Rudolph Apr 13 '12 at 19:28

When choosing a container, I usually use this diagram to help me:

[Flowchart for choosing a C++ standard-library container; original image and source link not preserved]


Other than that, as previously posted: if this is causing performance problems, declare the container outside of the for loop and just clear it at the start of each iteration.

Community
  • A nice guide, but in this case the OP is already at the best node shown for his problem (`std::vector`), which may however not actually be the best option there is. – leftaroundabout Apr 13 '12 at 13:34
  • I don’t see how this diagram is relevant to the question. – Konrad Rudolph Apr 13 '12 at 13:35
  • He has not indicated what is happening in the contents of his for loop, so how could you possibly know that vector is the best node? It's relevant because, as he states, "I don't know the underlying architecture of vectors/arrays/etc" – Apr 13 '12 at 13:37
  • I don't agree with this because of: [Why you shouldn't use set (and what you should use instead)](http://lafstern.org/matt/col1.pdf) – bames53 Apr 13 '12 at 13:56
  • @bames53 you are absolutely correct! Unfortunately this diagram only maps containers within the STL; this guide is a simple reference for making a 'better' decision, not the 'best possible' decision. – Apr 13 '12 at 14:24
  • Agree with KR, think the performance of arrays vs vector is not addressed here. – ManiP Apr 13 '12 at 14:36
  • Even with my caveat mentioned above, and even with bames’, I fail to understand the downvotes on this answer … – Konrad Rudolph Apr 13 '12 at 14:45
  • @KonradRudolph I think 'not relevant' is sufficient reason to downvote. Even if the information is good it's not an answer. – bames53 Apr 13 '12 at 15:15
  • Explain to me how this is 'not relevant'... He's worried about the performance degradation of re-declaring a vector, and I get that (if you read the "Other than that" under the diagram you will see I have addressed it); BUT choosing the right STL container for the rest of the code will have a much greater impact on performance than mere re-declaration, and as he wrote "I don't know the underlying architecture of vectors/arrays/etc", that makes the diagram completely relevant. – Apr 13 '12 at 15:25
  • Is there a similar diagram for C#/.Net? – asaf92 Aug 26 '19 at 09:47

Why not something like this:

{
    vector< vector<long long> > vec(n1, vector<long long>(n2));

    for (int i=0; i<bound; i++){

         //about three more for-loops here

         vec.clear(); // careful: this also sets vec.size() to 0 (see comments below)
    }
}

Edit: added scope braces ;-)

Nicolas Repiquet
  • See my answer. This has the (admittedly, relatively small) problem of scope leak. In a moderately complicated code, keeping track of variables (their state, and what they do) is vastly easier the smaller their scope is. I have written code where decreasing a variable’s scope has made a big difference in debuggability and understandability. – Konrad Rudolph Apr 13 '12 at 13:34
  • For some reason my program crashes when I try this? – John Smith Apr 13 '12 at 13:49
  • That's because when you clear the vector, the size of the vector is also cleared. That is, `vec.size() == 0`. If you clear the vector you must resize it. – James Custer Apr 13 '12 at 13:52
  • Is there an efficient way to clear it without resizing it? Or is simply redeclaring the vector like I've been doing the best I can do? – John Smith Apr 13 '12 at 13:56
  • @John: Clearing erases all the existing elements, so it would make no sense for the size to be anything but 0 since all the contained objects were already destroyed! The right course of action would be `clear` followed by `resize`, or just recreating the vector. – danielkza Apr 16 '12 at 16:56
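
Per the comments, a corrected sketch would pair the `clear` with a `resize` (though, as noted under James Custer's answer, clearing the whole outer vector destroys the inner vectors, so their reallocation still happens):

{
    vector< vector<long long> > vec(n1, vector<long long>(n2));

    for (int i=0; i<bound; i++){

         //about three more for-loops here

         vec.clear();                            // size is now 0
         vec.resize(n1, vector<long long>(n2));  // back to n1 zeroed rows
    }
}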

In addition to the previous comments:

If you use Robinson's swap method, you could go even faster by handling that swap asynchronously.
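
For reference, a minimal sketch of that swap idiom (the function and variable names here are illustrative; whether deallocating on a background thread actually pays off would need profiling):

#include <cstddef>
#include <utility>
#include <vector>

// Reset vec to n1 x n2 zeroes by swapping in a freshly allocated vector.
// The old buffers end up in 'fresh' and are freed when it goes out of
// scope -- that destruction is the part that could be moved off-thread.
void reset_by_swap(std::vector<std::vector<long long>>& vec,
                   std::size_t n1, std::size_t n2)
{
    std::vector<std::vector<long long>> fresh(n1, std::vector<long long>(n2));
    std::swap(vec, fresh);
}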

Skippy Fastol

Well, if you are really concerned about performance (and you know the sizes n1 and n2 beforehand) but don't want to use a C-style array, std::array may be your friend.

EDIT: Given your edit, it seems an std::array isn't an appropriate substitute, since while the vector size does not change each iteration, it still isn't known before compilation.
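
For completeness, a sketch of what the compile-time version would look like; `N1` and `N2` here are hypothetical placeholder constants:

#include <array>
#include <cstddef>

// Hypothetical compile-time sizes; the question's n1/n2 are runtime
// values, so this does not apply there.
const std::size_t N1 = 1000;
const std::size_t N2 = 1000;

// Zero-initialised and free of heap allocations -- but at these sizes
// it is ~8 MB, far too large for the stack, so keep it static.
static std::array<std::array<long long, N2>, N1> arr = {};

// To re-zero between iterations:
//   for (std::size_t j = 0; j < N1; ++j) arr[j].fill(0);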

Alex Z
  • I am very much concerned about performance; these loops are large and I must optimize for speed – John Smith Apr 13 '12 at 13:33
  • `clear` won't do the trick on its own though. You'd have to `resize` back out to the needed dimensions again, and at that point you might as well do it the obvious way that the OP already uses. – Mark B Apr 13 '12 at 14:27

Since you have to reset the vector values to 0 each iteration, in practical terms this question boils down to: "is the cost of allocating and deallocating the memory for the vector cheap or expensive compared to the computations inside the loops?"

Assuming the computations are the expensive part of the algorithm, the way you've coded it is clear and concise, shows the intended scope, and is probably just as fast as alternate approaches.

If, however, your computations and updates are extremely fast and the allocation/deallocation of the vector is relatively expensive, you could use std::fill to fill zeroes back into the array at the end/beginning of each iteration through the loop.
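
A minimal sketch of that `std::fill` approach on the question's nested vector (`zero_out` is an illustrative name):

#include <algorithm>
#include <cstddef>
#include <vector>

// Re-zero every element in place: no allocation or deallocation,
// the existing buffers are simply overwritten.
void zero_out(std::vector<std::vector<long long> >& vec)
{
    for (std::size_t j = 0; j < vec.size(); ++j)
        std::fill(vec[j].begin(), vec[j].end(), 0LL);
}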

Of course the only way to know for sure is to measure with a profiler. I suspect you'll find that the approach you took won't show up as a hotspot of any sort and you should leave the obvious code in place.

Mark B

The overhead of using a vector vs. an array is minor, especially when you are getting a lot of useful functionality from the vector. Internally, a vector allocates an array, so vector is the way to go.

ManiP
  • The overhead of _operating_ on `std::vector`s is minor, but they have quite a lot of overhead in _allocation_ because they reside on the heap. – leftaroundabout Apr 13 '12 at 13:27
  • @leftaroundabout I am referring to dynamically allocated arrays, which are allocated on the heap. The function is being passed both sizes as variable values (n1, n2). – ManiP Apr 13 '12 at 13:52