8

Very much like this question, except that instead vector<int> I have vector<struct myType>.

If I want to reset (or for that matter, set to some value) myType.myVar for every element in the vector, what's the most efficient method?

Right now I'm iterating through:

for(int i=0; i<myVec.size(); i++) myVec.at(i).myVar = 0;

But since vectors are guaranteed to be stored contiguously, there's surely a better way?

Community
  • 1
  • 1
OJFord
  • 10,522
  • 8
  • 64
  • 98
  • 2
    Per this answer http://stackoverflow.com/a/849190/16429, C++ vectors are, in fact, stored contiguously. – Clay Mar 27 '14 at 19:38
  • @Clay I read an answer that they were not (cannot now find) - but that has far more upvotes. Will edit accordingly, thanks. – OJFord Mar 27 '14 at 20:20

6 Answers6

7

Resetting will need to traverse every element of the vector, so it will need at least O(n) complexity. Your current algorithm takes O(n).

In this particular case you can use operator[] instead of at (that might throw an exception). But I doubt that's the bottleneck of your application.

On this note you should probably use std::fill:

std::fill(myVec.begin(), myVec.end(), 0);

But unless you want to go byte level and set a chunk of memory to 0, which will not only cause you headaches but will also make you lose portability in most cases, there's nothing to improve here.

Shoe
  • 74,840
  • 36
  • 166
  • 272
  • Surely `myVec.resize(10, {0})` is O(1)? I wondered if there was a similar way, but keeping values of other variables in the struct. Thanks though! – OJFord Mar 27 '14 at 19:30
  • @OllieFord. No, the complexity of `resize` is [linear](http://en.cppreference.com/w/cpp/container/vector/resize) as well. – Shoe Mar 27 '14 at 19:42
  • Okay, thanks. I have a follow-up question on whether I can merge this into the 'last iteration' of `std::sort`, to reset the values it sorted based on. Please advise whether I should post it as a new question, or edit in the detail? – OJFord Mar 27 '14 at 20:44
  • @OllieFord, what do you mean by "merge"? – Shoe Mar 27 '14 at 21:06
  • Bad choice of word. After I'm done sorting based on `myVal`, I want to set `myVal = 0`. Can I somehow *do* that on last run of `std::sort`? – OJFord Mar 27 '14 at 21:13
  • @OllieFord, you mean `myVar` right? Not `myVal`. But, no, just do `std::sort` and then `std::fill`. The complexity of both operations together is still linear. – Shoe Mar 27 '14 at 21:15
6

Instead of the below code

for(int i=0; i<myVec.size(); i++) myVec.at(i).myVar = 0;

do it as follows:

size_t sz = myVec.size();
for(int i=0; i<sz; ++i) myVec[i].myVar = 0;

As the "at" method internally checks whether index is out of range or not. But as your loop index is taking care(myVec.size()), you can avoid the extra check. Otherwise this is the fastest way to do it.

EDIT

In addition to that, we can store the size() of vector prior to executing the for loop.This would ensure that there is no further call of method size() inside the for loop.

Mantosh Kumar
  • 5,659
  • 3
  • 24
  • 48
  • Nice! Thanks for the extra info on `.at()` - I only used as I know it's 'safer'. Now I know (at least partly) why too :) – OJFord Mar 27 '14 at 19:31
  • Since we are so fond of micro-optimizations, you should also store `myVec.size()` in a variable, instead of performing a function call every loop. – Shoe Mar 27 '14 at 19:36
  • 1
    Oh, and you should probably use `++i` instead of `i++` too. You know. Saving a copy of an `std::size_t` integer can be useful. – Shoe Mar 27 '14 at 19:37
  • @OllieFord: when at(), gets called it calls the size() method and then check whether index is less than size or not. SO in a way this has a very small overhead of calling size() while accessing each element. The operator[] access does just pointer addition and return. I guess this helps. – Mantosh Kumar Mar 27 '14 at 19:39
  • 1
    @tmp: It's calling size() explicitly in each iteration. Call it once before the loop starts and assign it to a varaible. – Kevin Le - Khnle Mar 27 '14 at 19:50
  • @Khnle-KevinLe: Thanks for this pointing out.I have updated... Btw i was referring the size() call inside the at() method in my above comment. – Mantosh Kumar Mar 27 '14 at 20:01
2

One of the fastest ways would be to perform loop unwinding and break the speed limit posed by conventional for loops that cause great cash spills. In your case, as it's a run time thing, there's no way to apply template metaprogramming so a variation on good old Duff's device would do the trick

#include <iostream>
#include <vector>

using namespace std;

struct myStruct {
    int a;
    double b;
};

int main() 
{
    std::vector<myStruct> mV(20);

    double val(20);            // the new value you want to reset to
    int n = (mV.size()+7) / 8; // 8 is the size of the block unroll
    auto to = mV.begin();      // 

    switch(mV.size()%8)
    {
       case 0: do { (*to++).b = val;
       case 7:      (*to++).b = val;
       case 6:      (*to++).b = val;
       case 5:      (*to++).b = val;
       case 4:      (*to++).b = val;
       case 3:      (*to++).b = val;
       case 2:      (*to++).b = val;
       case 1:      (*to++).b = val;
        } while (--n>0);
    }

    // just printing to verify that the value is set
    for (auto i : mV) std::cout << i.b << std::endl;

    return 0;
}

Here I choose to perform an 8-block unwind, to reset the value (let's say) b of a myStruct structure. The block size can be tweaked and loops are effectively unrolled. Remember this is the underlying technique in memcpy and one of the optimizations (loop unrolling in general) a compiler will attempt (actually they're quite good at this so we might as well let them do their job).

Nikos Athanasiou
  • 29,616
  • 15
  • 87
  • 153
  • Duff's device doesn't guarantee that the performance will be faster though. In many cases it has been observed reducing performance. – Veritas Mar 27 '14 at 20:46
1

In addition to what's been said before, you should consider that if you turn on optimizations, the compiler will likely perform loop-unrolling which will make the loop itself faster.

edmz
  • 8,220
  • 2
  • 26
  • 45
1

Also pre increment ++i takes few instructions less than post increment i++. Explanation here

AdUki
  • 392
  • 2
  • 8
1

Beware of spending a lot of time thinking about optimization details which the compiler will just take care of for you.

Here are four implementations of what I understand the OP to be, along with the code generated using gcc 4.8 with --std=c++11 -O3 -S

Declarations:

#include <algorithm>
#include <vector>

struct T {
  int irrelevant;
  int relevant;
  double trailing;
};

Explicit loop implementations, roughly from answers and comments provided to OP. Both produced identical machine code, aside from labels.

                                                        .cfi_startproc
                                                        movq    (%rdi), %rsi
void clear_relevant(std::vector<T>* vecp) {             movq    8(%rdi), %rcx
  for(unsigned i=0; i<vecp->size(); i++) {              xorl    %edx, %edx
    vecp->at(i).relevant = 0;                           xorl    %eax, %eax
  }                                                     subq    %rsi, %rcx
}                                                       sarq    $4, %rcx
                                                        testq   %rcx, %rcx
                                                        je      .L1
                                                        .p2align 4,,10
                                                        .p2align 3
                                                .L5:
void clear_relevant2(std::vector<T>* vecp) {            salq    $4, %rdx
  std::vector<T>& vec = *vecp;                          addl    $1, %eax
  auto s = vec.size();                                  movl    $0, 4(%rsi,%rdx)
  for (unsigned i = 0; i < s; ++i) {                    movl    %eax, %edx
    vec[i].relevant = 0;                                cmpq    %rcx, %rdx
  }                                                     jb      .L5
}                                               .L1:
                                                        rep ret
                                                        .cfi_endproc

Two other versions, one using std::for_each and the other one using the range for syntax. Here there is a subtle difference in the code for the two versions (other than the labels):

                                                        .cfi_startproc
                                                        movq    8(%rdi), %rdx
                                                        movq    (%rdi), %rax
                                                        cmpq    %rax, %rdx
                                                        je      .L17
void clear_relevant3(std::vector<T>* vecp) {            .p2align 4,,10
  for (auto& p : *vecp) p.relevant = 0;                 .p2align 3
}                                               .L21:
                                                        movl    $0, 4(%rax)
                                                        addq    $16, %rax
                                                        cmpq    %rax, %rdx
                                                        jne     .L21
                                                .L17:
                                                        rep ret
                                                        .cfi_endproc


                                                        .cfi_startproc
                                                        movq    8(%rdi), %rdx
                                                        movq    (%rdi), %rax
                                                        cmpq    %rdx, %rax
void clear_relevant4(std::vector<T>* vecp) {            je      .L12
  std::for_each(vecp->begin(), vecp->end(),             .p2align 4,,10
                [](T& o){o.relevant=0;});               .p2align 3
}                                               .L16:
                                                        movl    $0, 4(%rax)
                                                        addq    $16, %rax
                                                        cmpq    %rax, %rdx
                                                        jne     .L16
                                                .L12:
                                                        rep ret
                                                        .cfi_endproc
rici
  • 234,347
  • 28
  • 237
  • 341