1

I am currently working on a c++ school project with some friends.

Before when i had vectors in c++ i did something like this to use them :

unsigned int i = 0;
while (i != myVector.size())
{
    doSomething(myVector[i]);
    i++;
}

But during this project my friends were not happy seeing me using vectors like this and asked me to use iterators. I don't really enjoy iterators because their syntax is quite hard to remember but my friends said that is was better using them because it works faster. And since we are working in a big project with a lot of vectors it was crucial using iterators.

Time has passed and i'm still using them even if i still can't remember their syntax but i wanted to see if the iterator method was really faster them the "unsigned int" method.

So i made this 2 programs:

First program using the unsigned int method:

#include <vector>
#include <string>
#include <iostream>

int main()
{
    std::string str = "This is a string";

    int i = 0;
    std::vector<std::string> vec;

    while (i != 10000000)
    {
        vec.push_back(str);
        i++;
    }

    unsigned int j = 0;
    while (j != vec.size())
    {
        std::cout << vec[j] << std::endl;
        j++;
    }
    return (0);
}

And the second program using the iterator method:

#include <vector>
#include <string>
#include <iostream>

int main()
{
    std::string str = "This is a string";

    int i = 0;
    std::vector<std::string> vec;

    while (i != 10000000)
    {
        vec.push_back(str);
        i++;
    }

    std::vector<std::string>::iterator it;
    it = vec.begin();
    while (it != vec.end())
    {
        std::cout << *it << std::endl;
        it++;
    }
    return (0);
}

As you can see both programs will first create a vector with a size of 10 000 000 (i've put a big size so if there's a difference in time it will be easier to notice) and then i will just print the string in the vector but using the two different methods.

I used time on linux to know the execution time of each program like this:

time ./a.out

And here's the result:

The unsigned int method:

real    0m39,391s
user    0m5,463s
sys     0m21,108s

The iterator method:

real    0m39,436s
user    0m5,972s
sys     0m20,652s

And ......... it's the same time ?! There's just a negligible less than 1 second difference between both and it's a vector with 10 million strings in it.

So i was wondering is there really a difference between this two methods and are iterators really better to use ?

electo
  • 108
  • 11
  • 6
    Make sure to run any benchmark with full optimizations. I wouldn't expect any big difference between the two, but it's good to get in the habits of using iterators as all containers have them but not all containers allow random access by index. – François Andrieux Nov 29 '18 at 21:02
  • 3
    `decltype(myVector.size())` will always get the type right.. also, there's `for (const auto& element : myVector)` as a nice alternative. – Jesper Juhl Nov 29 '18 at 21:03
  • 4
    related/dupe: https://stackoverflow.com/questions/2524233/speed-accessing-a-stdvector-by-iterator-vs-by-operator-index. If you don't need the index, just use a ranged based for loop. – NathanOliver Nov 29 '18 at 21:05
  • 2
    Using iterators is more generic and work with all STL containers, very unlikely you would see any speed difference in case of `std::vector` – Slava Nov 29 '18 at 21:06
  • 3
    The theory behind iterators is sound, and is the basis for composing the standard library (containers expose iterators that can be used generically with algorithms). C++ is moving in more range-based direction (yay), because oftentimes you want to iterate over the *entire* sequence of something (and ranges themselves can be naively thought to be composed of iterators). In the end, the compiler is smart and it will optimize all the different ways to the same assembly. So, do what best suits the problem at hand. – AndyG Nov 29 '18 at 21:09
  • 3
    Any difference in performance in the access methods in these code examples will be swamped by the creation of the vectors and the I/O. These numbers don’t say anything useful. – Pete Becker Nov 29 '18 at 21:16
  • I think you should look at how vector 'growth' using push_back() happens. – 2785528 Nov 29 '18 at 21:31
  • 2
    For testing, you should re-direct the apps output to a file. Screen i/o can be surprisingly slow. – 2785528 Nov 29 '18 at 21:32
  • Iterators are generally safer to use than indexes. That's the main reason to use them imo. – Galik Nov 29 '18 at 21:39
  • @Galik care to elaborate? Why do you think they are **safer**? – SergeyA Nov 29 '18 at 21:41
  • 1
    For speed tests, don't use strings and don't use `std::cout` in the testing look. Use `int` and add all the values up inside the loop. Print the result after the test loop is over. – Galik Nov 29 '18 at 21:41
  • @SergeyA I feel so because there is less math `v[v.size() - 1]` type code with iterators. Also less chance to mismatch signed/unsigned values in a loop. Plus iterators are type specific in a way that integers are not. Plus iterators tend to come from safer sources than indexes like `std::begin()` or the return value of an algorithm like `std::find()`. – Galik Nov 29 '18 at 21:47

2 Answers2

5

The main reason to use iterators is not performance, but less possibilities for mistakes and more expressive code. Compare this

unsigned int i = 0;
while (i != myVector.size())
{
    doSomething(myVector[i]);
    i += 2;
}

or

unsigned int start = myVector.size() + 42;

for (unsigned int i = start; i != myVector.size(); ++i){
    doSomething(myVector[i]);
}

with

for (const auto& e : myVector) {
     doSomething(e);
}

Range based for loops makes using iterators as simple as it can get (you dont even see the iterators, but they are used behind the scenes). When you manually manage the index there are millions of ways to get it wrong, with iterators there are maybe 2 or 3.

For your performance comparison: As vectors store their element in contiguous memory, vector iterators can be plain pointers. What you think is overhead is mainly syntactic sugar to enable you to write nicer code. Hence its not a big surprise that you dont see much difference.

PS

i used it a lot i am kinda confident of not making too much mistakes

Using an integer to iterate an array is from the last century. Its unsafe, it leads to hard to detect bugs and can easily invoke undefined behaviour. Write code to express what you want to do, not to instruct your processor. If you want to do something for each element of a vector you should a range based for loop or the older std::for_each:

std::for_each(myVector.begin(),myVector.end(),doSomething);

It has not any of the downsides of manually using an index (did you spot the mistakes in the above loops?) and has the advantage of looking the same no matter what container myVector actually is or what type of element it contains, or what doSomething actually is (it can be a free function, a functor, a lambda, your choice).

463035818_is_not_an_ai
  • 109,796
  • 11
  • 89
  • 185
  • i will speak of it with my friends but i'll try using iterators when working in group projects so the other members will be happy, but in a solo project i'll probably use my unsigned int method i really prefer it and since i used it a lot i am kinda confident of not making too much mistakes. Thanks for the clear answer – electo Nov 29 '18 at 21:29
  • 2
    @electo at some point you should take a look at the algorithms in the standard library. Once you do that you cannot get around getting used to iterators – 463035818_is_not_an_ai Nov 29 '18 at 21:30
  • @electo using unsigned int to iterate over indexes of a vector is likely a bug. – SergeyA Nov 29 '18 at 21:33
  • Do not get into habit of using **signed** integers in index iteration loops even for examples. This could provide for spectacular effects. – SergeyA Nov 29 '18 at 21:39
  • @SergeyA hm it was half-intentional to give a bad example, but you are right, the examples are wrong enough also with `unsigned` – 463035818_is_not_an_ai Nov 29 '18 at 21:52
  • @electo updated the answer with an example of for_each – 463035818_is_not_an_ai Nov 29 '18 at 21:59
  • @electo At least use `std::size_t` instead of `unsigned int`, so your code correctly ports to 64-bit architectures. On some platforms, like Windows, `unsigned int` is always 32-bit, regardless of architecture. – zett42 Nov 29 '18 at 22:06
3

Surprisingly enough, there might be at least theoretical performance difference when comparing iterator access with index access in the loop, as can be shown here: https://gcc.godbolt.org/z/frFYhF

Filtering for some noise, with iterator every iteration looks like

.LBB0_1:                                # =>This Inner Loop Header: Depth=1
    movl    (%rbx), %edi
    callq   check(int)
    addq    $4, %rbx
    cmpq    %rbx, %r14
    jne     .LBB0_1

So here we see one memory access, one math operation, and one conditional branch. Overall, memory access will dwarf everything else once you are out of the cache line, but still those are operations taken.

When we look into index access, iteration looks like:

.LBB1_3:                                # =>This Inner Loop Header: Depth=1
    movq    (%r14), %rax
    movl    (%rax,%rbx,4), %edi
    callq   check(int)
    addq    $1, %rbx
    cmpq    %r15, %rbx
    jb      .LBB1_3

Here we see something we do not see in the previous example - an extra register move on every iteration (which is needed for displacement memory access).

Now, a register move is probably one of the cheapest real operations CPU can perform, but it still an operation, and it would be a reordering block, as later operation depends on it's result.

I believe, the performance impact we see here should not be something you think about when you are accessing the vector. Instead, you should strike for uniformity, ease of read and maintainability.

Having said all that, I suggest you opt for range-based loop.

for (int i: vec) {
     // work with i
}

Last, but not the list, using a variable of unsigned int to iterate over indexes of a vector could be a nasty bug. On many platforms vector could be larger than maxim int allows, and you will end up with the endless loop.

SergeyA
  • 61,605
  • 5
  • 78
  • 137