4

I have two questions regarding iterators.

  1. I thought the once you define an iterator to an STL container such as a vector or a list, if you add elements to the containers then these iterators won't be able to access them. But the following code defines a list of five elements and then adds another element in each loop iteration and results in an infinite loop:

    #include <iostream>
    #include <list>
    
    using namespace std;
    
    int main()
    {
        list<int> ls;
    
        for(int i = 0; i < 5; i++)
        {
            ls.push_back(i);
        }
    
        int idx = 0;
    
        for(list<int>::iterator iter = ls.begin(); iter != ls.end(); iter++)
        {
            cout << "idx: " << idx << ", *iter: " << *iter << endl;
            ls.push_back(7);
            idx++;
        }
    }
    

    However, doing the same for a vector results in an error:

    #include <iostream>
    #include <vector>
    
    using namespace std;
    
    int main()
    {
        vector<int> vec;
    
        for(int i = 0; i < 5; i++)
        {
            vec.push_back(i);
        }
    
        int idx = 0;
    
        for(vector<int>::iterator iter = vec.begin(); iter != vec.end(); iter++)
        {
            cout << "idx: " << idx << ", *iter: " << *iter << endl;
            vec.push_back(7);
            idx++;
        }
    }
    
  2. I thought that when the vector container must be resized, it does so at powers of 2 and is located to a new area of memory, which is why you shouldn't define an iterator to a vector if you adding elements to it (since the iterators don't get passed to the new memory location). For example, I thought a vector containing 16 elements, after calling the push_back function, will be allocated space for 32 elements and the entire vector will be relocated. However, the this didn't happen for the following code. Was I just mistaken?

    #include <iostream>
    #include <vector>
    
    using namespace std;
    
    int main()
    {
        vector<int> vec;
    
        for(int i = 0; i < 4; i++)
        {
            vec.push_back(i);
            cout << "address of vec: " << &vec << ", capacity: " << vec.capacity() << endl;
        }
    
    
    
        for(int i = 0; i < 20; i++)
        {
            vec.push_back(i);
            cout << "address of vec: " << &vec << ", capacity: " << vec.capacity() << endl;
        }
    }
    
Joseph Mansfield
  • 108,238
  • 20
  • 242
  • 324
bcf
  • 2,104
  • 1
  • 24
  • 43
  • 6
    Don't assume things. The documentation of stdlib classes clearly describes when and which iterators are invalidated by certain types of mutation on a container. –  Dec 24 '13 at 20:41
  • 1
    Here's a good answer about when iterators are invalidated: http://stackoverflow.com/a/6438087/57318 – Xymostech Dec 24 '13 at 20:46
  • 1
    Note that an invalidated iterator may be indistinguishable from a valid iterator, if you just look at pointer values. ie. When `vector` resizes, its new storage may start at the same address as the old storage. Even if you were _guaranteed_ that `vector` increased its `capacity()`, you're not guaranteed it's at a different addresss. Only the rules of `vector` tell you when an iterator is considered valid or invalid. – Joe Z Dec 24 '13 at 20:48
  • 1
    The rate at which a `vector`'s capacity increases is not mandated by the standard. An implementation can do it at a rate of 1.5x, 2x, 3x, or whatever it wants, so long as it keeps the amortized complexity of `push_back()` O(1) (and keeps the complexity requirements of the other methods of `vector`). – Cornstalks Dec 24 '13 at 20:48

3 Answers3

3

Different container's iterators have different properties. Here are the iterator invalidation rules.

The list loop: When you push onto a list all previous iterators still valid. You will never hit the end if every time you iterator forward one you also add a new element, obviously.

The vector loop: For a vector, your iterators are invalid once a push_back results in the new size exceeding the old capacity. As soon as this happens, using iter is undefined behavior (you will likely crash).

I thought that when the vector container must be resized, it does so at powers of 2 and is located to a new area of memory

This is unspecified by the standard. Some implementations of the C++ standard library double the capacity of a vector when the size exceeds the old capacity, and others grow at different rates.

Community
  • 1
  • 1
David
  • 27,652
  • 18
  • 89
  • 138
  • @Dave What do you mean by "implementation"? I'm using Visual C++ 2010 Express, I assume this is different than if I used emacs? – bcf Dec 24 '13 at 20:59
  • @David emacs is just an editor. I'm talking about implementations of the standard library. VS2010 comes with one. emacs doesn't. Microsoft had the option to grow vector's capacity however they wanted. It's not specified in the standard on how to grow it. Different implementations could grow at different rates. – David Dec 24 '13 at 21:01
  • @David: the "implementation" is the C++ compiler and its standard C++ library. Visual Studio C++ comes with a C++ compiler. Emacs doesnt. There are multiple different compilers, e.g., [gcc](http://gcc.gnu.org/) and [clang](http://clang.llvm.org/) are popular open-source compilers. – Dietmar Kühl Dec 24 '13 at 21:06
  • @DietmarKühl You would know better than me, but the semantic difference between 'implementation defined' and 'unspecified' which you're getting at seems pretty hazy - can you elaborate? They mean the same thing to me. – David Dec 24 '13 at 21:07
  • @Dave: "implementation defined" means that the implementation's documentation needs to explicitly document what it has chosen to do and these choices become part of the contract. It is just that the C++ standard didn't make any choice (see, e.g., [gcc's doc](http://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Implementation.html)). "unspecified" means that the implementation needs to do something sensible and there may be different choices yielding different results but the choices are not documented and won't be guaranteed. A DS9k implementation could choose to make a different choice each time. – Dietmar Kühl Dec 24 '13 at 21:16
  • @DietmarKühl Could 'a different choice each time' not just be part of the implementation defined contract? If something is 'unspecified' in the standard, and implementation _must_ defined it, itself. If something is either `unspecified by the standard` or `implementation defined`, the other must be implied to be true... no? It seems like the difference you speak of is just a matter of documentation? – David Dec 24 '13 at 21:21
  • @Dave: The problem is that certain behavior depends on many aspects and although the implementation probably has a concrete algorithm how it processes the code, it may be too complex to be defined. Also, implementation defined means that there is a guarantee and the documentation changes when the guarantee changes. The same is not true for unspecified behavior. ... and there is certainly no mandate to describe what the unspecified behavior actually is. For example, there is no mandate to describe the exact growth strategy for `std::vector` in a user accessible form. – Dietmar Kühl Dec 24 '13 at 21:26
0

The answer on your first question is contained in the second your question.

As for the second question then it is implementation defined how the vector allocates the memory. It is not necessary that it will double the size of the memory each time when it is exhausted.

Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335
0

The different containers generally have different guarantees with respect to the validity of iterators, and pointers/references to elements:

  1. For std::list<T> the iterators and pointers/references to elements stay valid until the corresponding node gets erased or the std::list<T> exists.
  2. For std::vector<T> the validity is more complicated:
    1. Iterator and pointer/reference validity is identical (and I'll only use iterators below).
    2. All iterators are invalidated when the std::vector<T> needs to resize its internal buffer, i.e., when inserting exceeds the capacity. When the capacity is exceeded and how much memory is allocated depends on the implementation (the only requirement is that the capacity grows exponentially and a factor of 2 is a reasonable choice but there are many others).
    3. When inserting into a std::vector<T> all iterators before the insertion point stay valid unless reallocation is necessary.
    4. When erasing from a std::vector<T> all iterators past the erase point are invalidated.

Other containers have, yet, different validity constraints (e.g. std::deque<T> keeps invalidating iterators but can keep pointers/references valid).

Dietmar Kühl
  • 150,225
  • 13
  • 225
  • 380