6

When insert-ing into a std::vector the C++ standard assures that all iterators before the insertion point remain valid as long as the capacity is not exhausted (see [23.2.4.3/1] or std::vector iterator invalidation).

What is the rationale behind not allowing iterators after the insertion point to remain valid (if the capacity is not exhausted)? Of course, they would then point to a different element but (from the presumed implementation of std::vector) it should still be possible to use such an iterator (for example dereference it or increment it).

Community
  • 1
  • 1
levzettelin
  • 2,600
  • 19
  • 32

4 Answers4

9

You seem to be thinking of an "invalid" iterator as only one that would provoke a crash if used, but the standard's definition is broader. It includes the possibility that the iterator can still safely be dereferenced, but no longer points to the element it is expected to point to. (This is a special case of the observation that "undefined behavior" does not mean "your program will immediately crash"; it can also mean "your program will silently compute the wrong result" or even "nothing observably wrong will occur on this implementation.")

It is easier to demonstrate why this is an issue with erase:

#include <vector>
#include <iostream>
int main(void)
{
    std::vector<int> a { 0, 1, 2, 3, 4, 4, 6 };

    for (auto p = a.begin(); p != a.end(); p++) // THIS IS WRONG
        if (*p == 4)
            a.erase(p);

    for (auto p = a.begin(); p != a.end(); p++)
        std::cout << ' ' << *p;

    std::cout << '\n';
}

On typical implementations of C++ this program will not crash, but it will print 0 1 2 3 4 6, rather than 0 1 2 3 6 as probably intended, because erasing the first 4 invalidated p -- by advancing it over the second 4.

Your C++ implementation may have a special "debugging" mode in which this program does crash when run. For instance, with GCC 4.8:

$ g++ -std=c++11 -W -Wall test.cc && ./a.out
 0 1 2 3 4 6

but

$ g++ -std=c++11 -W -Wall -D_GLIBCXX_DEBUG test.cc && ./a.out
/usr/include/c++/4.8/debug/safe_iterator.h:307:error: attempt to increment 
    a singular iterator.

Objects involved in the operation:
iterator "this" @ 0x0x7fff5d659470 {
type = N11__gnu_debug14_Safe_iteratorIN9__gnu_cxx17__normal_iteratorIPiNSt9__cxx19986vectorIiSaIiEEEEENSt7__debug6vectorIiS6_EEEE (mutable iterator);
  state = singular;
  references sequence with type `NSt7__debug6vectorIiSaIiEEE' @ 0x0x7fff5d659470
}
Aborted

Do understand that the program provokes undefined behavior either way. It is just that the consequences of the undefined behavior are more dramatic in the debugging mode.

zwol
  • 135,547
  • 38
  • 252
  • 361
  • That is exactly the question: Is the code that you gave standard conforming C++ (although it might not do what one expects)? Or rather, why is it not standard conforming? My mental picture of C++ unfortunately suggests that your code should validly give the output that you showed. – levzettelin Jul 27 '13 at 14:20
  • @TobiasBrüll The code above executes undefined behavior. The compiler would be perfectly standard conforming if it formatted your hard drive after emailing your bank information to Australia, although that would not be in line with best compiler practices. The output Zack got is definitely not guaranteed by the standard, *but it is permitted*. On the other hand, the particular implementation of `std::vector` the compiler uses might guarantee the above behavior in actual practice (either formally, in the docs (unlikely), or because the source code to `std::vector` is exposed). – Yakk - Adam Nevraumont Jul 27 '13 at 15:09
  • @TobiasBrüll Yakk is correct. See edit for further exposition. – zwol Jul 27 '13 at 17:21
3

A vector grows dynamically, so when you push onto a vector, if there is no space for the item, memory needs to be allocated for it. The Standard mandates that vector must store its elements in contiguous memory, so when memory is allocate, it has to be enough to store ALL the existing elements, plus the new one.

The vector doesn't know about any iterators for itself, so cannot update them into the new storage of elements. Iterators are therefore invalid after the memory has been reallocated.

cdmh
  • 3,294
  • 2
  • 26
  • 41
  • With reallocations it is another issue. I was talking about the case were there is enough capacity left. – levzettelin Jul 27 '13 at 14:01
  • The behavior needs to be deterministic and consistent. A typical use case is not to know or care whether capacity>used or not. – cdmh Jul 27 '13 at 16:20
3

That the iterators may refer to a different element is enough for them to be invalidated. An iterator is supposed to refer to the same element for the duration of its valid lifetime.

You're right that, in practice, you may not experience any crashing or nasal demons if you were to dereference such an iterator, but that does not make it valid.

Lightness Races in Orbit
  • 378,754
  • 76
  • 643
  • 1,055
  • You say that by the C++ standard dereferencing such an iterator does not lead to undefined behavior? – levzettelin Jul 27 '13 at 14:06
  • 1
    @TobiasBrüll no, by the C++ standard it does lead to undefined (by the C++ standard) behavior. The most common undefined behavior, if capacity is not changed, is that you get the element that now has the same offset that your iterator last referred to, in actual practice. But rather than add verbage about "iterators that change what they refer to" which would tie down what implementation is used and make the standard harder to read (heh), they basically say iterators either always refer to the same thing, or become invalid. There are optimizations a compiler can do because of this, in theory. – Yakk - Adam Nevraumont Jul 27 '13 at 15:05
  • 1
    @TobiasBrüll: No, it is still undefined behaviour. That doesn't mean you get a crash. – Lightness Races in Orbit Jul 27 '13 at 19:44
1

The vector does not know which iterators exist. Yet the memory location of the elements after the inserted element changed. This means, the iterators need to be updated to reflect that change should they remain valid. But the vector cannot do this update, because it does not know which iterators exist.

Oswald
  • 31,254
  • 3
  • 43
  • 68