5

Consider the following code:

std::set<int> s;
auto it = s.begin();
s.insert(1);
s.insert(2);
std::cout << *it << std::endl;

The output (at least for me) is 2. What's happening here? What's the state of it when I dereference it?

I know that when I call begin() on an empty set, I get an iterator equivalent to end(). I also know that calling insert on a set doesn't invalidate its iterators. Does the iterator somehow stay equivalent to end() even though I've now inserted elements into the set and so now I'm getting undefined behaviour? Is that defined by the standard?

Joseph Mansfield
  • 108,238
  • 20
  • 242
  • 324
  • I believe an iterator is invalidated if you change the container. So this is UB. – andre Nov 16 '12 at 21:35
  • @ahenderson Not true, you're probably thinking of vectors. – john Nov 16 '12 at 21:35
  • According to [this](http://stackoverflow.com/questions/6438086/iterator-invalidation-rules) the iterators are not invalidated. – Xymostech Nov 16 '12 at 21:36
  • @Xymostech: This is actually a corner case. The standard only guarantees that iterators that *refer to an element* in the container are not invalidated. This is not the case, as when the `begin()` function was called the container was empty, and thus that iterator does **not** refer to an element in the container. There is no guarantees in the standard regarding the `end()` iterator of a container. – David Rodríguez - dribeas Nov 16 '12 at 21:39
  • "*Does the iterator somehow stay equivalent to end()*" - you can determine this experimentally thus: `std::cout << (it == s.end());` – Robᵩ Nov 16 '12 at 21:39
  • @DavidRodríguez-dribeas Your comment seems to counter James's answer below. Is it wrong? – Joseph Mansfield Nov 16 '12 at 21:42
  • @Robᵩ But his compiler could be wrong, no? – Seth Carnegie Nov 16 '12 at 21:42
  • @DavidRodríguez-dribeas That's interesting. Another case of the standard being unhelpful to people wanting to write code. – john Nov 16 '12 at 21:42
  • @SethCarnegie - yes, it could be. But it isn't. In any event, your point is well taken. One must be careful with empirical evidence. Such evidence has value, but it is not authoritative. – Robᵩ Nov 16 '12 at 21:45
  • @john: Not really, it just leaves freedom for different implementations of the end iterator. Some implementations can decide to use a null pointer, or it can use a fake node inside the `std::set` object. The guarantee is that objects inserted into the container won't be *moved* around memory, which is an important guarantee. The whole *I get a pointer to `begin()` at startup (which is actually `end()`) and expect it to be a valid iterator some time later* is the actual issue. – David Rodríguez - dribeas Nov 16 '12 at 21:57
  • @DavidRodríguez-dribeas But if I'm understanding you right you are saying that, for instance, I cannot maintain a half open range on a set with a pair of iterators. Because the second of those iterators might be pointing to the end, and therefore might get invalidated by an insertion to the set. That seems a serious issue to me, but maybe I've misunderstood. – john Nov 16 '12 at 22:21
  • @john: That is true, holding a half open range that involves `end()` is not guaranteed if there are insertions into the set. The question is what would you expect in that situation, if you expect the new elements to be included in the range, then you range is also *open in time* and you should not maintain the range, but test `end()` on each use. If you expect the range to be stable then even if `end()` did not become invalidated you would not get what you wish for. This needs to be solved at a higher level than the `std::set`. Note: the issue is still there even with iterators into elements. – David Rodríguez - dribeas Nov 16 '12 at 22:29
  • @DavidRodríguez-dribeas: No, associative container end iterators are not invalidated by insertions. n3485 §23.2.4[associative.reqmts]/9 says that for associative containers, "the `insert` and `emplace` members shall not affect the validity of iterators and references to the container, and the `erase` members shall invalidate only iterators and references to the erased elements." The phrase "iterators to the container" includes end iterators. Contrast this with container `swap` specification, which has the "iterator referring to an element" language. – James McNellis Nov 16 '12 at 22:34
  • @DavidRodríguez-dribeas Not understanding why you've suddenly introduced threads. But this is inconvenient, yes I would expect inserted elements to be included in the range. But I cannot use a pair of iterators for that range, I have to add the third value, a boolean or something to indicate that the range goes to the end of the set. Incidentally I don't see how either of the implementations you suggest would result in invalidation in practice. – john Nov 16 '12 at 22:35
  • @JamesMcNellis: What is the definition of *iterators **to** the container*? `end()` is an iterator *one beyond* the container. That is the issue, `end()` does not fall inside that clause as it is not an iterator **to** the container. I recall having this discussion with some of the committee members (and in particular Alisdair Meredith, who is the author the note you mention in `swap`). I will have to bring this up again to see if I misunderstood what was then discussed. – David Rodríguez - dribeas Nov 16 '12 at 22:50
  • @john: Ignore threads, as long as there is any insertion into the set, which can be through other threads, or in the one thread, it does not really matter (well, it matters as there are issues with locking, but not to this discussion). – David Rodríguez - dribeas Nov 16 '12 at 23:00
  • @DavidRodríguez-dribeas I think this is one of the cases where, if the standard really says what you are saying, then I would simply ignore it. I have never come across any standard library implementation that does invalidate end iterators in this way. – john Nov 16 '12 at 23:10
  • 1
    @JamesMcNellis: After rediscussing this with Alisdair his stance is that the wording is not precise enough, but the guarantee is not there. In particular the corner case where the `end()` iterator can be invalidated on insertion is in some implementations that have a sentinel that is dynamically allocated with the first element inserted into the container. On those implementations, on an empty container `end()` is an iterator with a null pointer, but after the first insertion `end()` is an iterator with a pointer to the sentinel object. – David Rodríguez - dribeas Nov 16 '12 at 23:12
  • @DavidRodríguez-dribeas: Would you or Alisdair open an LWG DR for this? While the current language is certainly lacking clarity, I don't see how the requirement is _not_ there, given the current phrasing, since it's missing the magic words "referring to an element." The committee should clarify the wording (whichever way they see fit). – James McNellis Nov 17 '12 at 07:19

3 Answers3

9

When you call s.begin() here, it returns an end iterator because the container is empty. This iterator is not invalidated by the insertions: after each insertion, this iterator is still an end iterator.

Dereferencing this iterator causes your program to exhibit undefined behavior (end iterators cannot be dereferenced).

James McNellis
  • 348,265
  • 75
  • 913
  • 977
  • I totally didn't apply the fact that iterators aren't invalidated to the end iterator. I figured "well I'm inserting stuff so what guarantees that the end iterator isn't changing?" Well duh, it isn't invalidated! Thanks – Joseph Mansfield Nov 16 '12 at 21:39
  • The iterator is **not** guaranteed to be the *end* iterator after the insertions. The guarantees in the standard are only for pointers, references and iterators into *elements in the container*. `end()` does not refer to any element in the container and thus there is no such guarantee. – David Rodríguez - dribeas Nov 16 '12 at 22:42
  • @DavidRodríguez-dribeas: As quoted from [here](http://stackoverflow.com/questions/13080013/does-the-value-of-stdlisttend-change-after-modifying-list): `The language specification doesn't state it explicitly. However, there's simply no valid operation on the list that would invalidate the end iterator or associate its value with some other location.` (the answer is for `std::list`, but the same should be true for `std::set` as well.) – Jesse Good Nov 16 '12 at 22:48
  • @DavidRodríguez-dribeas: The specification does not say "iterators into elements in the container," it says "iterators to the container." An end iterator refers to a container, even though it does not refer to an element in the container. – James McNellis Nov 16 '12 at 23:06
  • @JamesMcNellis: See the comments in the question. `end()` is not an iterator *to* the container, it is an iterator *one beyond* the container. – David Rodríguez - dribeas Nov 16 '12 at 23:26
1

The iterator it is still equal to s.end(). Dereferencing an end iterator is undefined behaviour. That's what you are seeing. Try this for example

if (it == s.end())
{
    cout << "at end\n";
}

This is legal code.

john
  • 85,011
  • 4
  • 57
  • 81
0

"it" in the first case points to s.end() which is the address of the fantom element. The address of the fantom element is nothing but address of the "non-existing" element after the last element in the container. Once the set is modified, "it" which initially pointed to s.end() before the insertion may not be the same as s.end() after the insertion.

So, dereferencing "it" would result in undefined behavior.