2

I have code that looks like the below:

std::unordered_set<int> ht{1,2,3};
ht.reserve(10000);  // ht will not exceed this size

for(int i = 0; i < n; i++)
{ 
  auto j = i;
  for(auto it = ht.begin(); it != ht.end(); ++it)
  {
    // do some stuff
    int v = j++;
    ht.emplace(v);
  }
}

For the inner loop, I want to loop from the beginning of ht to the end, but I don't want the loop to go over any of the newly added elements within the loop. In other words, is the above equivalent to the below?

std::unordered_set<int> ht{1,2,3};
ht.reserve(10000);  // ht will not exceed this size

for(int i = 0; i < n; i++)
{
  auto temp = ht;
  auto j = i;
  for(auto it = ht.begin(); it != ht.end(); ++it)
  {
    // do some stuff
    auto v = j++;
    temp.emplace(j);
  }

  ht = temp;
}

Based on a few runs that I did, it seems to be equivalent, but I don't know if this is undefined behavior, or if they are indeed equivalent. If the unordered_set was changed to a vector, this would not work, but it seems the forward iterators work.

Does the answer change if the ht.reserve(10000); // ht will not exceed this size was not present or if ht did in fact exceed the reserved capacity, and therefore all the forward iterators will be invalidated?

24n8
  • 1,898
  • 1
  • 12
  • 25
  • [std::unordered_set::reserve](https://en.cppreference.com/w/cpp/container/unordered_set/reserve) does not do the same thing as `std::vector::reserve`. It is far more than just allocating space. Also -- *Based on a few runs that I did,* -- C++ does not work this way, where you can verify (or not verify) that something is safe by a few runs. It's either undefined behavior, defined behavior, or implementation defined behavior. – PaulMcKenzie May 24 '20 at 06:08
  • Does this answer your question? [Simultaneously iterating over and modifying an unordered\_set?](https://stackoverflow.com/questions/13981886/simultaneously-iterating-over-and-modifying-an-unordered-set) – Jan Schultke May 24 '20 at 06:12
  • @J.Schultke I think so, but I just wanted to confirm that my understanding is correct. Based on reading GManNickG's answer, it seems that even if we guarantee no rehashing, we could still iterate over the newly added elements? – 24n8 May 24 '20 at 06:51
  • @PaulMcKenzie Yes, I agree on both counts. But `std::unordered_set::reserve(N)` would guarantee that the unordered_set can store at least `N` numbers without rehashing correct? It guarantees it can store exactly `N` numbers if there are no collisions. – 24n8 May 24 '20 at 07:19

2 Answers2

1

No, it's not safe:

On most cases, all iterators in the container remain valid after the insertion. The only exception being when the growth of the container forces a rehash. In this case, all iterators in the container are invalidated.

Sometimes it works, but I don't think this is enough for you!

Costantino Grana
  • 3,132
  • 1
  • 15
  • 35
  • I think the OP’s question is about whether the newly-added elements will be seen during the iterator traversal rather than about invalidation; they’ve used reserve such that a rehash won’t occur. – templatetypedef May 24 '20 at 06:03
  • @templatetypedef Yes, that's correct. I am essentially asking the first part of my question based on assuming no rehashing occurs (due to the reserve statement and not exceeding the capacity) – 24n8 May 24 '20 at 06:04
  • Sorry, I missed the reserve. – Costantino Grana May 24 '20 at 06:17
0

No. See cppreference on std::unordered_set.

Cppreference.com has Iterator invalidation sections which describe when an iterator is invalidated. In the case of using std::unordered_set, inserting is not safe when a rehash occurs.

Rehashing occurs only if the new number of elements is greater than max_load_factor()*bucket_count()

And you can not know for certain whether inserting an element will cause that to happen.

In your example, you don't actually dereference the iterator, so why not loop over the size of the set?

size_t limit = ht.size();
for (size_t i = 0; i < limit; ++i) {
    ...
}
Jan Schultke
  • 17,446
  • 6
  • 47
  • 96
  • I think the OP’s question is about whether the newly-added elements will be seen during the iterator traversal rather than about invalidation; they’ve used reserve such that a rehash won’t occur. – templatetypedef May 24 '20 at 06:03
  • 1
    Right, I understand all the iterators will be invalidated if rehashing occurs, but that's why I included the `reserve` and the associated comment. – 24n8 May 24 '20 at 06:03