2

I have this C++ code file which is FOSS under the Expat licence. When running cppcheck on the code I get this error:

[google_hash.cpp:137] -> [google_hash.cpp:141]: (error) Iterator 'it' used after element has been erased.

The code in question is:

    74  #if (FCS_WHICH_STATES_GOOGLE_HASH == FCS_WHICH_STATES_GOOGLE_HASH__SPARSE)
    75  typedef sparse_hash_set<char *, state_hash, state_equality> StatesGoogleHash;
    76  #else
    77  typedef dense_hash_set<char *, state_hash, state_equality> StatesGoogleHash;
    78  #endif
.
.
.
   131  extern void fc_solve_states_google_hash_foreach(
   132      fcs_states_google_hash_handle void_hash,
   133      bool (*should_delete_ptr)(void *key, void *context), void *context)
   134  {
   135      StatesGoogleHash *hash = (StatesGoogleHash *)void_hash;
   136
   137      for (StatesGoogleHash::iterator it = hash->begin(); it != hash->end(); ++it)
   138      {
   139          if (should_delete_ptr(*(it), context))
   140          {
   141              hash->erase(it);
   142          }
   143      }
   144  }

It makes use of either http://goog-sparsehash.sourceforge.net/doc/sparse_hash_set.html or http://goog-sparsehash.sourceforge.net/doc/dense_hash_set.html .

Now, these documents say that

Validity of Iterators

insert() invalidates all iterators, as does resize().
erase() is guaranteed not to invalidate any iterators.

So my question is - am I using the sets' iterators correctly and safely (and cppcheck has emitted a false positive), or if not - how should the code be fixed?

Help will be appreciated.

Shlomi Fish
  • 4,380
  • 3
  • 23
  • 27
  • 3
    According to the documentation it seems you are. `it` is an iterator on `hash` and `hash->erase(it);` is said to invalidate no iterator so `it` should not be invalidated. Though perhaps the documentation is incorrect and it meant to say that iterator to other elements is invalidated. Or perhaps there is an expectation that it's implied that `it` doesn't count as part of "any iterators". Note that `erase` is tagged with note 6 which seems to be missing from the documentation and may be what was intended to cover this. – François Andrieux Jan 29 '19 at 14:55
  • 1
    Even if `erase()` doesn't invalidate `it`, it's not clear what `it++` is meant to do after you've erased it's element. It's definitively not an iterator to an element of the container, so what does it represent? Does it implicitly refer to the element after `it`? The one before? In either case, I would argue that it's not quite right to say it isn't invalidated. It certainly can't still refer to the same thing. If it doesn't refer to a specific element at all, then what operations does it support? Such an iterator would require it's own category definition really. – François Andrieux Jan 29 '19 at 14:57
  • 4
    It's an odd interface for something that calls itself a container anyway. I would expect an `erase(iterator)` function to return the next iterator, not `void`. – Baum mit Augen Jan 29 '19 at 14:57
  • @FrançoisAndrieux it's fairly normal to have *one* iterator value that one can't dereference, the `end` iterator. It seems that this api can have multiple valid-but-not-dereferenceable values. – Caleth Jan 29 '19 at 16:39
  • @Caleth That's a good point. But it's the only iterator value in a range that can't be dereferenced and the concept of one-past-the-end iterators is assumed to be understood by the user as it's analogous to the standard concept of one-past-the-end pointers. The `begin` and `end` are also special in that they define a range. Again it's relatively safe to assume the developer will understand the restrictions on them because if a user understands iterators they must understand ranges. This valid non-`end` yet undereferencable iterator would have no standard precedent and *needs* documentation. – François Andrieux Jan 29 '19 at 16:50
  • @Caleth For example, what should `(it++)--` produce? Probably not the original `it`. Is it safe? Without an explanation of what `erase` does to `it` it's hard to know what you can do with `it`. Saying it's not invalidated is not enough in my opinion. But you're right, it looks like this API allows multiple valid undereferencable iterators in the middle of the container. – François Andrieux Jan 29 '19 at 16:53

1 Answers1

3

It's a false positive in the sense that cppcheck thinks that an API called erase on an iterator invalidates the iterator. That's the rule that it uses, because that's a sensible API.

Then for this code, it seems to be valid to do ++it because of this:

This is implemented by making erase() not resize the hashtable. If you desire maximum space efficiency, you can call resize(0) after a string of erase() calls, to force the hashtable to resize to the smallest possible size.

This seems to mean that the iterators may encounter deleted objects if you don't call resize(0) between erasures. Here it's fine because erasures are done in the same loop.

The fact that this code hides its behavior by not making it obvious and doesn't use standard patterns doesn't help.

Baum mit Augen
  • 49,044
  • 25
  • 144
  • 182
Matthieu Brucher
  • 21,634
  • 7
  • 38
  • 62