0
template <typename T>
int eliminate_duplicate(vector<T> &A) {
  sort(A.begin(), A.end());  // makes identical elements become neighbors
  auto it = unique(A.begin(), A.end());  // removes neighboring duplicates
  A.resize(it - A.cbegin());  // truncates the unnecessary trailing part
  return it - A.cbegin(); // Question> Is this line valid?
}

Is the last line valid?

Here is my concern: after the calling of resize, the iterator it will point to a invalidate location, so can we still use it as it - A.cbegin() in the return line?

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
q0987
  • 34,938
  • 69
  • 242
  • 387
  • Why not simply store the result in a variable *before* you resize? – Some programmer dude Jul 02 '13 at 14:35
  • 2
    What you should do it `A.erase(it, A.end())`. – juanchopanza Jul 02 '13 at 14:36
  • How about `A.erase( unique( A.begin(), A.end() ), A.end() )`? – Ylisar Jul 02 '13 at 14:37
  • If the `resize` causes a reallocation, then any existing iterators are invalidated... – Oliver Charlesworth Jul 02 '13 at 14:37
  • 2
    It's not valid. Invalidated iterators are invalidated. – zch Jul 02 '13 at 14:37
  • I don't know if it's an guarantee, but I would be really surprised if resize() would do any sort of re allocation when shrinking. – Ylisar Jul 02 '13 at 14:38
  • http://stackoverflow.com/questions/1155693/stdvector-resize-downward -- why do you think resizing with a smaller size could invalidate? Ah, because `it` now points to past-the-end without being created via calling `end()`. – Yakk - Adam Nevraumont Jul 02 '13 at 14:39
  • This is not my code and please simply focus on whether or not it is correct. – q0987 Jul 02 '13 at 14:41
  • Another way to phrase this question: if you erase the last element of a `std::vector`, does the iterator that referred to it now a past-the-end iterator? I think the answer is "no, but many implementations will quite often give that result" -- and that is strong enough that you could probably add it to the next version of the standard and only the debugging version of `std::vector` iterators would have to be changed! (as, under-the-hood, the easy way to implement a fast `vector` iterator is to be a gussied up pointer) – Yakk - Adam Nevraumont Jul 02 '13 at 14:46

1 Answers1

3

From section 23.3.6.3 vector capacity of the C++11 standard (draft n3337), clause 9 (the bolded text is my emphasis as it is the case in the posted code that sz <= size() is true):

void resize(size_type sz);

Effects: If sz <= size(), equivalent to erase(begin() + sz, end());. If size() < sz, appends sz - size() value-initialized elements to the sequence.

and from section 23.3.6.5 vector modifiers, clause 3:

iterator erase(const_iterator position);
iterator erase(const_iterator first, const_iterator last);

Effects: Invalidates iterators and references at or after the point of the erase.

begin() + sz is equal to it, therefore it is invalidated.

To correct, just return A.size() (if the caller really requires it as that information is available from A anyway). Suggest using A.erase(it, A.end()); (as commented by juanchopanza) as the intent of the code is clearer.

Community
  • 1
  • 1
hmjd
  • 120,187
  • 20
  • 207
  • 252
  • On the other hand, it's guaranteed to be past-the-end (i.e. end()), and you are allowed to do end()-begin(). – DanielKO Jul 02 '13 at 15:44
  • @DanielKO, not sure I follow as in `end() - begin()` there is no reference to a previously stored `iterator`? – hmjd Jul 02 '13 at 15:47
  • As vector is guaranteed to be contiguous, and data is not moved when shrinking, "it" has to point to where a push_back() will emplace the next item (if there's capacity for it), so it's essentially the same as end(). But as the standard is over-conservative (it erase() should invalidate references at and after, but invalidate iterators only strictly after, allowing erase(end()) to be a no-op) I think an implementation could go out of its way to make sure that line breaks (like storing a few flags and such). You are correct. – DanielKO Jul 02 '13 at 16:28