14

It is commonly understood that a good way to fully delete desired items from a std::vector is the erase-remove idiom.

As noted in the above link (as of the date of this posting), in code the erase-remove idiom looks like this:

int main()
{
  // initialises a vector that holds the numbers from 0-9.
  std::vector<int> v = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

  // erase-remove idiom to completely eliminate the desired items from the vector
  v.erase( std::remove( std::begin(v), std::end(v), 5 ), std::end(v) ); 
}

I would like to know whether a resize-remove idiom is equivalent in terms of functionality and performance to the erase-remove idiom. Or, perhaps I am missing something obvious?

Is the following resize-remove idiom equivalent to the above erase-remove idiom?

int main()
{
  std::vector<int> v = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

  // Is this "resize-remove" approach equivalent to the "erase-remove" idiom?
  v.resize( std::remove( std::begin(v), std::end(v), 5 ) - v.begin() ); 
}
Dan Nissenbaum
  • 13,558
  • 21
  • 105
  • 181
  • 6
    "resize-remove" is - by definition - not an "idiom" as the phrase is not in use by others... – Tony Delroy Apr 04 '14 at 10:46
  • @TonyD Which is, by itself, a very strong reason for preferring the erase-remove idiom. – James Kanze Apr 04 '14 at 12:57
  • 1
    The (rhetorical) question I'd like to add is: why on earth does the STL not take care of this chore on its own by default, so that it couldn't have become an idiom in the first place... – Sz. Feb 04 '22 at 02:52

4 Answers4

13

In my opinion, there are two reasons:

  1. std::remove algorithm requires only Forward Iterator, but - op requires Random Access Iterator.

  2. The result of std::remove means "the new end of container". Logically, we should erase [ "the new end of container" , "the old end of container" ).

ikh
  • 10,119
  • 1
  • 31
  • 70
6

It is equivalent for std::vector, but not for std::list or other containers. Not sure if subtracting iterators is even possible for std::list, and even if it is, it is a O(N) operation.

green lantern
  • 1,086
  • 7
  • 10
  • 1
    It's not possible. You'd have to use `std::distance`, which is O(N). – Sebastian Redl Apr 04 '14 at 10:45
  • Then again, it's O(N) in the elements to remove, plus it will bring those elements into cache. So practically it's not horrible. – MSalters Apr 04 '14 at 12:49
  • @MSalters: It is the other way around. It is O(N) in the elements NOT to remove. The overall complexity of the idiom will not change, because `std::remove()` already is O(N), but in practice constants matter too. – green lantern Apr 04 '14 at 14:46
  • @greenlantern: You'd iterate backwards from end(), if only because of the cache issue. – MSalters Apr 04 '14 at 15:35
  • @MSalters: You mean changing the code to `v.resize(v.size() - std::distance(std::remove(...), v.end()))`? Yes, that would be a little less horrible :-) – green lantern Apr 04 '14 at 16:16
  • @greenlantern: That was pretty much a given since `operator-` isn't defined on `list::iterator`. – MSalters Apr 04 '14 at 16:20
  • 1
    @MSalters: You could use the original code and just change `operator-` to `std::distance`: `v.resize(std::distance(v.begin(), std::remove(...)));`. Btw. notice how the erase-remove idiom and the code in my previous comment both contain the `std::remove(...), v.end()` part, but with erase-remove the rest is much simpler. – green lantern Apr 04 '14 at 16:43
3

It shouldn't make any difference; resize is defined in terms of insert and erase. But it is usually preferable to use the standard idiom, so that it can easily be recognized. And of course, the erase-remove idiom will work with any sequence container, and not just those which support resize. (All of the standard containers do seem to support resize, but it doesn't seem to be a requirement. So it might not be available on user defined containers, even though they support all required operations.)

In terms of performance: the resize must do one additional test, to determine whether it is erasing or inserting, but I can't imagine this making a significant impact.

James Kanze
  • 150,581
  • 18
  • 184
  • 329
  • `resize` on a `std::list` has the additional drawback that it has to find the new end of the list, by iterating O(n), while that exact end iterator returned by `remove` has been thrown away after calculating `X-begin(), which is again O(n). – Arne Mertz Apr 04 '14 at 11:54
  • @ArneMertz: `std::list` is bidirectional (double linked). It's pretty cheap to find the new end: you just pop the current end until the size is right. The new end is the predecessor of the last popped element. – MSalters Apr 04 '14 at 12:52
  • @ArneMertz I was wondering about that. I didn't expect to even find it, because it can't be implemented efficiently. (And of course, you can't do the different of the iterators directly if they aren't random access either.) – James Kanze Apr 04 '14 at 12:53
  • @MSalters You're right, but then why does the standard describe it in terms of `advance` and `erase`? (Does `erase` guarantee the order in which the destructors are called? I don't think so, but if so, your implementation wouldn't call them in the correct order.) – James Kanze Apr 04 '14 at 12:57
  • Your answer and @ikh's are both enlightening - I wish they could be combined, as they make some different points. – Dan Nissenbaum Apr 04 '14 at 13:08
  • @DanNissenbaum And TonyD also made an important point in his comment. Nothing surprising: different people think in different ways, and the most complete answer to any non-trivial question will probably involve answers by several different people. – James Kanze Apr 04 '14 at 13:12
1

I think erase in erase(first,last) guarantees that no elements before first are accessed or modified, while resize only guarantees this when no reallocation happens because of the resize.

Edit: as pointed out by others, such a reallocation will never happen, so there's no difference

Pieter
  • 17,435
  • 8
  • 50
  • 89