67

I am bit confused about the difference between the usage of std::remove algorithm. Specifically I am not able to understand what is being removed when I use this algorithm. I wrote a small test code like this:

std::vector<int> a;
a.push_back(1);
a.push_back(2);

std::remove(a.begin(), a.end(), 1);


int s = a.size();

std::vector<int>::iterator iter = a.begin();
std::vector<int>::iterator endIter = a.end();

std::cout<<"Using iter...\n";
for(; iter != endIter; ++iter)
{
    std::cout<<*iter<<"\n";
}

std::cout<<"Using size...\n";
for(int i = 0; i < a.size(); ++i)
{
    std::cout<<a[i]<<"\n";
}

The output was 2,2 in both the cases.

However, if I use erase with the remove something like this:

a.erase(std::remove(a.begin(), a.end(), 1), a.end());

I get the output as 2.

So my questions are:

(1). Is there any use of std::remove other than using it with erase function.

(2). Even after doing std::remove, why a.size() returns 2 and not 1?

I read the item in Scott Meyer's Effective STL book about the erase-remove idiom. But am still having this confusion.

aJ.
  • 34,624
  • 22
  • 86
  • 128
Naveen
  • 74,600
  • 47
  • 176
  • 233

7 Answers7

67

remove() doesn't actually delete elements from the container -- it only shunts non-deleted elements forwards on top of deleted elements. The key is to realise that remove() is designed to work on not just a container but on any arbitrary forward iterator pair: that means it can't actually delete the elements, because an arbitrary iterator pair doesn't necessarily have the ability to delete elements.

For example, pointers to the beginning and end of a regular C array are forward iterators and as such can be used with remove():

int foo[100];

...

remove(foo, foo + 100, 42);    // Remove all elements equal to 42

Here it's obvious that remove() cannot resize the array!

j_random_hacker
  • 50,331
  • 10
  • 105
  • 169
34

What does std::remove do?

Here's pseudo code of std::remove. Take few seconds to see what it's doing and then read the explanation.

Iter remove(Iter start, Iter end, T val) {
    Iter destination = start;

    //loop through entire list
    while(start != end) { 
        //skip element(s) to be removed
        if (*start == val) { 
            start++; 
         }
         else //retain rest of the elements
             *destination++ = *start++;
     }

     //return the new end of the list
     return destination;
}

Notice that remove simply moved up the elements in the sequence, overwriting the values that you wanted to remove. So the values you wanted to remove are indeed gone, but then what's the problem? Let say you had vector with values {1, 2, 3, 4, 5}. After you call remove for val = 3, the vector now has {1, 2, 4, 5, 5}. That is, 4 and 5 got moved up so that 3 is gone from the vector but the size of vector hasn't changed. Also, the end of the vector now contains additional left over copy of 5.

What does vector::erase do?

std::erase takes start and end of the range you want to get rid off. It does not take the value you want to remove, only start and end of the range. Here's pseudo code for how it works:

erase(Iter first, Iter last)
{
    //copy remaining elements from last
    while (last != end())
        *first++ = *last++;

   //truncate vector
   resize(first - begin());
}

So the erase operation actually changes the size of container and frees up the memory.

The remove-erase idiom

The combination of std::remove and std::erase allows you to remove matching elements from the container so that container would actually get truncated if elements were removed. Here's how to do it:

//first do the remove
auto removed = std::remove(vec.begin(), vec.end(), val);

//now truncate the vector
vec.erase(removed, vec.end());

This is known as the remove-erase idiom. Why is it designed like this? The insight is that the operation of finding elements is more generic and independent of underlying container (only dependent on iterators). However operation of erase depends on how container is storing memory (for example, you might have linked list instead of dynamic array). So STL expects containers to do its own erase while providing generic "remove" operation so all containers don't have to implement that code. In my view, the name is very misleading and std::remove should have been called std::find_move.

Note: Above code is strictly pseudocode. The actual STL implementation is more smarter, for example, using std::move instead of copy.

Phaeton
  • 15
  • 4
Shital Shah
  • 63,284
  • 17
  • 238
  • 185
17

std::remove does not remove the actual objects, rather, pushes them to the end of the container. Actual deletion and deallocation of memory is done via erase. So:

(1). Is there any use of std::remove other than using it with erase function.

Yes, it helps to get a pair of iterators to a new sequence without having worry about proper de-allocation etc.

(2). Even after doing std::remove, why a.size() returns 2 and not 1?

The container still holds to those objects, you only have a new set of iterators to work with. Hence the size is still what it used to be.

dirkgently
  • 108,024
  • 16
  • 131
  • 187
  • 6
    Hmm. Actually std::remove() does not move the deleted elements to the end of the container -- the remaining positions of the container will contain their original values. (It must work this way to preserve O(n) time with forward iterators.) – j_random_hacker Apr 28 '09 at 18:52
  • j_random_hacker, hmm actually it looks like after it did the work, the range after the returned iterator also must not contain any "removed" elements - which is an additional requirement i haven't yet considered - so those elements are not entirely indeterminate. So my answer were wrong and i deleted it – Johannes Schaub - litb Apr 28 '09 at 19:05
  • 1
    it says: "Eliminates all the elements referred to by iterator i in the range [first, last) for which the following corresponding conditions hold: *i == value" which i interpret as that that range does not contain any "removed" value anymore in the end. Interesting, i haven't known that. But i don't think that one can rely on those elements being the same as before. One certainly can't if that range would have contained a "removed" value, at least. So i think cplusplus.com is wrong again imo :D – Johannes Schaub - litb Apr 28 '09 at 19:12
  • 1
    Still reading the draft -- it gets more interesting with erase which *calls* the dtors. IMO, remove is only about swapping pointers, which is how iterators are implemented, more or less. – dirkgently Apr 28 '09 at 19:13
  • curiously enough, GCC seems to not remove any "removed" elements from the range [returned_iterator, last) so i'm not sure how one should interpret the wording? Any idea? – Johannes Schaub - litb Apr 28 '09 at 19:16
  • 3
    in the end, i think it's just poor wording and i shouldn't have deleted my answer. elements after returned_iterator just have some indeterminate value i think. sigh :( hehe – Johannes Schaub - litb Apr 28 '09 at 19:18
  • Note that remove() is not supposed to call the dtors, at least that's not required. This is an efficient implementation. – dirkgently Apr 28 '09 at 19:19
  • indeed, it just moves things around. but i think the value of things after the returned iterators are not required to be identical as they were before. forward iterators are multipass. It could just zero out any elements in [returned_iterator, end), even if it read those elements again before to compare them . As far as i can see, there is no requirement that forbids that. – Johannes Schaub - litb Apr 28 '09 at 19:24
  • I cannot disagree in principle from what I have found so far. Will update if I find anything concrete. – dirkgently Apr 28 '09 at 19:28
  • @litb: Intriguing, I'm certain that that wording is actually a bug in the standard! I believe it should refer to any iterator in the range [first, returned_value) instead of [first, last). On the one hand, it's only practical; but also, if the iterator range contained *only* values that are to be removed, there is *no way* that remove() can meet the guarantee as stated! :) – j_random_hacker Apr 28 '09 at 19:31
  • Also my initial comment was actually wrong -- the remaining elements (I believe) could contain anything at all, including the text "j_random_hacker RULZ!!!1!", suitably encoded. :) (My statement is true for implementations of std::vector that I've come across, but it might be more efficient to move deleted elements to the end for std::list -- sorry ceretullis, maybe there was a grain of truth to what you were saying.) – j_random_hacker Apr 28 '09 at 19:34
  • 1
    @litb: I don't think remove() could just "zero out" the remaining elements, as the expression "*i = 0" (where i is a forward iterator) is not required to be valid. – j_random_hacker Apr 28 '09 at 19:45
  • i'm sorry i mean just setting all the elements to some value, like the last one or so. :) – Johannes Schaub - litb Apr 28 '09 at 20:41
10

To remove element with some condition(equal some value or other condition like less than) in container like vector, it always combine function member function erase and std::remove or std::remove_if.

In vector, the function erase can just delete element by position, like:

iterator erase (iterator position);

iterator erase (iterator first, iterator last);

But if you want to erase elements with some condition, you can combine it with std::remove or std::remove_if.

For example, you want to erase all the elements 6 in the below vector:

std::vector<int> vec{6, 8, 10, 3, 4, 5, 6, 6, 6, 7, 8};
// std::remove move elements and return iterator for vector erase funtion
auto last = std::remove(vec.begin(), vec.end(), 6);
for(int a:vec)
    cout<<a<<" ";
cout<<endl;
// 8 10 3 4 5 7 8 6 6 7 8 

vec.erase(last, vec.end());
for(int a:vec)
    cout<<a<<" ";
cout<<endl;
// 8 10 3 4 5 7 8 

std::remove works as below, it does't erase any elements, it just move elements and returns the iterator.

enter image description here

Possible implementation:

template< class ForwardIt, class T >
ForwardIt remove(ForwardIt first, ForwardIt last, const T& value)
{
    first = std::find(first, last, value);
    if (first != last)
        for(ForwardIt i = first; ++i != last; )
            if (!(*i == value))
                *first++ = std::move(*i);
    return first;
}

Conclusion:

If you want to remove elements with some condition, you use vector::iterator erase (iterator first, iterator last); essentially.

First get range start:

auto last = std::remove(vec.begin(), vec.end(), equal_condition_value);

erase by range(always with end())

vec.erase(last, vec.end());

cited:

https://en.cppreference.com/w/cpp/algorithm/remove

Jayhello
  • 5,931
  • 3
  • 49
  • 56
  • 1
    This is the clearest and most comprehensive answer. Additionally it visually shows the iterators, including the one returned by `remove` *and* shows the relationship between `remove` and `erase`, iterator-wise. I didn't understand how things work until I've read this answer. – mireazma Mar 26 '21 at 17:57
8

i faced the same issue, trying to understand the difference. the explanations that have been give so far are right on the money, but i only understood them after seeing an example;

#include <algorithm>
#include <string>
#include <iostream>
#include <cctype>

int main()
{
    std::string str1 = "Text with some   spaces";
    std::string::iterator it = remove(str1.begin(), str1.end(), 't');
    std::cout << str1 << std::endl;// prints "Tex wih some   spaceses"
    for (str1.begin();it != str1.end(); ++it) 
    {
         std::cout << *it; //prints "es"
    }

}

as you can see, the remove, only moves the lower case 't' to the end of the string, while returning a new iterator to the end of the new string (new string is the old string up to where the removed element are inserted) this is why when you print the iterator that you got from "remove"

   "Text with some   spaces"
       ^   ^removes both 't', then shift all elements forward -1 //what we want to remove
   "Text with some   spaces"
                          ^ end of string                    -2 //original state of string
   "Tex with some   spacess"
                          ^end of string                     -3 //first 't' removed
   "Tex wih some   spaceses"
                          ^end of string                     -4 //second 't' removed
   "Tex wih some   spaceses"
                        ^new iterator that remove() returned -5 // the state of string after "remove" and without "erase"

if you pass the iterator you obtained from step 5 to "erase()" it will know to erase from there to the end of string re-sizing the string in process

Molegrammer
  • 193
  • 1
  • 9
6

Simplest I can come up with:

erase() is something you can do to an element in a container. Given an iterator/index into a container, erase( it ) removes the thing the iterator refers to from the container.

remove() is something you can do to a range, it re-arranges that range but doesn't erase anything from the range.

janisz
  • 6,292
  • 4
  • 37
  • 70
3

remove doesn't "really" remove anything, because it can't.

In order to "actually" remove the elements from container you need to access container APIs. Where as remove works only with iterators irrespective of what containers those iterators points to. Hence, even if remove wants an "actual remove", it can't.

Remove overwrite "removed" elements by the following elements that were not removed and then it is up to the caller to decide to use the returned new logical end instead of the original end.

In your case remove logically removed 1 from vector a but size remained to 2 itself. Erase actually deleted the elements from vector. [ from vector new end to old end ]

The main idea of remove is it cannot change the number of elements and it just remove elements from a range as per criteria.

aJ.
  • 34,624
  • 22
  • 86
  • 128