5

Suppose I've a vector a = {"the", "of"} and a vector b = {"oranges", "the", "of", "apples"}.

I want to compare both vectors and remove the elements from a which are also in b. This is what I came up with:

for (int i = 0; i < a.size(); i++) {
    for (int j =0; j < b.size(); j++) {
       if (a[i] == b[j]) {
          a.erase(a.begin() + i);
       }
    }
}

But this loop is not removing the last element in a. Weird!

muqsitnawaz
  • 236
  • 2
  • 9

5 Answers5

9

The problem is that when you remove the first element of a the index gets incremented from 0 to 1. On the next iteration of the loop the size of the vector is 1 which meets the condition of the outer loop causing it to terminate. You can avoid any trickery that may be necessary to fix this by simply using std::remove_if, std::find, and a lambda.

#include <iostream>
#include <algorithm>
#include <vector>
#include <string>

int main()
{
    std::vector<std::string> a{ "the", "of" };
    std::vector<std::string> b{ "oranges", "the", "of", "apples" };

    auto pred = [&b](const std::string& key) ->bool
    {
        return std::find(b.begin(), b.end(), key) != b.end();
    };

    a.erase(std::remove_if(a.begin(), a.end(), pred), a.end());

    std::cout << a.size() << "\n";
}

A better test would be to switch the contents of a and b. This will remove "the" and "of" leaving you with "oranges" and "apples".

Captain Obvlious
  • 19,754
  • 5
  • 44
  • 74
  • It's somewhat worse than that; unless the inner loop was already on its last iteration, then the outer loop certainly _won't_ immediately "terminate" and suddenly you're accessing elements that may not even exist any more. – Lightness Races in Orbit Nov 30 '14 at 22:20
  • Yup, missed that. didn't even think about the inner loop having UB after if element erased is the last one in the container. I'll get that added after a couple more shots :) – Captain Obvlious Nov 30 '14 at 22:32
5

Try the following

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <cassert>

int main()
{
    std::vector<std::string> a = { "the", "of" };
    std::vector<std::string> b = { "oranges", "the", "of", "apples" };

    for ( auto it = a.begin(); it != a.end(); )
    {
        if ( std::find( b.begin(), b.end(), *it ) != b.end() )
        {
            it = a.erase( it ); 
        }
        else
        {
            ++it;
        }
    }

    assert( a.empty() );
}

Of course it would be better if the vectors would be ordered.

Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335
3

In general, instead of walking the vector's content "manually" and selectively erase its items, I'd suggest using STL's already-built algorithms, combining them properly.

Using the Erase-Remove Idiom

In particular, to erase items satisfying some property from a std::vector, you can consider using the erase-remove idiom.
This Q&A on Stackoverflow discusses some options to erase items from STL containers, including the std::vector case.

You can find commented compilable code below, live here:

#include <algorithm>    // for std::remove_if()
#include <iostream>     // for std::cout, std::endl
#include <string>       // for std::string
#include <vector>       // for std::vector
using namespace std;

void print(const char* name, const vector<string>& v);

int main() 
{
    // Input vectors
    vector<string> a = {"the", "of"};
    vector<string> b = {"oranges", "the", "of", "apples"};

    print("a", a);
    print("b", b);

    // Use the erase-remove idiom
    a.erase(
        remove_if(
            a.begin(), 
            a.end(), 

            // This lambda returns true if current string 's'
            // (from vector 'a') is in vector 'b'. 
            [&b](const string& s) 
            {
                auto it = find(b.begin(), b.end(), s);
                return (it != b.end());
            }
        ), 

        a.end()
    );

    cout << "\nAfter removing:\n";
    print("a", a);
}


void print(const char* name, const vector<string>& v) 
{
    cout << name << " = {";
    bool first = true;
    for (const auto& s : v) 
    {
        if (first) 
        {
            first = false;
            cout << s;
        } 
        else 
        {
            cout << ", " << s;
        }
    }
    cout << "}" << endl;
}

Output:

a = {the, of}
b = {oranges, the, of, apples}

After removing:
a = {}

PS
Note also this very similar question on Stackoverflow.


Using std::set_difference()

An alternative approach can be to use std::set_difference(), e.g. something like the following code, live here.
(Note that in this case, as per set_difference() prerequisite, the input vectors must be already sorted.)

#include <algorithm>    // for std::set_difference(), std::sort()
#include <iostream>     // for std::cout, std::endl
#include <iterator>     // for std::inserter
#include <string>       // for std::string
#include <vector>       // for std::vector
using namespace std;

void print(const char* name, const vector<string>& v);

int main() 
{
    // Input vectors
    vector<string> a = {"the", "of"};
    vector<string> b = {"oranges", "the", "of", "apples"};

    print("a", a);
    print("b", b);

    // Sort the vectors before calling std::set_difference().
    sort(a.begin(), a.end());
    sort(b.begin(), b.end());

    // Resulting difference vector
    vector<string> c;
    set_difference(a.begin(), a.end(),
                   b.begin(), b.end(),
                   inserter(c, c.begin()));

    print("difference(a,b)", c);
}


void print(const char* name, const vector<string>& v) 
{
    cout << name << " = {";
    bool first = true;
    for (const auto& s : v) 
    {
        if (first) 
        {
            first = false;
            cout << s;
        } 
        else 
        {
            cout << ", " << s;
        }
    }
    cout << "}" << endl;
}
Community
  • 1
  • 1
Mr.C64
  • 41,637
  • 14
  • 86
  • 162
2

The issue you're having is due to the fact that you're removing elements from a as you're iterating over it, but not compensating for that. This is a common problem when trying to write a loop with erases in it.

If it doesn't matter what order the contents of your vectors are in, and you're okay with storing the result in another vector, one of the best approaches is to sort both vectors and call std::set_difference.

#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> a = { "the", "of" };
    std::vector<std::string> b = { "oranges", "the", "of", "apples" };
    std::vector<std::string> res;

    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
        std::back_inserter(res));
}

res will contain all elements of a that weren't in b, which will be empty in this case.

If the order matters, or if it must be done in place, you can use the erase-remove idiom. It's worth nothing that this is likely to be slower for larger vectors, as it is inevitably an O(n^2) algorithm.

#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

struct Pred
{
    const std::vector<std::string>& filter;
    Pred(const std::vector<std::string>& x)
        :filter(x){}

    bool operator()(const std::string& str) const
    {
        return std::find(filter.begin(), filter.end(), str) != filter.end();
    }
};

int main()
{
    std::vector<std::string> a = { "the", "of" };
    std::vector<std::string> b = { "oranges", "the", "of", "apples" };

    Pred pred(b);

    a.erase(std::remove_if(a.begin(), a.end(), pred), a.end());
}

If you don't happen to have access to a C++11 conformant compiler, the Pred structure should be a fairly good stand-in for a lambda. Otherwise, this lambda will do the job:

auto pred = [&b](const std::string& str)
    {
        return std::find(b.begin(), b.end(), str) != b.end();
    };
Jared Mulconry
  • 479
  • 3
  • 8
0

this is the correct syntax of erasing things form vector:

myvector.erase (myvector.begin()+5);

Secondly, after you erased it, your index of this vector would be invalid.

So I recommend you do a two-round scan. First round ,you mark the elements you want to remove. IN second round, you can erase them.

BTW your algorithm is O(n^2) time complexity. If you can, I recommend you sort your vector firstly. Then you can use much faster algorithm to deal with it.

BufBills
  • 8,005
  • 12
  • 48
  • 90
  • Well, yeah. That's what I intended to write here. My bad. But it just isn't working. – muqsitnawaz Nov 30 '14 at 21:06
  • 1
    It's true that `erase` invalidates your iterator, but `erase` also returns a new, valid iterator to the element that replaced the one you erased. The answer from Vlad from Moscow shows how this works. – David K Nov 30 '14 at 22:12