4

Is there a way to delete duplicate elements from vector container containing string elements while maintaining order.

Till now I have used set method, but it doesn't retain the order.

I don't know how to use remove_if with respect to this problem.

Quoros
  • 184
  • 1
  • 10
  • 1
    If the container has an order (That is, its elements are sortered) duplicates are contiguous. So, where is the problem? If you remove a duplicate, the order is not modified. – Manu343726 Aug 03 '13 at 14:19
  • 5
    @Manu343726: "has an order" does not mean "is sorted". – Benjamin Lindley Aug 03 '13 at 14:21
  • Do you want to only remove consecutive repeated values (like the Unix command `uniq`), or also later reprtitions? That is, if your original vector looks like `{ "apples", "apples", "oranges", "apples", "grapes" }`, should the result be `{ "apples", "oranges", "apples", "grapes" }` or `{ "apples", "oranges", "grapes" }`? – celtschk Aug 03 '13 at 19:57

5 Answers5

6

How about using a temporary container:

std::vector< int >::iterator i , j ;
std::set< int > t_set;
for( i = v.begin() , j = v.begin() ; i != v.end() ; ++i )
    if( t_set.insert( *i ).second) 
        *j++ = *i ;
v.erase( j , v.end() ); 

Using std::remove_if, I can think of this:

std::set<int> t_set;
std::vector<int> res; //Resultant vector

remove_copy_if(v.begin(), v.end(), std::back_inserter(res), 
    [&t_set](int x){ 
        return !t_set.insert(x).second; 
    } );
P0W
  • 46,614
  • 9
  • 72
  • 119
  • 1
    +1, I've posted almost identical solution it seams, but much later than you did. This is fastest (O(n log n)) of doing this. – Leonid Volnitsky Aug 03 '13 at 16:30
  • If the objects are expensive to copy (the question speaks about strings; it doesn't say how long those are), you could also store the iterators into the original vector in the set, using a custom comparison functor which dereferences the iterators. Also, you might use `swap` instead of assignment if those objects have cheap `swap`. In C++11, move assignment would be the obvious choice. – celtschk Aug 03 '13 at 20:09
  • It is giving this error while using std::remove_if - expected primary-expression before '[' token – Quoros Aug 04 '13 at 03:49
  • Temporary container solution worked expeditiously. – Quoros Aug 04 '13 at 04:00
  • @MayankLal Did you use `-std=c++0x` option for compilation of `std::remove_if` version ? – P0W Aug 04 '13 at 05:01
  • @LeonidVolnitsky: Not quite. It's O(n log n), but it's not really fast. See my comment on the question, it has a link to an answer I wrote that's faster. – user541686 Aug 04 '13 at 06:25
  • @Mehrdad: You answer to similar question is identical to first POW's version (and mine too), as far as I can see. Can you tell me what difference is? – Leonid Volnitsky Aug 04 '13 at 10:48
  • @LeonidVolnitsky: Huh? It's completely different. Your answers use `std::set`, mine uses `std::sort`. – user541686 Aug 04 '13 at 17:51
  • @Mehrdad: Ops, sorry, was looking in the wrong place. I like sorted arrays too. But, I can see two cases where your code will be slower. First if, most of elements are duplicates and will be removed. And second case, when elements are very big. – Leonid Volnitsky Aug 04 '13 at 18:57
  • @LeonidVolnitsky: Yes, it will be asymptotically slower if most elements are duplicates, but I'm not sure if it will be empirically slower. But I don't see why the element size matters at all there, could you explain? – user541686 Aug 04 '13 at 18:59
  • @Mehrdad: Look at these [benchmarks](http://www.baptiste-wicht.com/2012/12/cpp-benchmark-vector-list-deque/). In these test we can take `std::list` as aproximation for `std::map`. Test "random_insert - 4096 byte" will explain what I am talking about. – Leonid Volnitsky Aug 04 '13 at 19:10
  • @LeonidVolnitsky: Something tells me you're taking *guesses* at what I've written instead of actually *reading* it. My code **never** copies the elements into a `vector` (or any other container), so the size of each element is totally irrelevant. Did you actually *read* my code? – user541686 Aug 04 '13 at 19:48
  • @P0W Any way around using -std=c++0x, I am unable to find that compilation option in codeblocks. – Quoros Aug 05 '13 at 07:45
  • @MayankLal that's compiler flag, sorry I'm familiar with CodeBlocks, check the Properties/Options for compiler in the Idle. – P0W Aug 05 '13 at 08:01
2

You could create an empty array, then iterate over the original vector and only copy over the first instance of each item in the vector. You could keep track of whether or not you've seen an item yet in the vector by adding it to a set and checking for an items presence in the set before adding it to the new array.

Madison May
  • 2,723
  • 3
  • 22
  • 32
1

You could do this:

 std::vector<int> v = { 1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 8 };
 // 1 2 2 3 4 5 6 7 8 9 8

 for(size_t i=0;i<v.size();i++)
 {
     for(size_t j=0;j<v.size();j++)
     {
         if(v[i] == v[j] && i != j)
         {
              v.erase(v.begin()+j);
              j--; // Fix for certain datasets ie: 
         }         //                             1 2 1 1
     }   
 }

 // Produces:
 // 1 2 3 4 5 6 7 8 9
gifnoc-gkp
  • 1,506
  • 1
  • 8
  • 17
1

A simple solution

std::vector<int>::iterator it;
it = std::unique (myvector.begin(), myvector.end());

This iterator will point to the element next to the last element. You may not use this iterator if not required.

See THIS for further reference

EDIT:

As I thought that the vector would be sorted, the new solution could be:

    vector<int> vec= {5,1,2,3,5,4,2,1,1,4,3,2,4,5,2,1,3,5,2,3,5,2,3,2,3,5,2,1,3};
    set<int> s; 
    vector<int>::iterator vecIter=vec.begin();
    vector<int>::iterator vecIterCopy;
    for(;vecIter!=vec.end(); vecIter++) 
    {
        if(s.find(*vecIter)==s.end()) 
        {
            s.insert(*vecIter);
            *vecIterCopy++ = *vecIter;
        }
    }
Saksham
  • 9,037
  • 7
  • 45
  • 73
1

O(n*log(n)) solution:

vector<string> V={"aa","bb","aa","cc","cc"};
set<string> S; 

auto i=V.begin();
auto j=i;

for(; i!=V.end(); ++i) {
    if(S.insert(*i).second  &&  i!=j++) 
        *j = std::move(*i);
}

V.erase(j,V.end());

Also modified POW's version with std::remove_copy_if. But here without extraneous temporary:

set<string> S;
V.erase(
    copy_if(
        make_move_iterator(V.begin()),
        make_move_iterator(V.end()),
        V.begin(),
        [&](const string& x){ return S.insert(x).second;}
    ),
    V.end()
);
Leonid Volnitsky
  • 8,854
  • 5
  • 38
  • 53