5

Does STL already contain any simple method or algorithm for storing the difference between to sets set1 and set2 directly in set1, without the need for a temporary set variable?

The sample code below shows some alternatives that I already have tried (which did not work) and the solution with a temporary set tmp (which I want to avoid):

int _tmain(int argc, _TCHAR* argv[])
{
    std::set<int> set1, set2;

    set1.insert(1); set1.insert(2); set1.insert(3); set1.insert(4); set1.insert(5);
    set2.insert(4); set2.insert(6);

    // NONE OF THE FOLLOWING ALTERNATIVES DID WORK:
    // a: // set1.erase(set2.end(), set2.begin());
    // b: // std::set_difference(set1.begin(), set1.end(), 
    //                set2.begin(), set2.end(), set1.begin());
    // c: // std::remove_if(set1.begin(), set1.end(), 
    //                [set2](int i){return set2.find(i) != set2.end();} );

    // Complicated version, for which I am trying to find something simpler:
    std::set<int> tmp;
    std::set_difference(set1.begin(), set1.end(), set2.begin(), set2.end(), std::inserter(tmp, tmp.end()));
    set1.clear();
    std::copy(tmp.begin(), tmp.end(), std::inserter(set1, set1.end()));

    // Print result: // Expect 1 2 3 5
    std::cout << "set1: ";
    for (auto it=set1.begin(); it != set1.end(); it++)
    {
        std::cout << *it << " ";
    }
    std::cout << std::endl;

    return 0;
}

I am looking for a solution that does not require C++11 (except the few C++11 constructs allowed in Visual Studio 2010).

  • @AbhishekBansal is was just writing that now – user3125280 Dec 25 '13 at 12:32
  • 1
    You don't want C++11, but the code does use C++11, and the things you tried also use C++11... –  Dec 25 '13 at 12:34
  • @Abhishek Bansal: remove_if does not work with sets. –  Dec 25 '13 at 12:40
  • now i'm just confused, what are we trying to get? symmetric difference or the other one? – user3125280 Dec 25 '13 at 12:44
  • @fmunkert Sorry did not know that. Deleted. – Abhishek Bansal Dec 25 '13 at 12:46
  • @fmunkert wait wait wait. why can't remove_if be used? because of its iterator type being const in c++11? – user3125280 Dec 25 '13 at 12:48
  • instead of `std::copy` you may use `std::swap(set1, tmp);` – Jarod42 Dec 25 '13 at 12:49
  • @Jarod42 set difference is the offending copier – user3125280 Dec 25 '13 at 12:49
  • Note that you can replace the `set1.clear(); std::copy(tmp.begin(), tmp.end(), std::inserter(set1, set1.end()));` lines with `set1=tmp;` – Sergey Kalinichenko Dec 25 '13 at 12:57
  • @user3125280: Quoted from http://stackoverflow.com/questions/2088495/how-to-remove-all-even-integers-from-setint-in-c: remove_if requires that operator* returns an non-const lvalue. std::set enforces that it's always ordered; returning a non-const lvalue from std::set::operator* would break that guarantee. Therefore std::remove_if() does not take std::set::iterators –  Dec 25 '13 at 13:13
  • @fmunkert exactly the same thing - a constant bidirectional iterator like set::iterator (in c++11) is an iterator that can only be dereferenced to a constant. They just changed the way this is enforced in c++11 – user3125280 Dec 25 '13 at 13:19
  • actually the set::erase signature has also been changed to return the next iterator, to make the workaround in that answer irrelevant (you would use it = erase(it) instead of said answer's solution) – user3125280 Dec 25 '13 at 13:22

3 Answers3

2

This seems easy enough to write manually:

std::set<int>::iterator iter1 = set1.begin();
std::set<int>::iterator end1 = set1.end();

std::set<int>::const_iterator iter2 = set2.begin();
std::set<int>::const_iterator end2 = set2.end();

while (iter1 != end1 && iter2 != end2) {
  if (*iter1 < *iter2) {
    ++iter1;
  } else if (*iter2 < *iter1) {
    ++iter2;
  } else {
    set1.erase(iter1++);
    ++iter2;
  }
}

You can put this in a generic reusable function.

  • with `auto` and `std::begin` and `std::end` it will probably look even better – user2485710 Dec 25 '13 at 12:50
  • 1
    @user2485710 Agreed, but those are C++11 features, and I don't know which C++11 features work (and to what extent they work) in VS2010. –  Dec 25 '13 at 12:51
  • I would says that `set1.erase(iter1++)` is UB... as `erase` may be call before incrementation. – Jarod42 Dec 25 '13 at 12:59
  • @Jarod42 There's a sequence point before a function is called, all side effects must have taken place at that point. –  Dec 25 '13 at 13:01
  • there are some possible problems i think - but you don't have to worry because c++11 to the rescue - just use `iter1 = set1.erase(iter1)` - theoretically handles this situation without incrementing dubiously (erase might invalidate stuff too) – user3125280 Dec 25 '13 at 13:28
  • @hvd: thanks, just found http://stackoverflow.com/questions/19546124/why-isnt-myset-eraseit-undefined-behavior-or-is-it – Jarod42 Dec 25 '13 at 13:30
  • @Jarod42 this has since changed though – user3125280 Dec 25 '13 at 13:32
  • @user3125280: you may use the return value, but provided answer is correct (but I prefer to be consistency and using the C++11 one, that the OP can't). – Jarod42 Dec 25 '13 at 13:36
  • @user3125280 For general containers, you're right, `erase` may invalidate other iterators too. `set`, however, has a stronger guarantee. `set.erase` is guaranteed to only invalidate iterators to the erased element, and `iter1` will already no longer point to the erased element. This is still guaranteed in C++11, see [associative.reqmts]p9. –  Dec 25 '13 at 13:38
1

You can avoid a temporary with a simple functor class that does the removing, like this:

class remover {
    std::set<int> &s;
public:
    remover(std::set<int>& theSet) : s(theSet) {}
    void operator()(int val) { s.erase(val); }
};

With this class in hand, you can code the removal as follows:

for_each(set2.begin(), set2.end(), remover(set1));

The idea is similar to your solution #c based on lambdas, from the commented out list of things that did not work.

Demo on ideone.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • it can't be done in a single function call, and it can't use lambdas, so a functor is the best approach, imo – user3125280 Dec 25 '13 at 12:54
  • 1
    This requires multiple searches in `set1`, because you're using `erase` with a value instead of an iterator. I'm aware that performance likely won't be critical, but considering how easy it is to write it in a way that avoids that problem, I wouldn't write it like this. –  Dec 25 '13 at 12:57
  • @hvd this has the lowest time complexity of m + k log n, where n is size of s1, m is size of s2, k is the size of their intersection (it's multiple searches in one or the other) – user3125280 Dec 25 '13 at 13:00
  • @hvd I agree, there's some performance implications in this. However, OP has an efficient, working solution already (I'd make `tmp` a `vector` to avoid the `log(N)` on insertions to optimize, though) so I viewed this as an exercise in minimizing lines of code. Of course I could be wrong on this, and OP might have a different idea of what he would like to minimize. – Sergey Kalinichenko Dec 25 '13 at 13:03
  • @dasblinkenlight what performance implications? it avoids the copies necessitated by set::difference, and is algorithmically fast since lookup by val is log n (tree traversal though potentially constant if unordered_set were used (c++11)) also you can't delete via a list of iterators since the may invalidate other iterators (hence their const'ness and remove_if's non-const requirement) – user3125280 Dec 25 '13 at 13:10
  • @user3125280 See hvd's comment. – Sergey Kalinichenko Dec 25 '13 at 13:18
  • @user3125280 See my answer, there's no need for multiple searches in either, you can use a single loop that tackles iteration over both sets (meaning linear complexity). I do agree with dasblinkenlight that this answer does have its merits, it's valid, it's clear, it's not too slow, and if set2 is usually significantly smaller than set1, it may be very fast. –  Dec 25 '13 at 13:18
  • @hvc i forgot the defining property of sets. they are ordered. whoops – user3125280 Dec 25 '13 at 13:25
  • Instead of `remover(set1)` you also can use `[&set1](int i){set1.erase(i); }`, because lambdas are supported in Visual Studio 2010. This makes the anwser a one-liner using only STL methods and algorithms, therefore exactly answering my question. Performance probably is not optimal, however. –  Dec 25 '13 at 16:48
0

I'm looking for a better solution, but as a start you can write this with a little modification and remove that unnecessary use of std::copy and clear

#include <iostream>
#include <set>
#include <algorithm>

int main(int argc, char *argv[]) {
  std::set<int> set1, set2;

  set1.insert(1);
  set1.insert(2);
  set1.insert(3);
  set1.insert(4);
  set1.insert(5);

  set2.insert(4);
  set2.insert(6);
  {
    std::set<int> tmp;
    std::set_difference(set1.begin(), set1.end(), set2.begin(), set2.end(),
                        std::inserter(tmp, tmp.end()));

    set1 = std::set<int>(tmp.begin(), tmp.end());
  }
  std::cout << "set1: ";
  for (auto it = set1.begin(); it != set1.end(); it++) {
    std::cout << *it << " ";
  }
  std::cout << std::endl;

  return 0;
}

the scope is also equal to "RAII for free" in C++ ...

user2485710
  • 9,451
  • 13
  • 58
  • 102