11

Possible Duplicate:
remove_if equivalent for std::map

I have a set of strings:

set <wstring> strings;
// ...

I wish to remove strings according to a predicate, e.g.:

std::remove_if ( strings.begin(), strings.end(), []( const wstring &s ) -> bool { return s == L"matching"; });

When I attempt this, I get the following compiler error:

c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\algorithm(1840): error C2678: binary '=' : no operator found which takes a left-hand operand of type 'const std::basic_string<_Elem,_Traits,_Ax>' 

The error appears to suggest that std::string doesn't have a by-value copy constructor ( which would be illegal). Is it somehow bad to use std::remove_if with std::set ? Should I be doing something else instead such as several iterations of set::find() followed by set::erase() ?

Community
  • 1
  • 1
Benj
  • 31,668
  • 17
  • 78
  • 127
  • Your simple sample can be replaced by `strings.erase(L"matching");`. One presumes that your *actual* predicate isn't as trivial. – Robᵩ Jun 21 '12 at 16:05
  • @Robᵩ Yes that was just an example, I do require a function object. – Benj Jun 21 '12 at 16:11

2 Answers2

21

std::remove_if (or std::erase) works by reassigning the values of the members of the range. It doesn't understand how std::set organizes data, or how to remove a node from its internal tree data structure. Indeed, it's impossible to do so using only references to nodes, without having the set object itself.

The standard algorithms are designed to have transparent (or at least consistently easy-to-remember) computational complexities. A function to selectively remove elements from a set would be O(N log N), due to the need to rebalance the tree, which is no better than a loop calling my_set.remove() . So, the standard doesn't provide it, and that is what you need to write.

On the other hand, a naively hand-coded loop to remove items from a vector one-by-one would be O(N^2), whereas std::remove_if is O(N). So the library does provide a tangible benefit in that case.

A typical loop (C++03 style):

for ( set_t::iterator i = my_set.begin(); i != my_set.end(); ) {
    if ( condition ) {
        my_set.erase( i ++ ); // strict C++03
        // i = my_set.erase( i ); // more modern, typically accepted as C++03
    } else {
        ++ i; // do not include ++ i inside for ( )
    }
}

Edit (4 years later!): i ++ looks suspicious there. What if erase invalidates i before the post-increment operator can update it? This is fine, though, because it's an overloaded operator++ rather than the built-in operator. The function safely updates i in-place and then returns a copy of its original value.

Potatoswatter
  • 134,909
  • 25
  • 265
  • 421
  • I wonder why they didn't generalise the notion of an "assignable iterator" (for want of a better term), given that there are serveral containers that exhibit this behaviour. – Rook Jun 21 '12 at 15:49
  • @Rook There is such a notion, but it's not a fix. The problem is that the `std::remove_if` is specified as being O(N), and an implementation that is not O(N) cannot be called `std::remove_if` by the letter of the law. You can provide your own implementation in your own namespace. The overload will conflict with the one in `std`, though. You're better off just writing a loop. – Potatoswatter Jun 21 '12 at 16:00
  • 1
    @Rook: In this case the problem is not the iterator, but the `value_type` of the container. `std::remove_if` *modifies* the values through dereferencing the iterator, but the `value_type` in a `std::set` is a constant object. – David Rodríguez - dribeas Jun 21 '12 at 16:34
  • `remove_if` should be called `shuffle_to_the_front_unless`: then it would be easy to see that it can't possibly be used on a set, and that it doesn't make structural changes to any container it's used over. Which, as you say, iterators never do. – Steve Jessop Jun 21 '12 at 16:35
  • Ahh, I carelessly assumed that the iterator was returning a constant reference. Having the value type be constant is kinda obvious in retrospect. – Rook Jun 21 '12 at 17:09
  • You can also do `i = my_set.erase(i);` like the [example here](http://en.cppreference.com/w/cpp/container/set/erase). – Timmmm Jan 29 '18 at 09:44
  • The code is fine, but the reason is wrong. The real reason is that in the expression `f(a)` the evaluation of `a` is sequenced before the invocation of the function. This has nothing to do with operator overloading. – L. F. Jul 08 '19 at 11:48
  • @L.F. The sequence point of `f(a)` is introduced by overloading and the built-in operation lacks a counterpart. – Potatoswatter Jul 08 '19 at 16:49
  • @Potatoswatter [\[expr.call\]/8]: All side effects of argument evaluations are sequenced before the function is entered (see [intro.execution]). (http://eel.is/c++draft/expr.call#8.note-1) If an operator function is invoked using operator notation, argument evaluation is sequenced as specified for the built-in operator; see [over.match.oper]. (http://eel.is/c++draft/expr.call#8.note-2) Am I missing something? – L. F. Jul 08 '19 at 23:39
  • @L.F. Again that is only talking about functions, not the absence of any function. It doesn’t say that built-in operators get sequenced like functions. Only the other way around. Also, note that this answer is 10 years old. Things might have changed and they might change still. – Potatoswatter Jul 09 '19 at 05:21
10

The error message says

no operator found which takes a left-hand operand of type 'const std::basic_string<_Elem,_Traits,_Ax>'

Note the const. The compiler is correct that std::wstring doesn't have an operator= which can be called on a const object.

Why is the string const? The answer is that the values in a std::set are immutable, because values in a set are ordered, and changing a value could change its ordering in the set, invalidating the set.

Why is the compiler trying to copy a value of the set?

std::remove_if (and std::remove) don't actually erase anything (nor can they, because they don't have the container, only iterators). What they do is to copy all values in the range which don't match the criterion to the beginning of the range, and return an iterator to the next element after the matching elements. You are then supposed to manually erase from the returned iterator to the end of the range. Since a set keeps its elements in order, it would be wrong to move any elements around, so remove_if cannot be used on a set (or any other associative container).

In short, you do have to use a loop of std::find_if and set::erase, like so:

template<class V, class P>
void erase_if(std::set<V>& s, P p)
{
  std::set<V>::iterator e = s.begin();
  for (;;)
  {
    e = std::find_if(e, s.end(), p);
    if (e == s.end())
      break;
    e = s.erase(e);
  }
}
ymett
  • 2,425
  • 14
  • 22
  • It would actually be possible for a library to provide `std::remove_if` compatible with `std::set`, by using the special knowledge that the root node actually lives inside the container object. However, it would not meet the O(N) runtime requirement. – Potatoswatter Jun 21 '12 at 16:04
  • Oops, the `end()` node, not the root, but you get the idea. – Potatoswatter Jun 21 '12 at 16:10
  • 1
    +1, you spelled the reasoning out very well and provided a nice alternative, but I would probably name that function `erase_if`. – Benjamin Lindley Jun 21 '12 at 16:15
  • @Benjamin Lindley Good point, changed. – ymett Jun 25 '12 at 12:42
  • @Potatoswatter It would also be inconsistent - its behaviour would be very different to `remove_if` on other containers. – ymett Jun 25 '12 at 12:46
  • @ymett No, the only difference would be the runtime complexity. `remove_if` is a stable algo so the elements wouldn't be reordered. It would also not assign values of sequence elements, but instead change the sequence itself, but that aspect isn't specified at all by the standard. – Potatoswatter Jun 26 '12 at 05:20