112

I'm heavily using std::set<int> and often I simply need to check if such a set contains a number or not.

I'd find it natural to write:

if (myset.contains(number))
   ...

But because of the lack of a contains member, I need to write the cumbersome:

if (myset.find(number) != myset.end())
  ..

or the not as obvious:

if (myset.count(element) > 0) 
  ..

Is there a reason for this design decision ?

Jabberwocky
  • 48,281
  • 17
  • 65
  • 115
  • 7
    Most of the standard library works with iterators so normally functions returning iterators is what you would expect. Not to hard though to write a function to abstract that away. Most likely the compiler will inline it since it should only be a line or 2 of code and you will get the same performance. – NathanOliver Mar 01 '17 at 13:07
  • 1
    See proposal [Checking for Existence of an Element in Associative Containers](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0458r0.html) not sure what the status is though. – Shafik Yaghmour Mar 02 '17 at 06:55
  • 4
    Another (more fundamental) problem with the `count()` approach is that it does more work than a `countains()` would have to do. – Leo Heinsaar Mar 02 '17 at 09:44
  • 11
    The **fundamental reason** behind that design decision is that `contains()` which returns a `bool` would **lose valuable information about where the element is in the collection**. `find()` preserves and returns that information in the form of an iterator, therefore is a better choice for a generic library like STL. (That's not to say that a `bool contains()` isn't a very nice-to-have or even necessary, though.) – Leo Heinsaar Mar 02 '17 at 09:56
  • 3
    It's easy to write a `contains(set, element)` free function using the public interface of the set. Therefore, the set's interface is functionally complete; adding a convenience method just increases the interface without enabling any additional function, which isn't the C++ way. – Toby Speight Mar 02 '17 at 10:55
  • 1
    @TobySpeight I wouldn't say about "not the C++ way": e.g. `std::string` has `push_back(CharT)`, which is basically the same as `operator+=(CharT)`; it has `length` in addition to `size`, which also increases the interface without enabling any additional function, etc.. – Ruslan Mar 02 '17 at 18:39
  • 3
    Are we closing everything these days? How is this question "Primarily opinion based" in any way? – Mr. Alien Mar 03 '17 at 04:16
  • @Mr.Alien How isn't it? I can think of half a dozen of such nice-to-have functions, like `set::has_more_than(n)` or `multiset::erase_once(x)`. Is there any reason we don't have these that doesn't boil down to an opinion? – Dmitry Grigoryev Mar 03 '17 at 09:42
  • @Mr.Alien How is it not? I mean the accepted answer starts with "I think it was probably because" and follows up with a personal theory (not to mention a perfectly reasonable "they may have" alternative in a comment), which is a text book example of opinion, and that's the best you can get here. The rest are similar. Now if you can cite some reference to actual design rationale from a WG21 committee meeting, sure, but none of the answers here, possibly aside from Leo's, are anything more than reasonable but arbitrary guesses. – Jason C Mar 03 '17 at 15:11
  • 2
    Guys, really with the reopens. Of the 7 remaining answers below, 6 are opinions. This question is a case study in what happens when you don't close POB fast enough. The danger here is readers thinking the top answer is correct. At the bare minimum the OP should accept the [only correct answer here](http://stackoverflow.com/a/42551739/616460). Part of me just wants to delete this to protect the world from misinformation. – Jason C Mar 04 '17 at 18:58
  • 2
    I'm a little surprised no one has mentioned this yet (that I can see), but it appears this question will be OBE by C++20: https://en.cppreference.com/w/cpp/container/set/contains – user3288829 Sep 10 '18 at 20:03
  • 1
    `set::contains()` was added to C++20 so the question is no longer relevant. It's amusing in retrospect to look at all the responses about how omitting `contains` is the proper decision and the C++ way. – Ken Shirriff Aug 14 '23 at 17:32

11 Answers11

159

I think it was probably because they were trying to make std::set and std::multiset as similar as possible. (And obviously count has a perfectly sensible meaning for std::multiset.)

Personally I think this was a mistake.

It doesn't look quite so bad if you pretend that count is just a misspelling of contains and write the test as:

if (myset.count(element)) 
   ...

It's still a shame though.

  • That sounds reasonable. – Jabberwocky Mar 01 '17 at 13:17
  • 6
    Incidentally, it's exactly the same with maps and multimaps (which is just as ugly, but less ugly than all these comparisons with `.end()`). – Matteo Italia Mar 01 '17 at 13:17
  • Not as similar as possible, but they follow the same concept. – Slava Mar 01 '17 at 15:19
  • @Slava - where do they differ where they could have been the same? If it was "same concept", they would have used `contains` vs `count`. – Martin Bonner supports Monica Mar 01 '17 at 16:48
  • @MartinBonner "they would have used contains vs count". No they would not. That concept also includes multimap and multiset and `contains` does not make sense for them, but `count` does for all of them. – Slava Mar 01 '17 at 18:29
  • This seems a reasonable interpretation of what the standard count. – Yakk - Adam Nevraumont Mar 01 '17 at 18:47
  • I would like to have something as a feature - to make something like a class-scope alias for a class member. Having the possibility to write #define func_a func_b - which defines a name in the global scope, why not be able to defile per-class aliases - something like that: alias std::set::contains = std::set::count. This will be also useful for renaming std::pair members: alias std::pair::name1 = std::pair::first – Nuclear Mar 01 '17 at 18:57
  • 8
    Alternatively, they may have seen no need for an additional member `contains()`, on the grounds that it would be redundant because for any `std::set s` and `T t`, the result of `s.contains(t)` is exactly identical to the result of `static_cast(s.count(t))`. As using the value in a conditional expression would implicitly cast it to `bool`, they may have felt that `count()` served the purpose well enough. – Justin Time - Reinstate Monica Mar 01 '17 at 20:57
  • 2
    That said, there are situations when having a function that returns the result as a `bool` would be useful, and the name `contains` is slightly clearer than the name `count`, which would allow code to be parsed faster by people that aren't familiar with `set`. – Justin Time - Reinstate Monica Mar 01 '17 at 21:00
  • @JustinTime People who are not familiar with the C++ standard library should not be programming in C++. – JAB Mar 01 '17 at 22:35
  • 3
    Misspelling? `if (myset.ICanHaz(element)) ...` :D – Stéphane Gourichon Mar 02 '17 at 13:35
  • `.count` might not be the very best option for `std::multiset`, because it can degrade to linear complexity if all elements of multiset are equal. – Ixanezis Mar 03 '17 at 10:05
  • 1
    Do you have any sources for the claim that the intent was to keep `set` and `multiset` similar? This answer, while highly voted and accepted, seems to be more of a personal theory, and doesn't agree with [the actual reason given from an actual proposal group discussion on this topic](http://stackoverflow.com/a/42551739/616460). It is arguably completely incorrect. – Jason C Mar 03 '17 at 15:19
  • 1
    The only source I have is the fact that both `set` and `multiset` (and their map counterparts) model *AssociateContainer* which doesn't have a `contains` (but obviously does have a `count`). This also seems to fit with the design philosophy of the whole STL. I observe that the google groups discussion dates from 2014 - I am referring to what happened 25 years earlier (I haven't got to the end of the thread though). – Martin Bonner supports Monica Mar 04 '17 at 09:21
  • 2
    Right. I have read through the entire thread now, and a) it is from late 2014 and early 2015 (not the original standardization effort); b) it seems to entirely concentrate on `contains` as a synonym for `find != end` rather than `count != 0`, and the idea that if you provide `contains` people will do `if (cont.contains(val)) { use( cont[val]); }` which involves a whole *two* look-ups, so is less efficient. That latter argument doesn't apply to sets, and actually I don't care - I'll happily trade a factor of two in performance for more readable code. – Martin Bonner supports Monica Mar 04 '17 at 09:51
  • 3
    @MartinBonner It doesn't really matter if the reasons for leaving it out were dumb. It also doesn't really matter if the conversation wasn't the 100% final rationale. Your answer here is just a reasonable guess on how you think it *should* be. The conversation, and answer from somebody not only involved in it, but tasked with proposing it (even though they didn't) is unarguably closer to the truth than this guess, no matter how you look at it. At the bare minimum you should at *least* mention it in this answer, that would be a great improvement and would be the responsible thing to do. – Jason C Mar 04 '17 at 19:02
  • 2
    @JasonC: Could you go ahead and edit a section in at the bottom please? I don't really understand the point you are trying to make, and comments probably aren't the best way to clarify that. Thanks! – Martin Bonner supports Monica Mar 04 '17 at 19:27
  • 2
    @JustinTime Although conceptually casting `count` to boolean is equivalent to `contains`, `count` can't return as soon as it finds the first match, as `contains` could, it has to keep searching the entire collection. I suspect the designers expect programmers who care about this performance issue to use `.find() != .end()` rather than `.count()`; it's too bad `.end()` isn't falsey. – Barmar Mar 07 '17 at 19:06
  • @Barmar As this is all a discussion about std::set, that's not right. std::set::count isn't going to iterate through all members are count the matches - it will do an O(1) lookup and return 0 or 1. – Martin Bonner supports Monica Mar 30 '23 at 08:29
  • @MartinBonnersupportsMonica Good point. I may have been thinking that it was just using a general inherited method, but obviously it would override it with a more efficient set-specific implementation. I can't remember what was in my mind when I wrote that 6 years ago. – Barmar Mar 30 '23 at 14:39
52

To be able to write if (s.contains()), contains() has to return a bool (or a type convertible to bool, which is another story), like binary_search does.

The fundamental reason behind the design decision not to do it this way is that contains() which returns a bool would lose valuable information about where the element is in the collection. find() preserves and returns that information in the form of an iterator, therefore is a better choice for a generic library like STL. This has always been the guiding principle for Alex Stepanov, as he has often explained (for example, here).

As to the count() approach in general, although it's often an okay workaround, the problem with it is that it does more work than a contains() would have to do.

That is not to say that a bool contains() isn't a very nice-to-have or even necessary. A while ago we had a long discussion about this very same issue in the ISO C++ Standard - Future Proposals group.

Leo Heinsaar
  • 3,887
  • 3
  • 15
  • 35
  • 10
    And it is interesting to note that that discussion ended with near consensus on it being desirable and **you** being asked to write a proposal for it. – PJTraill Mar 02 '17 at 20:54
  • @PJTraill True, and the reason I didn't go forward is that `contains()` would, obviously, interact strongly with existing containers and algorithms, which would be strongly impacted by concepts and ranges — at the time expected to come into C++17 — and I was convinced (as a result of the discussion as well as a couple of private email exchanges) that it was a better idea to wait for them first. Of course, in 2015 it wasn't clear that neither concepts nor ranges would not make it into C++17 (in fact, there were high hopes that they would). I'm not sure it's worth pursuing it now, though. – Leo Heinsaar Mar 03 '17 at 11:56
  • 1
    For `std::set` (which is what the question asks about), I don't see how `count` does more work than `contains` would have to do. The glibc implementation of `count` is (roughly) `return find(value) == end() ? 0 : 1;`. Apart from details about the ternary operator vs just returning `!= end()` (which I would expect the optimizer to remove), I can't see how there is any more work. – Martin Bonner supports Monica Mar 04 '17 at 14:11
  • 7
    " ... *contains() which returns a bool would lose valuable information about where the element is in the collection*" -- If the user calls `myset.contains()` (if it existed), that would be a pretty strong indication that that information is not valuable (to the user in that context). – Keith Thompson Mar 17 '17 at 21:17
  • 1
    Why does `count()` do more work than `contains()` would have to do for `std::set`? It's unique so `count()` could just be `return contains(x) ? 1 : 0;` which is exactly the same. – Timmmm Nov 21 '17 at 13:02
24

It lacks it because nobody added it. Nobody added it because the containers from the STL that the std library incorporated where designed to be minimal in interface. (Note that std::string did not come from the STL in the same way).

If you don't mind some strange syntax, you can fake it:

template<class K>
struct contains_t {
  K&& k;
  template<class C>
  friend bool operator->*( C&& c, contains_t&& ) {
    auto range = std::forward<C>(c).equal_range(std::forward<K>(k));
    return range.first != range.second;
    // faster than:
    // return std::forward<C>(c).count( std::forward<K>(k) ) != 0;
    // for multi-meows with lots of duplicates
  }
};
template<class K>
containts_t<K> contains( K&& k ) {
  return {std::forward<K>(k)};
}

use:

if (some_set->*contains(some_element)) {
}

Basically, you can write extension methods for most C++ std types using this technique.

It makes a lot more sense to just do this:

if (some_set.count(some_element)) {
}

but I am amused by the extension method method.

The really sad thing is that writing an efficient contains could be faster on a multimap or multiset, as they just have to find one element, while count has to find each of them and count them.

A multiset containing 1 billion copies of 7 (you know, in case you run out) can have a really slow .count(7), but could have a very fast contains(7).

With the above extension method, we could make it faster for this case by using lower_bound, comparing to end, and then comparing to the element. Doing that for an unordered meow as well as an ordered meow would require fancy SFINAE or container-specific overloads however.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
13

You are looking into particular case and not seeing bigger picture. As stated in documentation std::set meets requirement of AssociativeContainer concept. For that concept it does not make any sense to have contains method, as it is pretty much useless for std::multiset and std::multimap, but count works fine for all of them. Though method contains could be added as an alias for count for std::set, std::map and their hashed versions (like length for size() in std::string ), but looks like library creators did not see real need for it.

Slava
  • 43,454
  • 1
  • 47
  • 90
  • 8
    Note that `string` is a monster: it existed prior to the STL, where it had `length` and all those methods which are index-based, and then was "containerified" to fit within the STL model... without removing the existing methods for backward compatibility reasons. See [GotW #84: Monoliths Unstrung](http://www.gotw.ca/gotw/084.htm) => `string` really violates the "minimum amount of member functions" design principle. – Matthieu M. Mar 01 '17 at 16:02
  • 6
    But then the question becomes "Why is it worth having an AssociativeContainer concept like that?" - and I'm not sure it hindsight it was. – Martin Bonner supports Monica Mar 01 '17 at 16:50
  • 3
    @MartinBonner so you can write a generic function that accepts `AssociativeContainer`, not just `std::set`. When concepts would be added to the language (I hope at least) that would make even more sense. – Slava Mar 01 '17 at 18:32
  • 26
    Asking if a multiset, a multimap or map contains something makes perfect sense to me. In fact, `contains` is equal in effort on a set/map, but can be made *faster* than `count` on a multiset/multimap. – Yakk - Adam Nevraumont Mar 01 '17 at 18:48
  • 5
    AssociativeContainer doesn't require classes to *not* have a `contains` method. – user2357112 Mar 02 '17 at 00:51
  • @user2357112 but it requires to have `count` method, which `contains` would duplicate – Slava Mar 02 '17 at 01:26
  • 7
    @Slava That's like saying `size()` and `empty()` are duplicates, yet many containers have both. – Barry Mar 02 '17 at 16:07
  • @Barry I am not against adding `contains()` into `std::set` or `std::map`, I am just trying to explain reasoning lib creators had. And I am not one of them. – Slava Mar 02 '17 at 17:23
10

Although I don't know why std::set has no contains but count which only ever returns 0 or 1, you can write a templated contains helper function like this:

template<class Container, class T>
auto contains(const Container& v, const T& x)
-> decltype(v.find(x) != v.end())
{
    return v.find(x) != v.end();
}

And use it like this:

    if (contains(myset, element)) ...
rustyx
  • 80,671
  • 25
  • 200
  • 267
  • 3
    -1, because this is straightly contradicted by the fact that in fact the `contains` method does exist, it's just named in a stupid way. – Matteo Italia Mar 01 '17 at 14:33
  • 4
    "The STL strives to offer a minimal interface" *chough* `std::string` *cough* – bolov Mar 01 '17 at 14:56
  • @MatteoItalia better like this? – rustyx Mar 01 '17 at 15:14
  • 6
    @bolov: your point? `std.::string` is NOT part of the STL! It's part of the standard library and was retroactively templated... – MFH Mar 01 '17 at 15:22
  • @RustyX: downvote removed, but I would just delete this answer; given that `count` is already built-in in all the containers that provide `find`, this `contains` function is essentially redundant. – Matteo Italia Mar 01 '17 at 15:26
  • Why the fancy `auto` declaration when `bool` would be simpler and more readable? – Mark Ransom Mar 01 '17 at 15:30
  • 3
    @MatteoItalia `count` can be slower since it effectively needs to do two `find`s to get the beginning and the end of the range if the code is shared with `multiset`. – Mark Ransom Mar 01 '17 at 15:30
  • 2
    The OP already knows it is redundant, but apparently wants the code to explicitly read `contains`. I see nothing wrong with that. @MarkRansom the little SFINAE is to prevent this template from binding to things it shouldn't. – rustyx Mar 01 '17 at 15:35
  • @bolov Tries doesn't mean succeeds. – Yakk - Adam Nevraumont Mar 01 '17 at 18:49
  • @MarkRansom, The trailing return type ensures that this participates in overload resolution only when it will actually make sense (when `v.find(x) != v.end()` is valid for the types given). This way, it won't aggressively take over other possible overloads of `contains`. – chris Mar 02 '17 at 16:12
9

Since c++20,

bool contains( const Key& key ) const

is available.

Andy
  • 377
  • 6
  • 10
7

The true reason for set is a mystery for me, but one possible explanation for this same design in map could be to prevent people from writing inefficient code by accident:

if (myMap.contains("Meaning of universe"))
{
    myMap["Meaning of universe"] = 42;
}

Which would result in two map lookups.

Instead, you are forced to get an iterator. This gives you a mental hint that you should reuse the iterator:

auto position = myMap.find("Meaning of universe");
if (position != myMap.cend())
{
    position->second = 42;
}

which consumes only one map lookup.

When we realize that set and map are made from the same flesh, we can apply this principle also to set. That is, if we want to act on an item in the set only if it is present in the set, this design can prevent us from writing code as this:

struct Dog
{
    std::string name;
    void bark();
}

operator <(Dog left, Dog right)
{
    return left.name < right.name;
}

std::set<Dog> dogs;
...
if (dogs.contain("Husky"))
{
    dogs.find("Husky")->bark();
}

Of course all this is a mere speculation.

Martin Drozdik
  • 12,742
  • 22
  • 81
  • 146
2

I'd like to point out , as mentioned by Andy, that since C++20 the standard added the contains Member function for maps or set:

bool contains( const Key& key ) const;  (since C++20)

Now I'd like to focus my answer regarding performance vs readability. In term of performance if you compare the two versions:

#include <unordered_map>
#include <string>
using hash_map = std::unordered_map<std::string,std::string>;
hash_map a;

std::string get_cpp20(hash_map& x,std::string str)
{
    if(x.contains(str))
        return x.at(str);
    else
        return "";
};

std::string get_cpp17(hash_map& x,std::string str)
{
    if(const auto it = x.find(str); it !=x.end())
        return it->second;
    else
        return "";
};

You will find that the cpp20 version takes two calls to std::_Hash_find_last_result while the cpp17 takes only one call.

Now I find myself with many data structure with nested unordered_map. So you end up with something like this:

using my_nested_map = std::unordered_map<std::string,std::unordered_map<std::string,std::unordered_map<int,std::string>>>;

std::string get_cpp20_nested(my_nested_map& x,std::string level1,std::string level2,int level3)
{
    if(x.contains(level1) &&
        x.at(level1).contains(level2) &&
        x.at(level1).at(level2).contains(level3))

        return x.at(level1).at(level2).at(level3);
    else
        return "";
};

std::string get_cpp17_nested(my_nested_map& x,std::string level1,std::string level2,int level3)
{
    if(const auto it_level1=x.find(level1); it_level1!=x.end())
        if(const auto it_level2=it_level1->second.find(level2);it_level2!=it_level1->second.end())
            if(const auto it_level3=it_level2->second.find(level3);it_level3!=it_level2->second.end())
                return it_level3->second;

    return "";
};

Now if you have plenty of condition in-between these ifs, using the iterator really is painful, very error prone and unclear, I often find myself looking back at the definition of the map to understand what kind of object was at level 1 or level2, while with the cpp20 version , you see at(level1).at(level2).... and understand immediately what you are dealing with. So in term of code maintenance/review, contains is a very nice addition.

qwark
  • 493
  • 1
  • 4
  • 15
  • 4
    The usage of `contains` in `get_cpp20` is abusive. `contains` should of course be used only if you want to know if an element is in the map or not without actually retrieving the element. So using `contains` makes a lot of sense with sets, but much less with maps. – Jabberwocky May 25 '22 at 15:40
0

What about binary_search ?

 set <int> set1;
 set1.insert(10);
 set1.insert(40);
 set1.insert(30);
 if(std::binary_search(set1.begin(),set1.end(),30))
     bool found=true;
double-beep
  • 5,031
  • 17
  • 33
  • 41
0

contains() has to return a bool. Using C++ 20 compiler I get the following output for the code:

#include<iostream>
#include<map>
using namespace std;

int main()
{
    multimap<char,int>mulmap;
    mulmap.insert(make_pair('a', 1)); //multiple similar key
    mulmap.insert(make_pair('a', 2)); //multiple similar key
    mulmap.insert(make_pair('a', 3)); //multiple similar key
    mulmap.insert(make_pair('b', 3));
    mulmap.insert({'a',4});
    mulmap.insert(pair<char,int>('a', 4));
    
    cout<<mulmap.contains('c')<<endl;  //Output:0 as it doesn't exist
    cout<<mulmap.contains('b')<<endl;  //Output:1 as it exist
}
bashar
  • 21
  • 3
-1

Another reason is that it would give a programmer the false impression that std::set is a set in the math set theory sense. If they implement that, then many other questions would follow: if an std::set has contains() for a value, why doesn't it have it for another set? Where are union(), intersection() and other set operations and predicates?

The answer is, of course, that some of the set operations are already implemented as functions in (std::set_union() etc.) and other are as trivially implemented as contains(). Functions and function objects work better with math abstractions than object members, and they are not limited to the particular container type.

If one need to implement a full math-set functionality, he has not only a choice of underlying container, but also he has a choice of implementation details, e.g., would his theory_union() function work with immutable objects, better suited for functional programming, or would it modify its operands and save memory? Would it be implemented as function object from the start or it'd be better to implement is a C-function, and use std::function<> if needed?

As it is now, std::set is just a container, well-suited for the implementation of set in math sense, but it is nearly as far from being a theoretical set as std::vector from being a theoretical vector.

Mike Tyukanov
  • 579
  • 4
  • 10