61

What is the C++ way of checking if an element is contained in an array/list, similar to what the in operator does in Python?

if x in arr:
    print "found"
else
    print "not found"

How does the time complexity of the C++ equivalent compare to Python's in operator?

juanchopanza
  • 223,364
  • 34
  • 402
  • 480
SanDash
  • 733
  • 1
  • 6
  • 15
  • 5
    Maybe `find` can do what you need - see http://en.cppreference.com/w/cpp/algorithm/find – Support Ukraine Jun 19 '17 at 05:44
  • 1
    @4386427 But you wouldn't want to use `std::find` on a map or set. – juanchopanza Jun 19 '17 at 05:46
  • 1
    Or, if the container is sorted, [std::lower_bound](http://en.cppreference.com/w/cpp/algorithm/lower_bound) can be used. (or the member function if available eg. [std::map::lower_bound](http://en.cppreference.com/w/cpp/container/map/lower_bound)). – Galik Jun 19 '17 at 05:47
  • 3
    So if you care about time complexity, unfortunately there is no syntactic equivalent that works well across all container types. – juanchopanza Jun 19 '17 at 05:49
  • @juanchopanza - Since I'm not a Python man, I'm not sure what the equivalent type for `arr` would be in C++. If it's like a `map`, then there is a `find` public member function in `std::map`. Anyway, this was just a hint to OP that `find` could be something to look into. If I knew it was "the answer" I would have posted it as an answer but not knowing Python very well, I just left a comment. – Support Ukraine Jun 19 '17 at 06:54
  • `contains(container)` isn't that difficult to write: https://codereview.stackexchange.com/questions/59997/contains-algorithm-for-stdvector – Deduplicator Jun 19 '17 at 20:31
  • @EdgarRokyan What is the point of having [tag:C++11] and [tag:C++14] for this question? – moooeeeep Jun 19 '17 at 20:58
  • @moooeeeep I've added them because a number of later answers are heavily based on features from C++11/14. – Edgar Rokjān Jun 19 '17 at 21:00
  • @EdgarRokyan That doesn't really matter. C++ is almost C++17, meaning it includes C++11 and C++14 features (except for things that have been "fixed" or deprecated.) – juanchopanza Jun 19 '17 at 21:05
  • @juanchopanza so C++ is a general tag for C++11/14/17 now, and C++11/14 tags are used only for specific features of appropriate language versions, right? – Edgar Rokjān Jun 19 '17 at 21:08
  • @EdgarRokyan I think this applies: [Should I tag questions C++11 if they refer to the C++11 standard but not new features?](https://meta.stackoverflow.com/q/271812/1025391) – moooeeeep Jun 19 '17 at 21:08
  • @moooeeeep yes, it exactly applies to the situation. – Edgar Rokjān Jun 19 '17 at 21:11
  • @EdgarRokyan C++ is current C++, C++11 could be used when one is restricted to a particular standard. Although if a standard is newly or abouyt to come out, it could make sense to tag answers with that particular standard too (e.g. c++17) – juanchopanza Jun 19 '17 at 21:11
  • @juanchopanza I feel I did some bad edits a couple of times :) – Edgar Rokjān Jun 19 '17 at 21:17
  • The "in" operator feature is being implemented in https://github.com/ploncomi/python_like_cpp – Patricio Loncomilla May 19 '21 at 15:36

7 Answers7

62

The time complexity of Python's in operator varies depending on the data structure it is actually called with. When you use it with a list, complexity is linear (as one would expect from an unsorted array without an index). When you use it to look up set membership or presence of a dictionary key complexity is constant on average (as one would expect from a hash table based implementation):

In C++ you can use std::find to determine whether or not an item is contained in a std::vector. Complexity is said to be linear (as one would expect from an unsorted array without an index). If you make sure the vector is sorted, you can also use std::binary_search to achieve the same in logarithmic time.

The associative containers provided by the standard library (std::set, std::unordered_set, std::map, ...) provide the member functions find() and count() and contains() (C++20) for this. These will perform better than linear search, i.e., logarithmic or constant time depending on whether you have picked the ordered or the unordered alternative. Which one of these functions to prefer largely depends on what you want to achieve with that info afterwards, but also a bit on personal preference. (Lookup the documentation for details and examples.)

If you want to, you can use some template magic to write a wrapper function that picks the correct method for the container at hand, e.g., as presented in this answer.

moooeeeep
  • 31,622
  • 22
  • 98
  • 187
  • 1
    _"The associative containers ... provide the `find()` method for this"_ - Does the `std::find` function have overloads that defer to these methods? – Eric Jun 19 '17 at 11:34
  • 3
    @Eric As the function takes Input-/Forward-Iterators, it seems to just forward-iterate the container. – moooeeeep Jun 19 '17 at 11:41
  • 2
    (To actually answer the question:) There are no such overloads of `std::find`. – moooeeeep Jun 19 '17 at 13:55
  • 1
    @Eric: Unfortunately, no. It's one of the reasons why the whole iterator-based interface of `` is, in hindsight, not a good idea, why ranges would be much better: with ranges you can specialize algorithms so that if the passed-in type has some properties/methods, you can defer to it. – Matthieu M. Jun 19 '17 at 17:02
  • @MatthieuM. The problem is not with the iterator-based interface; one can easily imagine a setup where `find` recognizes iterators from, say, an associative container, and the iterators contain (a pointer to) sufficient information to perform a binary search. The problem is that `find` is required to use `==`, which is not required to be consistent with any ordering the associative container might have (not even the default `<`). – T.C. Jun 19 '17 at 19:10
  • @T.C.: I would imagine that should `find` delegate to the container/view being iterated, this requirement would change. – Matthieu M. Jun 20 '17 at 06:26
18

You can approach this in two ways:

You can use std::find from <algorithm>:

auto it = std::find(container.begin(), container.end(), value);
if (it != container.end())
    return it;  

or you can iterate through every element in your containers with for ranged loops:

for(const auto& it : container)
{
    if(it == value)
        return it;
} 
Eric
  • 95,302
  • 53
  • 242
  • 374
Omar Martinez
  • 708
  • 6
  • 19
  • 3
    This wouldn't scale very well for sorted or unsorted sets or maps (O(N) vs. amortized O(log N) or O(1)) – juanchopanza Jun 19 '17 at 05:48
  • It's better to write for(const auto & it : container) to avoid unnessasary copy. Also the second approach, I think, should be on the fist place. Because it declares your intentions more clear – Andrey Nekrasov Jun 19 '17 at 06:05
  • @AndreyNekrasov I edited the order. In some cases you might just want a copy of the value, but i agree, in most cases you would want a `const reference`. – Omar Martinez Jun 19 '17 at 06:14
  • Bad idea, for the reasons specified in my first comment. – juanchopanza Jun 19 '17 at 07:56
  • 2
    @juanchopanza it's not exactly efficient in python either. (I believe it is O(n) on a list). map::find is logarithmic in C++ (unless http://www.cplusplus.com/reference/map/map/find/ is wrong) – Baldrickk Jun 19 '17 at 09:18
  • @Baldrickk: Does `std::find(map_iterator...)` forward to `std::map::find`, avoiding a linear search? – Eric Jun 19 '17 at 11:32
  • @Eric doesn't look like it, going by https://stackoverflow.com/a/17256035/2072269 – muru Jun 19 '17 at 11:38
  • @muru @Eric that's disappointing. I don't have any problems with using `std::map::find()`, but it's not exactly promoting abstraction and reusability. It makes you wonder why `std::find()` doesn't look to see if there is a member function `find()` and call that instead... – Baldrickk Jun 19 '17 at 12:08
  • @Baldrickk hint: is it possible to [get back the original container from an iterator](https://stackoverflow.com/a/11445040/2072269) (which is all that `std::find` sees)? – muru Jun 19 '17 at 12:11
  • yeah, I get that. it's a shame that there isn't an overload that takes a container object, calls the object's `map()` if it exists and iterates over the entire container otherwise.but then you run into https://stackoverflow.com/questions/14003627/stl-algorithms-why-no-additional-interface-for-containers-additional-to-iterat/14003766 I guess I have just been spoilt by Python recently. My C++ is getting rusty. – Baldrickk Jun 19 '17 at 12:22
  • Eric : How can std::find() forward to map find? STL algorithms know _NOTHING_ about the containers they are operating on. They only know about iterator category types. – SJHowe Jun 19 '17 at 13:34
  • 1
    @Baldrickk Obviously it is O(N) in a list. But not in a set or dict. That is the whole point. – juanchopanza Jun 19 '17 at 15:10
  • 4
    This is not a very good answer. In both snippets, what is the function we're returning from? What is the "else" case? That's even before you get into the question of using an O(N) search for those containers that have O(lg N) or O(1) searches available. -1. – Barry Jun 19 '17 at 16:24
14

Python does different things for in depending on what kind of container it is. In C++, you'd want the same mechanism. Rule of thumb for the standard containers is that if they provide a find(), it's going to be a better algorithm than std::find() (e.g. find() for std::unordered_map is O(1), but std::find() is always O(N)).

So we can write something to do that check ourselves. The most concise would be to take advantage of C++17's if constexpr and use something like Yakk's can_apply:

template <class C, class K>
using find_t = decltype(std::declval<C const&>().find(std::declval<K const&>()));

template <class Container, class Key>
bool in(Container const& c, Key const& key) {
    if constexpr (can_apply<find_t, Container, Key>{}) {
        // the specialized case
        return c.find(key) != c.end();
    } else {
        // the general case 
        using std::begin; using std::end;
        return std::find(begin(c), end(c), key) != end(c);
    }
}

In C++11, we can take advantage of expression SFINAE:

namespace details {
    // the specialized case
    template <class C, class K>
    auto in_impl(C const& c, K const& key, int )
            -> decltype(c.find(key), true) {
        return c.find(key) != c.end();
    }

    // the general case
    template <class C, class K>
    bool in_impl(C const& c, K const& key, ...) {
        using std::begin; using std::end;
        return std::find(begin(c), end(c), key) != end(c);
    }
}

template <class Container, class Key>
bool in(Container const& c, Key const& key) {
    return details::in_impl(c, key, 0);
}

Note that in both cases we have the using std::begin; using std::end; two-step in order to handle all the standard containers, raw arrays, and any use-provided/adapted containers.

Barry
  • 286,269
  • 29
  • 621
  • 977
10

This gives you an infix *in* operator:

namespace notstd {
  namespace ca_helper {
    template<template<class...>class, class, class...>
    struct can_apply:std::false_type{};
    template<class...>struct voider{using type=void;};
    template<class...Ts>using void_t=typename voider<Ts...>::type;

    template<template<class...>class Z, class...Ts>
    struct can_apply<Z,void_t<Z<Ts...>>, Ts...>:std::true_type{};
  }
  template<template<class...>class Z, class...Ts>
  using can_apply = ca_helper::can_apply<Z,void,Ts...>;

  namespace find_helper {
    template<class C, class T>
    using dot_find_r = decltype(std::declval<C>().find(std::declval<T>()));
    template<class C, class T>
    using can_dot_find = can_apply< dot_find_r, C, T >;
    template<class C, class T>
    constexpr std::enable_if_t<can_dot_find<C&, T>{},bool>
    find( C&& c, T&& t ) {
      using std::end;
      return c.find(std::forward<T>(t)) != end(c);
    }
    template<class C, class T>
    constexpr std::enable_if_t<!can_dot_find<C&, T>{},bool>
    find( C&& c, T&& t ) {
      using std::begin; using std::end;
      return std::find(begin(c), end(c), std::forward<T>(t)) != end(c);
    }
    template<class C, class T>
    constexpr bool finder( C&& c, T&& t ) {
      return find( std::forward<C>(c), std::forward<T>(t) );
    }
  }
  template<class C, class T>
  constexpr bool find( C&& c, T&& t ) {
    return find_helper::finder( std::forward<C>(c), std::forward<T>(t) );
  }
  struct finder_t {
    template<class C, class T>
    constexpr bool operator()(C&& c, T&& t)const {
      return find( std::forward<C>(c), std::forward<T>(t) );
    }
    constexpr finder_t() {}
  };
  constexpr finder_t finder{};
  namespace named_operator {
    template<class D>struct make_operator{make_operator(){}};

    template<class T, char, class O> struct half_apply { T&& lhs; };

    template<class Lhs, class Op>
    half_apply<Lhs, '*', Op> operator*( Lhs&& lhs, make_operator<Op> ) {
      return {std::forward<Lhs>(lhs)};
    }

    template<class Lhs, class Op, class Rhs>
    auto operator*( half_apply<Lhs, '*', Op>&& lhs, Rhs&& rhs )
    -> decltype( named_invoke( std::forward<Lhs>(lhs.lhs), Op{}, std::forward<Rhs>(rhs) ) )
    {
      return named_invoke( std::forward<Lhs>(lhs.lhs), Op{}, std::forward<Rhs>(rhs) );
    }
  }
  namespace in_helper {
    struct in_t:notstd::named_operator::make_operator<in_t> {};
    template<class T, class C>
    bool named_invoke( T&& t, in_t, C&& c ) {
      return ::notstd::find(std::forward<C>(c), std::forward<T>(t));
    }
  }
  in_helper::in_t in;
}

On a flat container, like a vector array or string, it is O(n).

On an associative sorted container, like a std::map, std::set, it is O(lg(n)).

On an unordered associated container, like std::unordered_set, it is O(1).

Test code:

std::vector<int> v{1,2,3};
if (1 *in* v)
    std::cout << "yes\n";
if (7 *in* v)
    std::cout << "no\n";
std::map<std::string, std::string, std::less<>> m{
    {"hello", "world"}
};
    
if ("hello" *in* m)
    std::cout << "hello world\n";

Live example.

C++14, but mainly for enable_if_t.

So what is going on here?

Well, can_apply is a bit of code that lets me write can_dot_find, which detects (at compile time) if container.find(x) is a valid expression.

This lets me dispatch the searching code to use member-find if it exists. If it doesn't exist, a linear search using std::find is used instead.

Which is a bit of a lie. If you define a free function find(c, t) in the namespace of your container, it will use that rather than either of the above. But that is me being fancy (and it lets you extend 3rd party containers with *in* support).

That ADL (argument dependent lookup) extensibity (the 3rd party extension ability) is why we have three different functions named find, two in a helper namespace and one in notstd. You are intended to call notstd::find.

Next, we want a python-like in, and what is more python like than an infix operator? To do this in C++ you need to wrap your operator name in other operators. I chose *, so we get an infix *in* named operator.


TL;DR

You do using notstd::in; to import the named operator in.

After that, t *in* c first checks if find(t,c) is valid. If not, it checks if c.find(t) is valid. If that fails, it does a linear search of c using std::begin std::end and std::find.

This gives you very good performance on a wide variety of std containers.

The only thing it doesn't support is

if (7 *in* {1,2,3})

as operators (other than =) cannot deduce initializer lists I believe. You could get

if (7 *in* il(1,2,3))

to work.

Community
  • 1
  • 1
Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
9

I guess one might make use of this thread and create a custom version of in function.

The main idea is to use SFINAE (Substitution Failure Is Not An Error) to differentiate associative containers (which have key_type member) from sequence containers (which have no key_type member).

Here is a possible implementation:

namespace detail
{
    template<typename, typename = void>
    struct is_associative : std::false_type {};

    template<typename T>
    struct is_associative<T,
        std::enable_if_t<sizeof(typename T::key_type) != 0>> : std::true_type {};

    template<typename C, typename T>
    auto in(const C& container, const T& value) ->
        std::enable_if_t<is_associative<C>::value, bool>
    {
        using std::cend;

        return container.find(value) != cend(container);
    }

    template<typename C, typename T>
    auto in(const C& container, const T& value) ->
        std::enable_if_t<!is_associative<C>::value, bool>
    {
        using std::cbegin;
        using std::cend;

        return std::find(cbegin(container), cend(container), value) != cend(container);
    }

}

template<typename C, typename T>
auto in(const C& container, const T& value)
{
    return detail::in(container, value);
}

Small usage example on WANDBOX.

Edgar Rokjān
  • 17,245
  • 4
  • 40
  • 67
  • 8
    In order to enable SFINAE for this case, you have to add on top of your code `/* deliberately killing KISS */`. – edmz Jun 19 '17 at 11:37
2

You can use std::find from <algorithm>, but this works only for datatypes like: std::map and std::vector (etc).

Also note that this will return, iterator to the first element that is found equal to the value you pass, unlike the in operator in Python that returns a bool.

Freddy Mcloughlan
  • 4,129
  • 1
  • 13
  • 29
pHM
  • 315
  • 5
  • 17
0

I think one of the nice features of the "in" operator in python is that it can be used with different data types (strings v/s strings, numbers v/s lists, etc).

I am developing a library for using python constructions in C++. It includes "in" and "not_in" operators.

It is based on the same technique used to implement the in operator posted in a previous answer, in which make_operator<in_t> is implemented. However, it is extended for handling more cases:

  • Searching a string inside a string
  • Searching an element inside vector and maps

It works by defining several overloads for a function: bool in__(T1 &v1, T2 &v2), in which T1 and T2 consider different possible types of objects. Also, overloads for a function: bool not_in__(T1 &v1, T2 &v2) are defined. Then, the operators "in" and "not_in" call those functions for working.

The implementation is in this repository:

https://github.com/ploncomi/python_like_cpp

Patricio Loncomilla
  • 1,048
  • 3
  • 11
  • 2
    We generally prefer an answer within the answer, rather than a link to an answer somewhere else. Could you provide an explanation instead? – Andy Newman Mar 27 '21 at 17:53
  • 1
    Sorry, but it was not the link to an answer, it is a link to a repository which implements the requested feature (I am working on it). I will modify my answer for being more explicative. – Patricio Loncomilla Apr 02 '21 at 00:21