Complexity of in operator in Python

Question

What is the complexity of the in operator in Python? Is it theta(n)?

Is it the same as the following?

def find(L, x):
   for e in L:
       if e == x:
           return True
   return False

L is a list.

It depends on the type of container, since using it with a dictionary or set will be much faster than with an array. — Greg Hewgill, Dec 14 '12 at 18:19
See the comments thread at the [ActiveState snippet _List with fast membership test (__contains__)_](http://code.activestate.com/recipes/577185-list-with-fast-membership-test-__contains__/). Summary: do timing tests. — Jim DeLaHunt, Dec 14 '12 at 18:30
@Rastegar `L` doesn't imply a list. `seq` is the most common choice where one wants to imply a list. `L` is a terrible variable name. Single letter ones are bad, and the capital implies it's a class. Even if it was something in particular, Python is dynamic, so state it explicitly in a case like this. — Gareth Latty, Dec 14 '12 at 18:50
@GarethLatty Using *lst* is also a good name to define a `list` — vmemmap, Sep 24 '20 at 20:02
For a normal list, your algorithm should be slower - not because it is semantically less efficient, but because Python will the "in" interpret in C as one function, while in yours, every line would be interpreted separately. (Of course, there are probably some optimizations...) — Ferazhu, Apr 09 '21 at 09:37

score 225 · Accepted Answer · edited Jul 31 '18 at 00:32

225

The complexity of in depends entirely on what L is. e in L will become L.__contains__(e).

See this time complexity document for the complexity of several built-in types.

Here is the summary for in:

list - Average: O(n)
set/dict - Average: O(1), Worst: O(n)

The O(n) worst case for sets and dicts is very uncommon, but it can happen if __hash__ is implemented poorly. This only happens if everything in your set has the same hash value.

edited Jul 31 '18 at 00:32

Peter Mortensen

30,738
21
105
131

answered Dec 14 '12 at 18:20

Andrew Clark

202,379
35
273
306

1

Does anyone happen to know the complexity of the "in" operator for an OrderedDict? – Josh Sherick Aug 04 '15 at 19:19
1

After some testing, I can confirm that the complexity of OrderedDict in Python 2.7 appears to be O(1) in the average case. – Josh Sherick Aug 04 '15 at 19:56
@Josh Sherick you don't have to provide tests, all you need are the sources of the [`OrderedDict`](https://svn.python.org/projects/python/trunk/Lib/collections.py), and as you could find out: `OrderedDict` is inherited from `dict`, so the most operations (of course, with exceptions) have **the same complexity**. – maxkoryukov Sep 12 '18 at 13:46
Is the time complexity of "in" operator O(n) for tuple as well? – Inherited Geek May 25 '19 at 17:54
What is the time complexity of "in" operator when used with a generator? – Sreekanth Reddy Balne Jun 02 '20 at 14:51
@InheritedGeek yes it is – theSekyi Oct 28 '20 at 01:36
1

@whitehat linear. – juanpa.arrivillaga Dec 09 '20 at 05:05

kindall · Answer 2 · 2014-04-24T15:08:02.763

22

It depends entirely on the type of the container. Hashing containers (dict, set) use the hash and are essentially O(1). Typical sequences (list, tuple) are implemented as you guess and are O(n). Trees would be average O(log n). And so on. Each of these types would have an appropriate __contains__ method with its big-O characteristics.

edited Apr 24 '14 at 15:08

answered Dec 14 '12 at 18:19

kindall

178,883
35
278
309

of value is to include the overhead of generating the hash. – Woot4Moo Dec 14 '12 at 18:20
Hashing data types include `dict` and `set` (as wells as potentially others) – Dave Dec 14 '12 at 18:20
2

@Woot4Moo: When you're talking about asymptotic complexity, that isn't relevant. The overhead of generating the hash is constant. When you're dealing with small values of N, profiling becomes important, because, say, 100 >> 2N for small N. But that's a separate issue from what the OP was asking about; for huge N, 100 << 2N, which is what complexity is all about. – abarnert Dec 14 '12 at 19:15
@abarnert well it actually is quite relevant, as you don't arbitrarily choose data structures. You must consider the use and most common ways the structure will be used, so it actually is important to consider the amount of time for a hash function, especially in a scenario where the has must be computed per iteration of a program. – Woot4Moo Dec 14 '12 at 19:17
@Woot4Moo: If someone is asking about asymptotic complexity, either (a) they expect to deal with a large N, or (b) they're an idiot. I'm assuming the OP is case (a), but either way, the constant factor isn't relevant to the answer. – abarnert Dec 14 '12 at 20:34
I would expect the overhead of hashing to not be quite constant (the number of bytes being hashed would be a factor) but we can probably treat it as such so long as that doesn't correlate strongly with *n.* – kindall May 03 '22 at 14:51

score -1 · Answer 3 · answered Dec 14 '12 at 18:20

-1

It depends on the container you're testing. It's usually what you'd expect - linear for ordered datastructures, constant for the unordered. Of course, there are both types (ordered or unordered) which might be backed by some variant of a tree.

answered Dec 14 '12 at 18:20

Marcin

48,559
18
128
201

@ZoranPavlovic `A in B` tests whether `A` is in `B`. – Marcin Apr 24 '14 at 13:50
1

I'd definitely expect logarithmic time in an ordered structure. – dedObed Dec 13 '19 at 09:21
@dedObed Why would you expect that? Do you expect python to already know whether or not your data are sorted? – Marcin Dec 13 '19 at 19:18
Because if there is a container designed to be ordered, the obvious reason is to allow for logarithmic lookups. But I guess it's just a naming issue, I'd use "linear" where you wrote "ordered" and all would be fine. (In my head -- English as second language here.) – dedObed Dec 14 '19 at 10:34

Complexity of in operator in Python

3 Answers3

Linked

Related

Complexity of *in* operator in Python

3 Answers3

Linked

Related

Complexity of in operator in Python