Should I use 'has_key()' or 'in' on Python dicts?

Question

Given:

>>> d = {'a': 1, 'b': 2}

Which of the following is the best way to check if 'a' is in d?

>>> 'a' in d
True

>>> d.has_key('a')
True

score 1600 · Accepted Answer · edited Apr 30 '13 at 15:56

in is definitely more pythonic.

In fact has_key() was removed in Python 3.x.
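A minimal sketch of the difference, assuming Python 3 and the dictionary `d` from the question (the variable names below are illustrative only):

```python
d = {'a': 1, 'b': 2}

# Membership tests on the keys, positive and negated.
has_a = 'a' in d
missing_z = 'z' not in d

# Membership test on the values instead of the keys.
has_value_1 = 1 in d.values()

# On Python 3 the dict type no longer has a has_key attribute at all.
legacy_method_present = hasattr(d, 'has_key')

print(has_a, missing_z, has_value_1, legacy_method_present)
```

On Python 2 the last check would report that the method is still present, so this sketch only describes Python 3 behavior.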

edited Apr 30 '13 at 15:56

bluish

answered Aug 24 '09 at 16:33

tonfa

3

As an addition: in Python 3, to check for existence among the values instead of the keys, try `1 in d.values()` – riza Aug 24 '09 at 18:12
270

One semi-gotcha to avoid though is to make sure you do: "key in some_dict" rather than "key in some_dict.keys()". Both are equivalent semantically, but performance-wise the latter is much slower (O(n) vs O(1)). I've seen people do the "in dict.keys()" thinking it's more explicit & therefore better. – Adam Parkin Nov 09 '11 at 20:55
3

@AdamParkin I demonstrated your comment in my answer http://stackoverflow.com/a/41390975/117471 – Bruno Bronosky Dec 30 '16 at 05:17
12

@AdamParkin In Python 3, `keys()` is just a set-like view into a dictionary rather than a copy, so `x in d.keys()` is O(1). Still, `x in d` is more Pythonic. – Arthur Tacca Aug 01 '18 at 08:48
1

@ArthurTacca interesting, so why is `x in d.keys()` so much slower than `x in d`? (See the other answer by @BrunoBronosky with timeit runs.) You're right though, it does appear to be O(1), but with a higher constant factor (I'm seeing about 0.0361 vs 0.133 usec between the two doing the timeit test locally, regardless of dict size, in Python 3.7) – Adam Parkin Aug 01 '18 at 21:11
3

@AdamParkin Interesting, I didn't see that. I suppose it's because `x in d.keys()` must construct and destroy a temporary object, complete with the memory allocation that entails, whereas `x in d` is just doing an arithmetic operation (computing the hash) and doing a lookup. Note that `x in d.keys()` takes only about 10 times as long as this, which is still not long really. I haven't checked but I'm still pretty sure it's only O(1). – Arthur Tacca Aug 02 '18 at 08:40
Is the underlying implementation of `in` equivalent to `dict.get(x) is not None`? – Work of Art Nov 17 '19 at 16:47
3

@WorkofArt It can't be as `None` is a valid dictionary value. – Selcuk Feb 11 '20 at 03:40
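To illustrate the point about `None`, here is a small sketch (the key name and the `_MISSING` sentinel are made up for the example). It shows a key that is present but maps to `None`, where `in` and the `get(...) is not None` check disagree:

```python
d = {'present': None}

# 'in' only asks whether the key exists.
by_in = 'present' in d

# get() returns the stored value, which here is None, so this check
# wrongly reports the key as missing.
by_get = d.get('present') is not None

# If get() must be used, a unique sentinel default avoids the ambiguity.
_MISSING = object()
by_sentinel = d.get('present', _MISSING) is not _MISSING

print(by_in, by_get, by_sentinel)
```

The sentinel variant is only shown for comparison; plain `in` remains the simpler spelling.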

score 273 · Answer 2 · answered Aug 24 '09 at 18:12

in wins hands-down, not just in elegance (and not being deprecated;-) but also in performance, e.g.:

$ python -mtimeit -s'd=dict.fromkeys(range(99))' '12 in d'
10000000 loops, best of 3: 0.0983 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'd.has_key(12)'
1000000 loops, best of 3: 0.21 usec per loop

While the following observation is not always true, you'll notice that usually, in Python, the faster solution is more elegant and Pythonic; that's why -mtimeit is SO helpful -- it's not just about saving a hundred nanoseconds here and there!-)

answered Aug 24 '09 at 18:12

Alex Martelli

4

Thanks for this, made verifying that "in some_dict" is in fact O(1) much easier (try increasing the 99 to say 1999, and you'll find the runtime is about the same). – Adam Parkin Nov 09 '11 at 21:00
4

`has_key` appears to be O(1) too. – dan-gph Jan 06 '15 at 04:11

score 115 · Answer 3 · edited Apr 11 '13 at 06:27

According to the Python docs:

has_key() is deprecated in favor of key in d.

edited Apr 11 '13 at 06:27

jamylak

answered Aug 24 '09 at 16:33

Nadia Alramli

7

`has_key()` is now removed in Python 3 – Vadim Kotov Nov 14 '19 at 15:21

score 46 · Answer 4 · edited Jan 18 '12 at 16:52

Use dict.has_key() if (and only if) your code is required to be runnable by Python versions earlier than 2.3 (when key in dict was introduced).

edited Jan 18 '12 at 16:52

Mike Samuel

answered Aug 24 '09 at 22:11

John Machin

4

The WebSphere update in 2013 uses Jython 2.1 as its main scripting language. So this is unfortunately still a useful thing to note, five years after you noted it. – ArtOfWarfare Sep 24 '14 at 11:49

score 27 · Answer 5 · answered Jan 18 '12 at 16:45

There is one example where in actually kills your performance.

If you use in on an O(1) container that only implements __getitem__ and has_key() but not __contains__, you will turn an O(1) search into an O(N) search (as in falls back to a linear search via __getitem__).

The fix is trivial:

def __contains__(self, x):
    return self.has_key(x)
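A minimal sketch of that fallback, written for Python 3. The class names `LegacyBag` and `FastBag` are hypothetical, and the fast path uses a set rather than `has_key()`, which is a substitution made for the example. It assumes a sequence-style container, because the `__getitem__` fallback walks integer indexes from 0 until `IndexError`:

```python
class LegacyBag:
    """Sequence-style container with __getitem__ only (no __contains__)."""

    def __init__(self, items):
        self._items = list(items)
        self.getitem_calls = 0  # counts how often 'in' had to index into us

    def __getitem__(self, index):
        self.getitem_calls += 1
        return self._items[index]  # raises IndexError past the end


class FastBag(LegacyBag):
    """Same container, plus a __contains__ backed by a set."""

    def __init__(self, items):
        super().__init__(items)
        self._index = set(self._items)

    def __contains__(self, x):
        return x in self._index


legacy = LegacyBag(range(1000))
fast = FastBag(range(1000))

found_legacy = 999 in legacy  # falls back to indexing 0, 1, 2, ...
found_fast = 999 in fast      # goes straight to __contains__

print(found_legacy, legacy.getitem_calls)
print(found_fast, fast.getitem_calls)
```

The call counter makes the difference visible without timing anything: the fallback touches every element up to the match, while the `__contains__` version never calls `__getitem__`.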

answered Jan 18 '12 at 16:45

schlenk

9

This answer was applicable when it was posted, but 99.95% of readers can safely ignore it. In _most_ cases, if you're working with something this obscure you'll know it. – wizzwizz4 Jul 27 '18 at 13:17
4

This really is not an issue. `has_key()` is *specific to Python 2 dictionaries*. `in` / `__contains__` is the correct API to use; for those containers where a full scan is unavoidable there is no `has_key()` method *anyway*, and if there is an O(1) approach then that'll be use-case specific and so up to the developer to pick the right data type for the problem. – Martijn Pieters Jan 05 '19 at 19:25

score 25 · Answer 6 · edited May 11 '16 at 19:46

Solution to "dict.has_key() is deprecated, use 'in'" -- Sublime Text 3

Here is an example using a dictionary named 'ages':

ages = {}

# Add a few names to the dictionary
ages['Sue'] = 23
ages['Peter'] = 19
ages['Andrew'] = 78
ages['Karren'] = 45

# Use 'in' in the if condition instead of dict_name.has_key(key_name)
if 'Sue' in ages:
    print("Sue is in the dictionary. She is %d years old" % ages['Sue'])
else:
    print("Sue is not in the dictionary")

edited May 11 '16 at 19:46

answered Feb 23 '16 at 10:29

Greena modi

7

Correct, but it was already answered. Welcome to Stack Overflow, and thanks for the example; always check the existing answers though! – igorgue Feb 23 '16 at 19:51
1

@igorgue I'm not sure about the downvotes on her answer. It might be similar to the ones already posted, but she provides an example. Isn't that worthy enough to be an answer on SO? – Akshat Agarwal May 22 '16 at 13:34

score 20 · Answer 7 · edited Jan 31 '17 at 16:11

Expanding on Alex Martelli's performance tests with Adam Parkin's comments...

$ python3.5 -mtimeit -s'd=dict.fromkeys(range( 99))' 'd.has_key(12)'
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/timeit.py", line 301, in main
    x = t.timeit(number)
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/timeit.py", line 178, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
    d.has_key(12)
AttributeError: 'dict' object has no attribute 'has_key'

$ python2.7 -mtimeit -s'd=dict.fromkeys(range(  99))' 'd.has_key(12)'
10000000 loops, best of 3: 0.0872 usec per loop

$ python2.7 -mtimeit -s'd=dict.fromkeys(range(1999))' 'd.has_key(12)'
10000000 loops, best of 3: 0.0858 usec per loop

$ python3.5 -mtimeit -s'd=dict.fromkeys(range(  99))' '12 in d'
10000000 loops, best of 3: 0.031 usec per loop

$ python3.5 -mtimeit -s'd=dict.fromkeys(range(1999))' '12 in d'
10000000 loops, best of 3: 0.033 usec per loop

$ python3.5 -mtimeit -s'd=dict.fromkeys(range(  99))' '12 in d.keys()'
10000000 loops, best of 3: 0.115 usec per loop

$ python3.5 -mtimeit -s'd=dict.fromkeys(range(1999))' '12 in d.keys()'
10000000 loops, best of 3: 0.117 usec per loop

edited Jan 31 '17 at 16:11

answered Dec 30 '16 at 05:16

Bruno Bronosky

Wonderful statistics, _sometimes_ implicit might be better than explicit (at least in efficiency)... – varun Mar 30 '18 at 05:06
Thank you, @varun. I had forgotten about this answer. I need to do this kind of testing more often. I regularly read long threads where people argue about **The Best Way™** to do things. But I rarely remember how easy this was to get **proof**. – Bruno Bronosky Mar 30 '18 at 14:47
This experiment has a defect: it mixes the dict creation time with the key lookup time. It is better to separate the two and measure only the time spent on the lookup. Once you separate them, the timings show that both 'key in D' and 'key in D.keys()' appear to be O(1). There is no essential difference; although 'key in D.keys()' is a bit slower than 'key in D', it is not O(N) vs O(1). – water stone Sep 03 '20 at 22:56
I used Python 3, so my conclusion applies to Python 3 (in Python 2 it likely is O(N) vs O(1)); I did not see that difference in Python 3. – water stone Sep 03 '20 at 23:54
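For anyone who wants to rerun the comparison from a script rather than the shell, here is a rough sketch for Python 3 using `timeit.timeit`. The helper name `time_lookup` and the loop count are arbitrary choices for the example. The dict is built in `setup`, which (like `-s` on the command line) runs outside the timed statement, so only the lookup is measured:

```python
import timeit


def time_lookup(stmt, size, number=200_000):
    """Return the average seconds per execution of stmt on a dict of `size` keys."""
    # The dict is created in setup, so its construction is not timed.
    setup = "d = dict.fromkeys(range(%d))" % size
    return timeit.timeit(stmt, setup=setup, number=number) / number


results = {}
for size in (99, 1999):
    for stmt in ("12 in d", "12 in d.keys()"):
        results[(stmt, size)] = time_lookup(stmt, size)

for (stmt, size), seconds in results.items():
    print("%-16s size=%-5d %.4f usec per lookup" % (stmt, size, seconds * 1e6))
```

Absolute numbers depend on the machine and Python build; the thing to look at is whether each statement's time stays roughly flat as the dict grows.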

score 15 · Answer 8 · answered Aug 24 '09 at 18:35

has_key is a dictionary method, but in will work on any collection, and even when __contains__ is missing, in will use any other method to iterate the collection to find out.

answered Aug 24 '09 at 18:35

u0b34a0f6ae

1

And does also work on iterators "x in xrange(90, 200) <=> 90 <= x < 200" – u0b34a0f6ae Aug 28 '09 at 13:21
1

…: This looks like a very bad idea: 50 operations instead of 2. – Clément Sep 22 '16 at 22:12
1

@Clément In Python 3, it's actually quite efficient to do `in` tests on `range` objects. I'm not so sure about its efficiency on Python 2 `xrange`, though. ;) – PM 2Ring Nov 29 '18 at 18:00
1

@Clément not in Python 3; `__contains__` can trivially *calculate* if a value is in the range or not. – Martijn Pieters Jan 05 '19 at 19:21
@PM2Ring Not necessarily. Try `1.0 in range(10**2, 0, -1)` and then try `1.0 in range(10**10, 0, -1)` – wim Jan 05 '19 at 23:15
@MartijnPieters: I think you misread my comment. I'm answering the first comment, which uses `x in xrange(…)`, which is distinctly not python3 and distinctly a bad idea. – Clément Jan 06 '19 at 17:27
@Clément yes, you are using `xrange` but while a lot of people know to translate that to `range()` in Python 3 not everyone is aware that there `range()` containment testing is plenty efficient. – Martijn Pieters Jan 06 '19 at 17:58
@PM2Ring (Python 3.7) `timeit 90 in range(10, 500)` --> 321 ns ± 34.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) `timeit 10 <= 90 < 500` --> 46 ns ± 0.0847 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each) I’m still wondering where `in` tests on `range` are "quite" efficient w.r.t. comparison operators. It’s 7 times slower. – Alexandre Huat Feb 10 '20 at 14:42
1

@AlexandreHuat Your timing includes the overhead of creating a new `range` instance each time. Using a single, *pre-existing* instance the "integer in range" test is about 40% faster in my timings. – MisterMiyagi Feb 10 '20 at 15:54
@MisterMiyagi I took the first comment literally but in the _pre-instantiated_ case you're right. I reduced the time to ~100 ns with `timeit -s "r = range(10, 500)" "90 in r"` which is equivalent to `timeit -s "r = range(10, 500)" "r.start <= 90 < r.stop"`. – Alexandre Huat Feb 11 '20 at 18:40
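A short sketch of the `range` membership tests discussed in these comments, assuming Python 3 (3.2 or later, where integer membership in a `range` is computed rather than found by iterating). The variable names are illustrative:

```python
# Reuse one range object instead of building a new one for every test.
r = range(10, 500)

in_range = 90 in r
# Equivalent comparison, valid here only because the step is 1.
by_comparison = r.start <= 90 < r.stop

# With a step, 'in' also accounts for the stride, and integer membership
# does not need to walk the elements even for a very large range.
big = range(0, 10**12, 2)
even_hit = 10**11 in big
odd_miss = (10**11 + 1) in big

print(in_range, by_comparison, even_hit, odd_miss)
```

As noted above, this shortcut applies to integers; a non-integer such as `1.0` may fall back to a linear scan, so it is left out of the sketch.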

score -6 · Answer 9 · edited Nov 23 '22 at 08:30

If you have something like this:

t.has_key(ew)

change it to the following to run on Python 3.x and above:

key = ew
if key not in t

edited Nov 23 '22 at 08:30

Gino Mempin

answered Jan 24 '17 at 00:21

Harshita Jhavar

11

No, you inverted the test. `t.has_key(ew)` returns `True` if the value `ew` references is also a key in the dictionary. `key not in t` returns `True` if the value is ***not*** in the dictionary. Moreover, the `key = ew` alias is very, very redundant. The correct spelling is `if ew in t`. Which is what the accepted answer from 8 years prior already told you. – Martijn Pieters Jan 05 '19 at 19:17
