1105

Given:

>>> d = {'a': 1, 'b': 2}

Which of the following is the best way to check if 'a' is in d?

>>> 'a' in d
True
>>> d.has_key('a')
True
Mateen Ulhaq
igorgue

9 Answers

1600

`in` is definitely more Pythonic.

In fact, `has_key()` was removed in Python 3.x.

bluish
tonfa
  • 3
    As an addition, in Python 3, to check for the existence in values, instead of the keys, try >>> 1 in d.values() – riza Aug 24 '09 at 18:12
  • 270
    One semi-gotcha to avoid though is to make sure you do: "key in some_dict" rather than "key in some_dict.keys()". Both are equivalent semantically, but performance-wise the latter is much slower (O(n) vs O(1)). I've seen people do the "in dict.keys()" thinking it's more explicit & therefore better. – Adam Parkin Nov 09 '11 at 20:55
  • 3
    @AdamParkin I demonstrated your comment in my answer http://stackoverflow.com/a/41390975/117471 – Bruno Bronosky Dec 30 '16 at 05:17
  • 12
    @AdamParkin In Python 3, `keys()` is just a set-like view into a dictionary rather than a copy, so `x in d.keys()` is O(1). Still, `x in d` is more Pythonic. – Arthur Tacca Aug 01 '18 at 08:48
  • 1
    @ArthurTacca interesting, so why is ```x in d.keys()``` so much slower than ```x in d```? (see the other answer by @BrunoBronosky with timeit runs) You're right though it does appear to be O(1), but a higher constant factor (I'm seeing about 0.0361 vs 0.133 usec between the two doing the timeit test locally regardless of dict size in Python 3.7) – Adam Parkin Aug 01 '18 at 21:11
  • 3
    @AdamParkin Interesting, I didn't see that. I suppose it's because `x in d.keys()` must construct and destroy a temporary object, complete with the memory allocation that entails, whereas `x in d` is just doing an arithmetic operation (computing the hash) and doing a lookup. Note that the `d.keys()` version only takes about 10 times as long as this, which is still not long really. I haven't checked but I'm still pretty sure it's only O(1). – Arthur Tacca Aug 02 '18 at 08:40
  • Is the underlying implementation of `in` equivalent to `dict.get(x) is not None`? – Work of Art Nov 17 '19 at 16:47
  • 3
    @WorkofArt It can't be as `None` is a valid dictionary value. – Selcuk Feb 11 '20 at 03:40
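The point made in the last two comments can be checked directly. The following is a minimal sketch, not a description of how `in` is implemented; the dictionary contents and the `_MISSING` sentinel name are made up for illustration.

```python
# A key whose value is None is still a key, so the two tests can disagree.
d = {'a': 1, 'b': None}

key_present = 'b' in d                    # membership test: is the key there?
value_not_none = d.get('b') is not None   # value test: is the stored value not None?

print(key_present, value_not_none)

# If get() must be used, a unique sentinel default (hypothetical name) can
# distinguish "missing" from "present but None".
_MISSING = object()
present_via_get = d.get('b', _MISSING) is not _MISSING
print(present_via_get)
```

For a plain existence check, `key in d` remains the simpler spelling; the sentinel form is only worth it when the value is needed in the same step.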
273

in wins hands-down, not just in elegance (and not being deprecated;-) but also in performance, e.g.:

$ python -mtimeit -s'd=dict.fromkeys(range(99))' '12 in d'
10000000 loops, best of 3: 0.0983 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'd.has_key(12)'
1000000 loops, best of 3: 0.21 usec per loop

While the following observation is not always true, you'll notice that usually, in Python, the faster solution is more elegant and Pythonic; that's why -mtimeit is SO helpful -- it's not just about saving a hundred nanoseconds here and there!-)

Alex Martelli
115

According to the Python docs:

has_key() is deprecated in favor of key in d.

jamylak
Nadia Alramli
46

Use dict.has_key() if (and only if) your code is required to be runnable by Python versions earlier than 2.2 (when key in dict was introduced).

Mike Samuel
John Machin
  • 4
    The WebSphere update in 2013 uses Jython 2.1 as its main scripting language. So this is unfortunately still a useful thing to note, five years after you noted it. – ArtOfWarfare Sep 24 '14 at 11:49
27

There is one case where `in` actually kills your performance.

If you use `in` on an O(1) container that only implements __getitem__ and has_key() but not __contains__, you will turn an O(1) search into an O(N) search, because `in` falls back to a linear search via __getitem__.

The fix is trivial:

def __contains__(self, x):
    return self.has_key(x)
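The fallback described above can be observed by counting `__getitem__` calls. This is a sketch under assumptions: `IndexOnly` and `WithContains` are hypothetical classes written for this illustration (in Python 3 syntax), and the container is indexed by integers, which is what the legacy iteration protocol expects.

```python
class IndexOnly:
    """Hypothetical container that defines __getitem__ but not __contains__."""

    def __init__(self, items):
        self._items = list(items)
        self.calls = 0  # number of __getitem__ calls, for illustration

    def __getitem__(self, index):
        self.calls += 1
        # IndexError past the end is what stops the fallback scan.
        return self._items[index]


class WithContains(IndexOnly):
    """Same container with an explicit O(1) membership test."""

    def __init__(self, items):
        super().__init__(items)
        self._lookup = set(self._items)

    def __contains__(self, x):
        return x in self._lookup


slow = IndexOnly(range(1000))
found_slow = 999 in slow   # tries slow[0], slow[1], ... until a match

fast = WithContains(range(1000))
found_fast = 999 in fast   # answered by __contains__, no indexing

print(found_slow, slow.calls)
print(found_fast, fast.calls)
```

Built-in `dict` always defines `__contains__`, so this only matters for custom containers.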
schlenk
  • 9
    This answer was applicable when it was posted, but 99.95% of readers can safely ignore it. In _most_ cases, if you're working with something this obscure you'll know it. – wizzwizz4 Jul 27 '18 at 13:17
  • 4
    This really is not an issue. `has_key()` is *specific to Python 2 dictionaries*. `in` / `__contains__` is the correct API to use; for those containers where a full scan is unavoidable there is no `has_key()` method *anyway*, and if there is a O(1) approach then that'll be use-case specific and so up to the developer to pick the right data type for the problem. – Martijn Pieters Jan 05 '19 at 19:25
25

Solution for "dict.has_key() is deprecated, use 'in'" (Sublime Text 3 editor)

Here is an example using a dictionary named 'ages':

ages = {}

# Add a few names to the dictionary
ages['Sue'] = 23
ages['Peter'] = 19
ages['Andrew'] = 78
ages['Karren'] = 45

# Use 'in' in the if condition instead of dict_name.has_key(key_name).
# The %-formatted print() call below runs on both Python 2 and Python 3.
if 'Sue' in ages:
    print("Sue is in the dictionary. She is %d years old" % ages['Sue'])
else:
    print("Sue is not in the dictionary")
Greena modi
  • 7
    Correct, but it was already answered. Welcome to Stack Overflow, and thanks for the example, but always check the existing answers first! – igorgue Feb 23 '16 at 19:51
  • 1
    @igorgue I'm not sure about the downvotes on her answer. It might be similar to the existing answers, but she provides an example. Isn't that worthy enough to be an answer on SO? – Akshat Agarwal May 22 '16 at 13:34
20

Expanding on Alex Martelli's performance tests with Adam Parkin's comments...

$ python3.5 -mtimeit -s'd=dict.fromkeys(range( 99))' 'd.has_key(12)'
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/timeit.py", line 301, in main
    x = t.timeit(number)
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/timeit.py", line 178, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
    d.has_key(12)
AttributeError: 'dict' object has no attribute 'has_key'

$ python2.7 -mtimeit -s'd=dict.fromkeys(range(  99))' 'd.has_key(12)'
10000000 loops, best of 3: 0.0872 usec per loop

$ python2.7 -mtimeit -s'd=dict.fromkeys(range(1999))' 'd.has_key(12)'
10000000 loops, best of 3: 0.0858 usec per loop

$ python3.5 -mtimeit -s'd=dict.fromkeys(range(  99))' '12 in d'
10000000 loops, best of 3: 0.031 usec per loop

$ python3.5 -mtimeit -s'd=dict.fromkeys(range(1999))' '12 in d'
10000000 loops, best of 3: 0.033 usec per loop

$ python3.5 -mtimeit -s'd=dict.fromkeys(range(  99))' '12 in d.keys()'
10000000 loops, best of 3: 0.115 usec per loop

$ python3.5 -mtimeit -s'd=dict.fromkeys(range(1999))' '12 in d.keys()'
10000000 loops, best of 3: 0.117 usec per loop
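The same comparison can be run from Python with the standard `timeit` module. This is a rough sketch for Python 3: the loop counts are arbitrary, the `results` dictionary is just a convenience for this example, and absolute numbers will vary by machine and interpreter version. The setup statement is executed outside the timed loop, so dictionary creation is not part of the measurement.

```python
import timeit

NUMBER = 100000  # loops per timing run (arbitrary choice)
results = {}

for size in (99, 1999):
    setup = "d = dict.fromkeys(range(%d))" % size
    for stmt in ("12 in d", "12 in d.keys()"):
        # repeat() returns one total time per run; take the best run.
        best = min(timeit.repeat(stmt, setup=setup, number=NUMBER, repeat=3))
        usec_per_loop = best / NUMBER * 1e6
        results[(size, stmt)] = usec_per_loop
        print("size=%-5d %-16s %.4f usec per loop" % (size, stmt, usec_per_loop))
```

If both statements take roughly the same time at both sizes, that is consistent with constant-time lookup, with `d.keys()` adding a fixed overhead for creating the view object.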
Bruno Bronosky
  • Wonderful statistics, _sometimes_ implicit might be better than explicit (at least in efficiency)... – varun Mar 30 '18 at 05:06
  • Thank you, @varun. I had forgotten about this answer. I need to do this kind of testing more often. I regularly read long threads where people argue about **The Best Way™** to do things. But I rarely remember how easy this was to get **proof**. – Bruno Bronosky Mar 30 '18 at 14:47
  • This experiment has a defect: it mixes the dict creation time with the key lookup time. It is better to separate the two and measure only the time spent on the lookup. Once you separate them, the timings show that both 'key in D' and 'key in D.keys()' appear to be O(1). There is no essential difference: although 'key in D.keys()' is a bit slower than 'key in D', it is not O(N) vs O(1). – water stone Sep 03 '20 at 22:56
  • I used Python 3, so my conclusion applies to Python 3 (in Python 2 it is likely O(N) vs O(1)), but I did not see that in Python 3. – water stone Sep 03 '20 at 23:54
15

has_key is a dictionary method, but `in` works on any collection. Even when __contains__ is missing, `in` falls back to iterating over the collection (via __iter__, or via __getitem__ with integer indexes) to find out.
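A small sketch of that generality follows. `Countdown` is a hypothetical class invented for this example; it defines only `__iter__`, so `in` has to iterate over it.

```python
class Countdown:
    """Hypothetical iterable with __iter__ but no __contains__."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


checks = {
    'list': 3 in [1, 2, 3],
    'str': 'ell' in 'hello',
    'set': 3 in {1, 2, 3},
    'dict': 'a' in {'a': 1},                   # tests keys, not values
    'generator': 3 in (n for n in range(5)),   # consumes the generator
    'custom': 3 in Countdown(5),               # falls back to __iter__
}
not_there = 9 in Countdown(5)

for name, result in checks.items():
    print(name, result)
print('missing value:', not_there)
```

Note that the iteration fallback is a linear scan, so it is general but not necessarily fast.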

u0b34a0f6ae
  • 1
    It also works on iterables such as `xrange`: "x in xrange(90, 200) <=> 90 <= x < 200" – u0b34a0f6ae Aug 28 '09 at 13:21
  • 1
    …: This looks like a very bad idea: 50 operations instead of 2. – Clément Sep 22 '16 at 22:12
  • 1
    @Clément In Python 3, it's actually quite efficient to do `in` tests on `range` objects. I'm not so sure about its efficiency on Python 2 `xrange`, though. ;) – PM 2Ring Nov 29 '18 at 18:00
  • 1
    @Clément not in Python 3; `__contains__` can trivially *calculate* if a value is in the range or not. – Martijn Pieters Jan 05 '19 at 19:21
  • @PM2Ring Not necessarily. Try `1.0 in range(10**2, 0, -1)` and then try `1.0 in range(10**10, 0, -1)` – wim Jan 05 '19 at 23:15
  • @MartijnPieters: I think you misread my comment. I'm answering the first comment, which uses `x in xrange(…)`, which is distinctly not python3 and distinctly a bad idea. – Clément Jan 06 '19 at 17:27
  • @Clément yes, you are using `xrange` but while a lot of people know to translate that to `range()` in Python 3 not everyone is aware that there `range()` containment testing is plenty efficient. – Martijn Pieters Jan 06 '19 at 17:58
  • @PM2Ring (Python 3.7) `timeit 90 in range(10, 500)` --> 321 ns ± 34.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) `timeit 10 <= 90 < 500` --> 46 ns ± 0.0847 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each) I’m still wondering where `in` tests on `range` are "quite" efficient w.r.t. comparison operators. It’s 7 times slower. – Alexandre Huat Feb 10 '20 at 14:42
  • 1
    @AlexandreHuat Your timing includes the overhead of creating a new ``range`` instance each time. Using a single, *pre-existing* instance the "integer in range" test is about 40% faster in my timings. – MisterMiyagi Feb 10 '20 at 15:54
  • @MisterMiyagi I took the first comment literally but in the _pre-instantiated_ case you're right. I reduced the time to ~100 ns with `timeit -s "r = range(10, 500)" "90 in r"` which is equivalent to `timeit -s "r = range(10, 500)" "r.start <= 90 < r.stop"`. – Alexandre Huat Feb 11 '20 at 18:40
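The `range` behaviour discussed in these comments can be sketched as follows for Python 3. The specific values are arbitrary, and the float case is kept to a small range on purpose because it is checked element by element.

```python
r = range(10, 500)  # build the range once, as suggested in the comments

in_range = 90 in r
by_comparison = r.start <= 90 < r.stop   # equivalent only because the step is 1
print(in_range, by_comparison)

# For integers, membership in a range is computed rather than scanned,
# so even a very large range answers immediately.
huge = range(10**10, 0, -1)
in_huge = 12345 in huge
print(in_huge)

# A float is compared against the elements one at a time instead.
float_hit = 1.0 in range(100, 0, -1)
print(float_hit)
```

With a step other than 1, the comparison form would also need a divisibility check, which is one reason `x in r` is the less error-prone spelling.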
-6

If you have something like this:

t.has_key(ew)

change it to below for running on Python 3.X and above:

key = ew
if key not in t
Gino Mempin
  • 11
    No, you inverted the test. `t.has_key(ew)` returns `True` if the value `ew` references is also a key in the dictionary. `key not in t` returns `True` if the value is ***not*** in the dictionary. Moreover, the `key = ew` alias is very, very redundant. The correct spelling is `if ew in t`. Which is what the accepted answer from 8 years prior already told you. – Martijn Pieters Jan 05 '19 at 19:17
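The correction from the comment above, written out as a small sketch. The dictionary contents are invented for illustration; `t` and `ew` are the names used in the answer.

```python
t = {'spam': 1}   # example data, not from the original answer
ew = 'spam'

# Python 2 only:  t.has_key(ew)
# Equivalent test that works on both Python 2 and Python 3:
present = ew in t

# The negated form is a different test and answers the opposite question.
absent = ew not in t

if ew in t:
    print("found", ew)
```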