299

Assuming connectionDetails is a Python dictionary, what's the best, most elegant, most "pythonic" way of refactoring code like this?

if "host" in connectionDetails:
    host = connectionDetails["host"]
else:
    host = someDefaultValue
martineau
mnowotka

10 Answers

447

Like this:

host = connectionDetails.get('host', someDefaultValue)
Solomon Ucko
MattH
  • 59
    Note that the second argument is a value, not a key. – Marcin Feb 20 '12 at 09:49
  • 2
    The if/else example is looking up the key in the dictionary twice, while the default example might only be doing one lookup. Besides dictionary lookups potentially being costly in more extreme cases (where you probably shouldn't use Python to begin with), dictionary lookups are function calls too. But I am seeing that the if/else takes about 1/3 less time with my test using string keys and int values in Python 3.4, and I'm not sure why. – sudo Mar 11 '17 at 21:22
  • 1
    One gotcha I found here is that there is a difference between a missing key, which will return the default value, or a key present with the value set to `None` which will not get defaulted. To handle those sorts of values you can do: `host = connectionDetails.get('host')` and then `host = host if host is not None else someDefaultValue` – Eliot Apr 07 '23 at 00:10
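To make the missing-key versus `None` distinction from the comments concrete, here is a minimal sketch. The dictionaries and the default value `"localhost"` are made up for illustration and are not part of the original question.

```python
# Hypothetical values, chosen only to illustrate the behaviour.
some_default = "localhost"

missing = {}                    # no "host" key at all
explicit_none = {"host": None}  # key present, value is None

# .get() only falls back to the default when the key is absent.
host_a = missing.get("host", some_default)
host_b = explicit_none.get("host", some_default)

# One way to treat an explicit None like a missing key.
raw = explicit_none.get("host")
host_c = raw if raw is not None else some_default

print(host_a, host_b, host_c)  # localhost None localhost
```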
142

You can also use a defaultdict, like so:

from collections import defaultdict
a = defaultdict(lambda: "default", key="some_value")
a["blabla"]  # => "default"
a["key"]     # => "some_value"

You can pass any ordinary function instead of a lambda:

from collections import defaultdict

def a():
    return 4

b = defaultdict(a, key="some_value")
b['absent']  # => 4
b['key']     # => "some_value"
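One behaviour worth knowing before choosing a defaultdict: indexing a missing key stores the generated default in the dictionary, while `.get()` bypasses the factory entirely. A small sketch, reusing the kind of values shown above:

```python
from collections import defaultdict

d = defaultdict(lambda: "default", key="some_value")

# .get() does not call the default factory; it returns None (or its own
# default argument) for a missing key and leaves the dictionary unchanged.
via_get = d.get("absent")
keys_after_get = list(d)

# Indexing a missing key calls the factory AND stores the result.
via_index = d["absent"]
keys_after_index = list(d)

print(via_get, keys_after_get)      # None ['key']
print(via_index, keys_after_index)  # default ['key', 'absent']
```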
xpmatteo
tamerlaha
33

While .get() is a nice idiom, it is slower than if/else in the timings below (and slower than try/except if the key can be expected to be present in the dictionary most of the time):

>>> import timeit
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
... stmt="try:\n a=d[1]\nexcept KeyError:\n a=10")
0.07691968797894333
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="try:\n a=d[2]\nexcept KeyError:\n a=10")
0.4583777282275605
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="a=d.get(1, 10)")
0.17784020746671558
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="a=d.get(2, 10)")
0.17952161730158878
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="if 1 in d:\n a=d[1]\nelse:\n a=10")
0.10071221458065338
>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}", 
... stmt="if 2 in d:\n a=d[2]\nelse:\n a=10")
0.06966537335119938
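Timings like these depend heavily on the interpreter version and the machine, so it may be worth rerunning them locally. The following is a rough sketch of how that could be done; the helper name `time_stmt` and the `number`/`repeat` values are my own choices, not part of the original measurements.

```python
import timeit

SETUP = "d = {1: 2, 3: 4, 5: 6, 7: 8, 9: 0}"

# The same six statements as in the session above.
STATEMENTS = {
    "try/except, key present": "try:\n a = d[1]\nexcept KeyError:\n a = 10",
    "try/except, key missing": "try:\n a = d[2]\nexcept KeyError:\n a = 10",
    "get, key present": "a = d.get(1, 10)",
    "get, key missing": "a = d.get(2, 10)",
    "if/else, key present": "if 1 in d:\n a = d[1]\nelse:\n a = 10",
    "if/else, key missing": "if 2 in d:\n a = d[2]\nelse:\n a = 10",
}


def time_stmt(stmt, number=100_000, repeat=3):
    """Return the best of `repeat` runs of `number` executions, in seconds."""
    return min(timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat))


results = {label: time_stmt(stmt) for label, stmt in STATEMENTS.items()}
for label, seconds in results.items():
    print(f"{label:26s} {seconds:.4f} s")
```

Taking the minimum of several repeats reduces noise from other processes; the absolute numbers will not match the ones quoted above.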
Tim Pietzcker
  • 4
    I still don't see _why_ `if/then` would be faster. Both cases require a dictionary lookup, and unless the invocation of `get()` is _so_ much slower, what else accounts for the slowdown? – Jens Mar 13 '15 at 21:33
  • 3
    @Jens: Function calls are expensive. – Tim Pietzcker Mar 13 '15 at 21:35
  • 4
    Which shouldn't be a big deal in a heavily populated dictionary, correct? Meaning the function call is not going to matter much if the actual lookup is costly. It probably only matters in toy examples. – AturSams May 14 '15 at 10:56
  • 2
    @zehelvion: Dictionary lookup is `O(1)` regardless of dictionary size, so the function call overhead is relevant. – Tim Pietzcker May 14 '15 at 11:42
  • 1
    @TimPietzcker Isn't O(1) an idealization that assumes no collisions? Is this a safe assumption for large dictionaries? – irh Jul 05 '15 at 17:59
  • 1
    @irh: Yes, it assumes no collisions. The [amortized worst case](https://wiki.python.org/moin/TimeComplexity) is `O(n)`, but in practice it's rather difficult to construct a dictionary with hash collisions. – Tim Pietzcker Jul 05 '15 at 18:21
  • 53
    it is bizarre if the overhead of calling a function would make you decide against using get. Use what your fellow team members can read best. – Jochen Bedersdorfer Apr 04 '16 at 20:51
  • 2
    Great analysis Tim, it looks like ternary if versions (`a=d[1] if 1 in d else 10` and `a=d[2] if 2 in d else 10`) have the same performance characteristics as the traditional if statement, with Python 3.5.1 at least. – Mark Booth Apr 26 '16 at 21:17
  • 4
    You can improve the speed of the `.get` method a little by caching it, but `.get` is also slow because it catches the `KeyError`; I presume it does that at the C level, but it's still slower than `if...else` if the `KeyError` is likely to be raised more than 10% of the time. See [here](http://stackoverflow.com/a/35451912/4014959) for some `timeit` comparisons between `in` and `.get`. – PM 2Ring Jun 21 '18 at 07:57
  • 2
    @JochenBedersdorfer "premature optimisation" syndrome :) It is good to know the relative performance of different forms to put in for when profiling shows an issue, along with a comment as to why get() isn't used so someone doesn't go replacing with a get() in the future. But otherwise, the easy to understand form should absolutely trump other considerations. – Nick May 04 '20 at 11:24
  • Please avoid premature optimization; get vs. if/else is likely immaterial to the overall performance of what most people are doing. Evaluate your performance by profiling your implementation in your environment in the soup of the rest of your or your team's decisions... – Gibron Apr 05 '23 at 22:46
21

For multiple different defaults try this:

connectionDetails = { "host": "www.example.com" }
defaults = { "host": "127.0.0.1", "port": 8080 }

completeDetails = {}
completeDetails.update(defaults)
completeDetails.update(connectionDetails)
completeDetails["host"]  # ==> "www.example.com"
completeDetails["port"]  # ==> 8080
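The same layering can be written more compactly. The sketch below shows two alternatives I would expect to behave equivalently for lookups: dictionary unpacking (Python 3.5+) and `collections.ChainMap`. The snake_case variable names are mine.

```python
from collections import ChainMap

connection_details = {"host": "www.example.com"}
defaults = {"host": "127.0.0.1", "port": 8080}

# Later mappings win, so caller-supplied values override the defaults.
merged = {**defaults, **connection_details}

# ChainMap searches its mappings left to right without copying them.
layered = ChainMap(connection_details, defaults)

print(merged["host"], merged["port"])    # www.example.com 8080
print(layered["host"], layered["port"])  # www.example.com 8080
```

The unpacking form produces an independent copy, while the ChainMap keeps referring to the two original dictionaries and reflects later changes to them.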
Jerome Baum
  • 3
    This is a good idiomatic solution, but there is a pitfall. Unexpected outcomes may result if connectionDetails is supplied with `None` or the emptyString as one of the values in the key-value pairs. The `defaults` dictionary could potentially have one of its values unintentionally blanked out. (see also https://stackoverflow.com/questions/6354436) – dreftymac May 29 '17 at 18:05
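A sketch of the pitfall described in this comment, and one possible way around it. Whether `None` should count as "not supplied" is an assumption that depends on the application; the values below are illustrative.

```python
defaults = {"host": "127.0.0.1", "port": 8080}
connection_details = {"host": None, "port": 9000}

# Plain update(): the explicit None overwrites the default host.
naive = dict(defaults)
naive.update(connection_details)

# Drop None values first so those keys fall back to the defaults.
filtered = dict(defaults)
filtered.update({k: v for k, v in connection_details.items() if v is not None})

print(naive)     # {'host': None, 'port': 9000}
print(filtered)  # {'host': '127.0.0.1', 'port': 9000}
```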
17

This is not exactly what the question asked for, but Python dictionaries have a method for it: dict.setdefault

    host = connectionDetails.setdefault('host', someDefaultValue)

However, unlike the code in the question, this method also sets connectionDetails['host'] to someDefaultValue if the key 'host' is not already present.
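The side effect is easiest to see side by side with `.get()`. A minimal sketch, with a made-up default value:

```python
some_default = "localhost"  # hypothetical default

details_get = {"port": 80}
details_setdefault = {"port": 80}

host_1 = details_get.get("host", some_default)
host_2 = details_setdefault.setdefault("host", some_default)

# Both calls return the default, but only setdefault() stores it.
print(host_1, details_get)         # localhost {'port': 80}
print(host_2, details_setdefault)  # localhost {'port': 80, 'host': 'localhost'}
```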

Sriram
  • 5
    Note that `setdefault()` returns value, so this works as well: `host = connectionDetails.setdefault('host', someDefaultValue)`. Just beware that it will set `connectionDetails['host']` to default value if the key wasn't there before. – ash108 Oct 16 '16 at 18:41
11

(this is a late answer)

An alternative is to subclass the dict class and implement the __missing__() method, like this:

class ConnectionDetails(dict):
    def __missing__(self, key):
        if key == 'host':
            return "localhost"
        raise KeyError(key)

Examples:

>>> connection_details = ConnectionDetails(port=80)

>>> connection_details['host']
'localhost'

>>> connection_details['port']
80

>>> connection_details['password']
Traceback (most recent call last):
  File "python", line 1, in <module>
  File "python", line 6, in __missing__
KeyError: 'password'
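If several keys need defaults, the same idea can be driven by a table instead of an if chain. This is only a sketch: the `DEFAULTS` attribute and its values are hypothetical. It also shows a caveat: `dict.get()` and the `in` operator do not go through `__missing__`.

```python
class ConnectionDetails(dict):
    """dict subclass with a table of per-key defaults (hypothetical values)."""

    DEFAULTS = {"host": "localhost", "port": 80}

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is absent.
        try:
            return self.DEFAULTS[key]
        except KeyError:
            raise KeyError(key) from None


details = ConnectionDetails(user="admin")

by_index = details["host"]     # goes through __missing__
by_get = details.get("host")   # dict.get() does not call __missing__
in_dict = "host" in details    # __missing__ stores nothing in the dict

print(by_index, by_get, in_dict)  # localhost None False
```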
Laurent LAPORTE
  • This is the most obvious method for me, but it's still a little clumsy when defining the dictionary, as you can no longer use dictionary constructors. – Stephen Ellwood Jun 07 '22 at 12:18
5

Testing @Tim Pietzcker's suspicion about the situation in PyPy (5.2.0-alpha0) for Python 3.3.5, I find that .get() and the if/else approach do indeed perform similarly. It even seems that the if/else case needs only a single lookup when the condition and the assignment involve the same key (compare with the last case, where there are two lookups).

>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="try:\n a=d[1]\nexcept KeyError:\n a=10")
0.011889292989508249
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="try:\n a=d[2]\nexcept KeyError:\n a=10")
0.07310474599944428
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="a=d.get(1, 10)")
0.010391917996457778
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="a=d.get(2, 10)")
0.009348208011942916
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="if 1 in d:\n a=d[1]\nelse:\n a=10")
0.011475925013655797
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="if 2 in d:\n a=d[2]\nelse:\n a=10")
0.009605801998986863
>>>> timeit.timeit(setup="d={1:2, 3:4, 5:6, 7:8, 9:0}",
.... stmt="if 2 in d:\n a=d[2]\nelse:\n a=d[1]")
0.017342638995614834
Massimiliano Kraus
Till
5

You can use dict.get() for default values.

d = {"a": 1, "b": 2}
x = d.get("a", 5)
y = d.get("c", 6)

# This will give
# x = 1, y = 6
# as the result

Since "a" is in the keys, x = d.get("a", 5) returns the associated value 1. Since "c" is not in the keys, y = d.get("c", 6) returns the default value 6.

Kavindu Ravishka
2

You can use a lambda function for this as a one-liner. Make a new object connectionDetails2, which is called like a function:

connectionDetails2 = lambda k: connectionDetails[k] if k in connectionDetails else "DEFAULT"

Now use

connectionDetails2(k)

instead of

connectionDetails[k]

which returns the dictionary value if k is in the keys, and "DEFAULT" otherwise.

CasualScience
0

I am sure that all these answers are OK, but they show that there is no 'nice' way of doing this. I use dictionaries instead of case statements all the time, and to add a default clause I just call the following function:

def choose(choice, choices, default):
    """Return choices[choice], or default if choice is not a key of choices."""
    return choices[choice] if choice in choices else default

That way I can use normal dictionaries without special default clauses etc.
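For a plain dict, the helper above gives the same result as `dict.get()` with a default, so the dictionary-as-case-statement pattern can also be sketched as follows. The function name `describe_status` and the status codes are hypothetical examples.

```python
def describe_status(code):
    """Map a status code to a label, with a fallback for unknown codes."""
    labels = {200: "ok", 404: "not found", 500: "server error"}
    # The second argument plays the role of the "default" clause.
    return labels.get(code, "unknown")


print(describe_status(404))  # not found
print(describe_status(418))  # unknown
```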

Eric Aya
Stephen Ellwood