Dictionary value initialization on lookup

Question

Consider the following:

mylist = ['a','b','c','d']
mydict = {}
for val in mylist:
    if val not in mydict:
        mydict[val] = []
    mydict[val].append(1)

Is there any way to avoid the double lookup ("val in mydict" and "mydict[val]")?

Note - I've tried using a default return value (i.e. mydict.get(val, []).append(1)), but the new list isn't actually registered in the dictionary as the value for the key. For example:

mydict = {}
mydict.get('not_in_dict','default')
mydict

returns {}. In my case I want something that would return {'not_in_dict' : 'default'}.

Or is the correct answer here that I should't worry about double lookups? (e.g. "chill, dude, don't optimize if you don't have to" or "python is awesome, it takes care of this already").

I'm using python 3.4.

@Ashwini - thank you for pointing out the duplicate question. The answer there was just what I was looking for. — Gordon Bean, Sep 30 '14 at 22:36

Padraic Cunningham · Accepted Answer · 2014-09-30T22:35:03.543

2

You can use dict.setdefault

mylist = ['a','b','c','d']
mydict = {}
for val in mylist:
    mydict.setdefault(val,[])
    mydict[val].append(1)

Or use collections.defaultdict which is more efficient

from collections import defaultdict
mylist = ['a','b','c','d']
mydict = defaultdict(list)
for val in mylist:
    mydict[val].append(1)


In [14]: mylist = ['a','b','c','d']    
In [15]: mydict = {}    
In [16]: %%timeit
   ....: for val in mylist:
   ....:     mydict.setdefault(val,[])
   ....:     mydict[val].append(1)
   ....: 
1000000 loops, best of 3: 1.51 µs per loop

In [18]: mydict = defaultdict(list)
In [19]: %%timeit
   ....: for val in mylist:
   ....:     mydict[val].append(1)
   ....: 
1000000 loops, best of 3: 603 ns per loop

edited Sep 30 '14 at 22:35

answered Sep 30 '14 at 22:28

Padraic Cunningham

176,452
29
245
321

It appears that your example is still doing two lookups: mydict.setdefault(val,[]) and then again with mydict[val].append(1). Does the time change when you chain the two commands: mydict.setdefault(val,[]).append(1)? – Gordon Bean Sep 30 '14 at 22:39
the most efficient solution would be `defaultdict`, your original code is not far behind, checking if a key exists is constant time so it is pretty cheap to check if the keys exists so your own code is pretty efficient. – Padraic Cunningham Sep 30 '14 at 22:43
I guess then the answer really is that it doesn't matter all that much. I'm sure there is a python quote somewhere out there about premature optimization being the root of all evil. :) – Gordon Bean Sep 30 '14 at 22:45
There sure is, defaultdict is a handy tool though so worth learning about. As far as optimization goes I always get my code completed and working correctly then worry about optimisation if needed. – Padraic Cunningham Sep 30 '14 at 22:50

score 0 · Answer 2 · answered Sep 30 '14 at 22:28

0

from collections import defaultdict

mydict = defaultdict(list)
mydict['not in dict'] = 'foo'

answered Sep 30 '14 at 22:28

m.wasowski

6,329
1
23
30

Dictionary value initialization on lookup

2 Answers2