Set value in dictionary if value not there?

Question

I am using a dictionary to store a bunch of counters, where each counter counts the occurrences of a file type (.wav, .mp3, etc.).

filetypecounter = {}

When I come across a certain file type, I want to increment its counter in a Pythonic way. So I am thinking of:

filetypecounter[filetype] +=1

However, if the file type is not in the dictionary yet, I want to initialize it to 1. So my logic is: if the file type's counter is there, add 1 to its value; otherwise set it to 1.

if filetype not in filetypecounter:
    filetypecounter[filetype] = 1
else:
    filetypecounter[filetype] += 1

Is there a more pythonic way?

Rob Cowie · Accepted Answer · 2013-02-23T20:32:06.090

from collections import defaultdict

filetypecounter = defaultdict(int)
filetypecounter[filetype] += 1

or

from collections import Counter

filetypecounter = Counter()
filetypecounter.update([filetype])

For info, if you must use a dict, your solution (checking if the key is present) is a reasonable one. Perhaps a more 'pythonic' solution might be:

filetypecounter = {}
filetypecounter[filetype] = filetypecounter.get(filetype, 0) + 1

Really though, this and the other suggestions are just variations on the same theme. I'd use the Counter.
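To make the options above concrete, here is a minimal sketch that tallies extensions for a list of file names. The `filenames` list is made up for illustration, and using `os.path.splitext` to obtain the file type is an assumption, since the question does not say where `filetype` comes from.

```python
import os.path
from collections import Counter

# Hypothetical file names, for illustration only.
filenames = ["a.wav", "b.mp3", "c.mp3", "d.wav", "e.mp3", "f.flac"]

# Counter accepts any iterable of hashable items and tallies them.
filetypecounter = Counter(os.path.splitext(name)[1] for name in filenames)

# The same tally with a plain dict and dict.get().
plain = {}
for name in filenames:
    filetype = os.path.splitext(name)[1]
    plain[filetype] = plain.get(filetype, 0) + 1

print(filetypecounter.most_common())   # most frequent file types first
print(plain == dict(filetypecounter))  # both approaches agree
```

Both approaches give the same counts; Counter additionally offers helpers such as `most_common()`.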

score 2 · Answer 2 · answered Feb 23 '13 at 20:23

It looks like what you want is collections.defaultdict, or collections.Counter for Python 2.7 and up.

BrenBarn


dawg · Answer 3 · 2013-02-23T22:56:26.753

Using collections.Counter is well covered in the other answers, but it may not be the fastest choice.

One older way is this:

>>> d={}
>>> for ext in ('.mp3','.mp3','.m4a','.mp3','.wav','.m4a'):
...    d[ext]=d.setdefault(ext,0)+1
... 
>>> d
{'.mp3': 3, '.wav': 1, '.m4a': 2}

That is not the fastest either, but it is faster than collections.Counter.

There are benchmarks of these methods, and defaultdict, try/except, or your original method come out fastest.

I have reproduced (and expanded) the benchmark here:

# Python 2 code (uses urllib2 and the print statement)
import urllib2
import timeit

response = urllib2.urlopen('http://pastebin.com/raw.php?i=7p3uycAz')
hamlet = response.read().replace('\r\n','\n')
LETTERS = [w for w in hamlet]
WORDS = hamlet.split(' ')
fmt='{:>20}: {:7.4} seconds for {} loops'
n=100
print
t = timeit.Timer(stmt="""
        counter = defaultdict(int)
        for k in LETTERS:
            counter[k] += 1 
        """,
        setup="from collections import defaultdict; from __main__ import LETTERS")

print fmt.format("defaultdict letters",t.timeit(n),n)
t = timeit.Timer(stmt="""
        counter = defaultdict(int)
        for k in WORDS:
            counter[k] += 1 
        """,
        setup="from collections import defaultdict; from __main__ import WORDS")

print fmt.format("defaultdict words",t.timeit(n),n)
print

# setdefault
t = timeit.Timer(stmt="""
        counter = {}
        for k in LETTERS:
            counter[k]=counter.setdefault(k, 0)+1
        """,
        setup="from __main__ import LETTERS")
print fmt.format("setdefault letters",t.timeit(n),n)
t = timeit.Timer(stmt="""
        counter = {}
        for k in WORDS:
            counter[k]=counter.setdefault(k, 0)+1
        """,
        setup="from __main__ import WORDS")
print fmt.format("setdefault words",t.timeit(n),n)
print

# Counter
t = timeit.Timer(stmt="c = Counter(LETTERS)",
        setup="from collections import Counter; from __main__ import LETTERS")

print fmt.format("Counter letters",t.timeit(n),n)
t = timeit.Timer(stmt="c = Counter(WORDS)",
        setup="from collections import Counter; from __main__ import WORDS")
print fmt.format("Counter words",t.timeit(n),n)
print

# in
t = timeit.Timer(stmt="""
        counter = {}
        for k in LETTERS:
            if k in counter: counter[k]+=1
            else: counter[k]=1   
        """,
        setup="from __main__ import LETTERS")
print fmt.format("'in' letters",t.timeit(n),n)
t = timeit.Timer(stmt="""
        counter = {}
        for k in WORDS:
            if k in counter: counter[k]+=1
            else: counter[k]=1   
        """,
        setup="from __main__ import WORDS")
print fmt.format("'in' words",t.timeit(n),n)
print

# try
t = timeit.Timer(stmt="""
        counter = {}
        for k in LETTERS:
            try:
                counter[k]+=1
            except KeyError:
                counter[k]=1     
        """,
        setup="from __main__ import LETTERS")
print fmt.format("try letters",t.timeit(n),n)
t = timeit.Timer(stmt="""
        counter = {}
        for k in WORDS:
            try:
                counter[k]+=1
            except KeyError:
                counter[k]=1
        """,
        setup="from __main__ import WORDS")
print fmt.format("try words",t.timeit(n),n)
print "\n{:,} letters and {:,} words".format(len(list(LETTERS)),len(list(WORDS)))

Prints:

 defaultdict letters:   3.001 seconds for 100 loops
   defaultdict words:  0.8495 seconds for 100 loops

  setdefault letters:   4.839 seconds for 100 loops
    setdefault words:   0.946 seconds for 100 loops

     Counter letters:   7.335 seconds for 100 loops
       Counter words:   1.298 seconds for 100 loops

        'in' letters:   4.013 seconds for 100 loops
          'in' words:  0.7275 seconds for 100 loops

         try letters:   3.389 seconds for 100 loops
           try words:   1.571 seconds for 100 loops

175,176 letters and 26,630 words

Personally, I was surprised that try/except is one of the fastest ways to do this, at least for the letters (for the words it was the slowest). Who knew...
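The benchmark above is Python 2 code (`urllib2`, `print` statements). For anyone wanting to repeat the comparison on Python 3, here is a rough sketch of the same five methods. It is not the original benchmark: the repeated sample sentence stands in for the Hamlet text, the `with_*` function names are made up, and wrapping each method in a function adds a little call overhead. Timings will differ from the numbers printed above.

```python
import timeit
from collections import Counter, defaultdict

# Synthetic sample (an assumption; the original used the text of Hamlet).
TEXT = "to be or not to be that is the question " * 200
LETTERS = list(TEXT)
WORDS = TEXT.split()

def with_defaultdict(items):
    counter = defaultdict(int)
    for k in items:
        counter[k] += 1
    return counter

def with_setdefault(items):
    counter = {}
    for k in items:
        counter[k] = counter.setdefault(k, 0) + 1
    return counter

def with_counter(items):
    return Counter(items)

def with_in(items):
    counter = {}
    for k in items:
        if k in counter:
            counter[k] += 1
        else:
            counter[k] = 1
    return counter

def with_try(items):
    counter = {}
    for k in items:
        try:
            counter[k] += 1
        except KeyError:
            counter[k] = 1
    return counter

METHODS = [with_defaultdict, with_setdefault, with_counter, with_in, with_try]
n = 20  # loops per measurement
for func in METHODS:
    for label, data in (("letters", LETTERS), ("words", WORDS)):
        seconds = timeit.timeit(lambda: func(data), number=n)
        print("{:>26}: {:7.4f} seconds for {} loops".format(
            func.__name__ + " " + label, seconds, n))
```

With such a small, repetitive sample the ranking may not match the one for real text, so treat the output as a starting point rather than a verdict.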

ASGM · Answer 4 · 2013-02-23T20:30:51.637

1

An alternative method would be a try / except clause:

try: 
    filetypecounter[filetype] += 1
except KeyError:
    filetypecounter[filetype] = 1

If you have far fewer file types than files, this method is more efficient: instead of checking for every file whether the file type is already in filetypecounter, you assume that it is and only create a new entry when that assumption fails.
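The point about having fewer file types than files can be made concrete by counting how often the `except` branch actually runs. A small sketch, with a made-up list of file types and a `misses` variable added purely for illustration:

```python
# Hypothetical stream of file types: many files, few distinct types.
filetypes = [".mp3"] * 500 + [".wav"] * 300 + [".m4a"] * 200

filetypecounter = {}
misses = 0  # how many times the except branch runs

for filetype in filetypes:
    try:
        filetypecounter[filetype] += 1
    except KeyError:
        # Only reached the first time each file type is seen.
        misses += 1
        filetypecounter[filetype] = 1

print(misses, "KeyErrors for", len(filetypes), "files")
```

The exception is raised once per distinct file type (3 times here) and never again, so its cost matters less the more often keys repeat.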

Edit: Added KeyError in response to comment from @delnan.

edited Feb 23 '13 at 20:30

answered Feb 23 '13 at 20:25


Bare `except:` is bad because it catches more exceptions than is sensible, hiding bugs (in this case, a bug that might be hidden is a typo in `filetypecounter` or `filetype`, or the dictionary containing values that can't be incremented). – delnan Feb 23 '13 at 20:27
Thanks @delnan, answer duly modified. – ASGM Feb 23 '13 at 20:31

score 1 · Answer 5 · answered Feb 23 '13 at 20:25

I guess all you need is the Counter class from the collections module.

aemdy


martineau · Answer 6 · 2013-02-23T22:09:35.390

1

The collections.Counter class does exactly what you (really) need.
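As a short illustration of why it fits: a Counter reports 0 for keys it has not seen, so the `+= 1` from the question works without any check. The sample file types below are made up.

```python
from collections import Counter

filetypecounter = Counter()

# Missing keys read as 0, so += works without any setup.
for filetype in [".wav", ".mp3", ".mp3"]:  # made-up sample data
    filetypecounter[filetype] += 1

print(filetypecounter[".mp3"])      # 2
print(filetypecounter[".flac"])     # 0, and no KeyError is raised
print(".flac" in filetypecounter)   # False: reading a missing key does not add it
```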

edited Feb 23 '13 at 22:09

answered Feb 23 '13 at 20:28


Markus Unterwaditzer · Answer 7 · 2013-02-23T20:36:39.767

0

I think you want defaultdict:

from collections import defaultdict

d = defaultdict(lambda: 0)
d['foo'] += 1
# d['foo'] is now 1

Another idea is to use dict.setdefault:

d = {}
d.setdefault('foo', 0)  # won't override if 'foo' is already in d
d['foo'] += 1
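One difference between the two approaches is worth knowing: a defaultdict creates an entry whenever a missing key is looked up, not only when it is incremented, while a plain dict used with `setdefault` only gains the keys you explicitly set. A small sketch (the keys `'foo'` and `'bar'` are arbitrary):

```python
from collections import defaultdict

dd = defaultdict(int)
dd["foo"] += 1
_ = dd["bar"]          # a plain lookup of a missing key...
print(sorted(dd))      # ...creates it: ['bar', 'foo']

d = {}
d.setdefault("foo", 0)
d["foo"] += 1
print(d.get("bar"))    # None; d is unchanged
print(sorted(d))       # ['foo']
```

If stray zero-count entries would be a problem, check with `in` or `.get()` before reading from a defaultdict.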

edited Feb 23 '13 at 20:36

answered Feb 23 '13 at 20:28

