2

I am trying to track seen elements, from a big array, using a dict. Is there a way to force a dictionary object to be integer type and set to zero by default upon initialization?

I have done this with very clunky code and two loops.

Here is what I do now:

fl = [0, 1, 1, 2, 1, 3, 4]
seenit = {}

for val in fl:
    seenit[val] = 0

for val in fl:
    seenit[val] = seenit[val] + 1
snakecharmerb
giggle usa
  • *"force a dictionary ... set to zero by default upon initialization"* -- [`defaultdict`](https://docs.python.org/2/library/collections.html#collections.defaultdict) – meowgoesthedog Mar 24 '19 at 14:51
  • There's `seenit = dict.fromkeys(fl, 0)` to replace the first loop, but the current answers provide better solutions for replacing both loops at the same time. – chepner Mar 24 '19 at 15:05

3 Answers

4

Of course, just use collections.defaultdict([default_factory[, ...]]):

from collections import defaultdict

fl = [0, 1, 1, 2, 1, 3, 4]

seenit = defaultdict(int)

for val in fl:
    seenit[val] += 1

print(seenit)
# Output
defaultdict(<class 'int'>, {0: 1, 1: 3, 2: 1, 3: 1, 4: 1})

print(dict(seenit))
# Output
{0: 1, 1: 3, 2: 1, 3: 1, 4: 1}

Alternatively, if you would rather not import collections, you can use dict.get(key[, default]):

fl = [0, 1, 1, 2, 1, 3, 4]

seenit = {}

for val in fl:
    seenit[val] = seenit.get(val, 0) + 1

print(seenit)
# Output
{0: 1, 1: 3, 2: 1, 3: 1, 4: 1}

Also, if you only want to solve the problem and don't need a plain dictionary, you can use collections.Counter([iterable-or-mapping]):

from collections import Counter

fl = [0, 1, 1, 2, 1, 3, 4]

seenit = Counter(fl)

print(seenit)
# Output
Counter({1: 3, 0: 1, 2: 1, 3: 1, 4: 1})

print(dict(seenit))
# Output
{0: 1, 1: 3, 2: 1, 3: 1, 4: 1}

Both collections.defaultdict and collections.Counter can be indexed as dictionary[key] and support .keys(), .values(), .items(), etc. Both are subclasses of the built-in dict.
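To make the "subclass of dict" point concrete, here is a small sketch using the same `fl` list. The variable names `counts_dd` and `counts_c` are only illustrative. It also shows one difference worth knowing: reading a missing key from a `defaultdict` inserts that key, while a `Counter` returns 0 without storing anything.

```python
from collections import Counter, defaultdict

fl = [0, 1, 1, 2, 1, 3, 4]

counts_dd = defaultdict(int)
for val in fl:
    counts_dd[val] += 1

counts_c = Counter(fl)

# Both types pass an isinstance check against the built-in dict.
print(isinstance(counts_dd, dict), isinstance(counts_c, dict))  # True True

# The usual dict methods work on both.
print(sorted(counts_dd.items()))  # [(0, 1), (1, 3), (2, 1), (3, 1), (4, 1)]
print(dict(counts_dd) == dict(counts_c))  # True

# Reading a missing key: defaultdict stores it, Counter does not.
print(counts_dd[99], 99 in counts_dd)  # 0 True
print(counts_c[99], 99 in counts_c)    # 0 False
```

If the silent insertion matters for your use case, converting with `dict(counts_dd)` once counting is done gives back a plain dictionary that raises `KeyError` on missing keys.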

If you want to talk about performance, I used timeit.timeit() to time creating the dictionary and running the loop, over a million executions:

  • collections.defaultdict: 2.160868141 seconds
  • dict.get: 1.3540439499999999 seconds
  • collections.Counter: 4.700308418999999 seconds

collections.Counter may be easier to use, but it was much slower on this 7-element list.
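A measurement along these lines could be reproduced roughly as follows. This is only a sketch: the function names and the `timings` dictionary are made up for illustration, the repeat count is reduced from a million to keep the run short, and the absolute numbers will depend on the machine and the Python version.

```python
import timeit
from collections import Counter, defaultdict

fl = [0, 1, 1, 2, 1, 3, 4]

def with_defaultdict():
    seenit = defaultdict(int)
    for val in fl:
        seenit[val] += 1
    return seenit

def with_get():
    seenit = {}
    for val in fl:
        seenit[val] = seenit.get(val, 0) + 1
    return seenit

def with_counter():
    return Counter(fl)

# Assumption: 10,000 repeats instead of 1,000,000, purely to keep this quick.
REPEATS = 10_000

timings = {}
for func in (with_defaultdict, with_get, with_counter):
    # timeit.timeit accepts a zero-argument callable and returns total seconds.
    timings[func.__name__] = timeit.timeit(func, number=REPEATS)

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

Each function builds the dictionary from scratch on every call, so the setup cost of each approach is part of what is being timed.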

Ender Look
  • Thanks. This newbie appreciates all the help. I like the defaultdict solution and will go with it. – giggle usa Mar 25 '19 at 18:30
  • @giggleusa if you find this or another answer useful you should consider accepting the answer as the correct one (pressing the gray mark under the downvote button). You would receive 2 reputation for accepting one. Otherwise, your question will stay *unanswered*. – Ender Look Mar 25 '19 at 18:38
4

You can use collections.Counter:

from collections import Counter
Counter([0, 1, 1, 2, 1, 3, 4])

Output:

Counter({1: 3, 0: 1, 2: 1, 3: 1, 4: 1})

You can then address it like a dictionary:

>>> Counter({1: 3, 0: 1, 2: 1, 3: 1, 4: 1})[1]
3
>>> Counter({1: 3, 0: 1, 2: 1, 3: 1, 4: 1})[0]
1
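Beyond plain indexing, a `Counter` has a few conveniences that suit the "track seen elements" use case. The following is a brief sketch using the same data, with `seen` as an illustrative variable name:

```python
from collections import Counter

seen = Counter([0, 1, 1, 2, 1, 3, 4])

# Indexing works like a dict.
print(seen[1])  # 3

# A key that was never counted reads as 0 instead of raising KeyError.
print(seen[99])  # 0

# most_common(n) returns the n highest counts as (element, count) pairs.
print(seen.most_common(1))  # [(1, 3)]

# update() adds further observations to the existing counts.
seen.update([4, 4])
print(seen[4])  # 3
```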
PrinceOfCreation
1

Using val in seenit is a bit faster than .get():

fl = [0, 1, 1, 2, 1, 3, 4]

seenit = {}
for val in fl:
    # Membership test first, so no default value is needed
    if val in seenit:
        seenit[val] += 1
    else:
        seenit[val] = 1

For larger lists, Counter will eventually outperform all other approaches, and defaultdict will be faster than using .get() or val in seenit.
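One way to check this on your own machine is to time all four approaches on a bigger input. The sketch below is not a definitive benchmark: the list size, the value range, the seed, and the function names are all assumptions chosen for illustration, and the ranking may vary with the Python version and the data.

```python
import random
import timeit
from collections import Counter, defaultdict

random.seed(0)  # fixed seed so the test data is repeatable
# Hypothetical test data: 200,000 integers drawn from 1,000 distinct values.
big = [random.randrange(1000) for _ in range(200_000)]

def count_in(data):
    seenit = {}
    for val in data:
        if val in seenit:
            seenit[val] += 1
        else:
            seenit[val] = 1
    return seenit

def count_get(data):
    seenit = {}
    for val in data:
        seenit[val] = seenit.get(val, 0) + 1
    return seenit

def count_defaultdict(data):
    seenit = defaultdict(int)
    for val in data:
        seenit[val] += 1
    return seenit

def count_counter(data):
    return Counter(data)

reference = count_in(big)
for func in (count_in, count_get, count_defaultdict, count_counter):
    # Every approach should produce the same counts before we compare speed.
    assert dict(func(big)) == reference
    elapsed = timeit.timeit(lambda: func(big), number=5)
    print(f"{func.__name__}: {elapsed:.3f} s for 5 passes")
```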

Alain T.