
I want to know whether the functionality I am trying to implement is possible in Python.

I have a global hash called Creatures. Creatures contains sub-hashes called mammals, amphibians, birds, and insects.

Mammals has sub-hashes called whales and elephants. Amphibians has sub-hashes called frogs and larvae. Birds has sub-hashes called eagle and parakeet. Insects has sub-hashes called dragonfly and mosquito.

In turn, eagle has sub-hashes called male and female.

I am counting the frequencies of all these creatures from a text file. For example, if the file is in the format below:

Birds   Eagle  Female
Mammals whales Male
Birds   Eagle  Female

I should output:

Creatures[Birds[Eagle[Female]]] = 2
Creatures[Mammals[whales[Male]]] = 1

Is this possible in Python? How can it be done? I am very new to Python, so any help is much appreciated. I am comfortable with dictionaries only up to one level, i.e. key -> value. But here there are multiple keys and multiple values, and I am not sure how to proceed. I am using Python 2.6. Thanks in advance!

J0HN
Justin Carrey

3 Answers


The value assigned to a key in a dictionary can itself be another dictionary:

creatures = dict()
creatures['birds'] = dict()
creatures['birds']['eagle'] = dict()
creatures['birds']['eagle']['female'] = 0
creatures['birds']['eagle']['female'] += 1

You need to explicitly create each dictionary, though. Unlike Perl, Python does not automatically create a dictionary when you attempt to treat the value of an unassigned key as such.

Unless, of course, you use a defaultdict:

from collections import defaultdict
creatures = defaultdict( lambda: defaultdict(lambda: defaultdict( int )))
creatures['birds']['eagle']['female'] += 1
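
As a rough sketch of how this could be applied to the question's input, assuming every line has exactly three whitespace-separated fields (the `lines` list stands in for the file, and the variable names are only illustrative):

```python
from collections import defaultdict

# Sample input from the question; in practice these lines would be read from a file.
lines = [
    "Birds   Eagle  Female",
    "Mammals whales Male",
    "Birds   Eagle  Female",
]

# Three fixed levels: group -> species -> sex -> count.
creatures = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

for line in lines:
    fields = line.split()      # no argument: runs of spaces count as one separator
    if len(fields) != 3:
        continue               # skip blank or malformed lines
    group, species, sex = fields
    creatures[group][species][sex] += 1

eagle_females = creatures["Birds"]["Eagle"]["Female"]
whale_males = creatures["Mammals"]["whales"]["Male"]
```

Keys are case-sensitive here, so `Birds` and `birds` would be counted separately unless the fields are normalized first.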

For arbitrary levels of nesting, you can use a recursive factory that returns a fresh defaultdict on each call (a single dictionary whose default value is itself would hand back the same object at every level):

def dd():
    return defaultdict(dd)

creatures = dd()
creatures['birds']['eagle']['female'] = 0

In this case, you do need to explicitly initialize the integer value, since otherwise the value of creatures['birds']['eagle']['female'] will be assumed to be another defaultdict:

>>> creatures = dd()
>>> type(creatures['birds']['eagle']['female'])
<class 'collections.defaultdict'>
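
Because the leaf has to be initialized by hand, counting with an arbitrarily deep tree needs a little extra care. The following is a minimal sketch, not a definitive implementation; `tree`, `add_count`, and `to_plain` are hypothetical helper names, and it assumes every path ends in a counter rather than in a subtree:

```python
from collections import defaultdict

def tree():
    """Return a defaultdict whose missing keys become new, separate trees."""
    return defaultdict(tree)

def add_count(root, fields):
    """Walk down `fields` and increment the counter stored under the last one."""
    if not fields:
        return                      # ignore blank lines
    node = root
    for key in fields[:-1]:
        node = node[key]            # intermediate levels are created on demand
    leaf = fields[-1]
    # Test with `in` so that looking up the leaf does not create a subtree.
    node[leaf] = node[leaf] + 1 if leaf in node else 1

def to_plain(node):
    """Convert nested defaultdicts into ordinary dicts, e.g. for printing."""
    if isinstance(node, defaultdict):
        return dict((k, to_plain(v)) for k, v in node.items())
    return node

creatures = tree()
for line in ["Birds Eagle Female", "Mammals whales Male", "Birds Eagle Female"]:
    add_count(creatures, line.split())

plain = to_plain(creatures)
```

With the sample lines from the question, `plain` ends up as an ordinary nested dictionary of counts.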
Gordon Fogus
chepner
  • birds, animals, etc. are just examples, not actual entries. Actually, I need to read them from a file and add them automatically – Justin Carrey Jun 17 '13 at 18:50

If you just have to "count" things, and assuming every line of the data file contains all the required levels of "hashes", this will do the trick:

import collections

result = collections.defaultdict(int)

with open("beast","rt") as f:
    for line in f:
        hashes = line.split()
        key = '-'.join(hashes)
        result[key] += 1

print result

Producing the result:
defaultdict(<type 'int'>, {'Mammals-whales-Male': 1, 'Birds-Eagle-Female': 2})

If you require a nested dictionary, post-processing that result is still possible.
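
One possible way to do that post-processing is sketched below. The `nest` function is a hypothetical helper, and it assumes that the `-` separator never occurs inside a name:

```python
def nest(flat, sep="-"):
    """Turn {'a-b-c': n} style keys into {'a': {'b': {'c': n}}}."""
    nested = {}
    for key, count in flat.items():
        parts = key.split(sep)
        node = nested
        for part in parts[:-1]:
            # setdefault returns the existing sub-dict or inserts a new one
            node = node.setdefault(part, {})
        node[parts[-1]] = count
    return nested

flat = {"Mammals-whales-Male": 1, "Birds-Eagle-Female": 2}
nested = nest(flat)
```

If names may contain the separator, using a tuple of the fields as the key instead of a joined string avoids the ambiguity.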

Sylvain Leroux

Not elegant, but working:

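result = {}
# input_file is assumed to hold the file's contents as a single string
for line in input_file.splitlines():
    values = line.split()  # no argument: runs of spaces count as one separator
    if not values:
        continue  # skip blank lines
    curdict = result
    # walk (and create as needed) one dictionary per level except the last
    for item in values[:-1]:
        if item not in curdict:
            curdict[item] = {}
        curdict = curdict[item]
    # the last field holds the counter
    last_item = values[-1]
    if last_item not in curdict:
        curdict[last_item] = 0
    curdict[last_item] += 1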

This can probably be written more cleanly, but it works and allows an arbitrary nesting level, as long as the same "entity" does not appear at different nesting levels (e.g. a file containing both Birds Eagle Female and Birds Eagle won't work).

J0HN