I'm quite new to Python and overall programming so bear with me.

I have a dictionary with names as keys and one of 'Male', 'Female' or 'Eunuch' as each value:

Persons = {'Hodor': 'Male', 'Tyrion': 'Male', 'Theon': 'Male', 'Arya': 'Female', 'Daenerys': 'Female', 'Sansa': 'Female', 'Varys': 'Eunuch'}

I want to group them into {Gender: {Name: Count}}

As such:

input: lst = ['Hodor', 'Hodor', 'Tyrion', 'Tyrion', 'Tyrion', 'Arya', 'Daenerys', 'Daenerys', 'Varys']

output: {'Male': {'Hodor': 2, 'Tyrion': 3, 'Theon': 0}, 'Female': {'Arya': 1, 'Daenerys': 2, 'Sansa': 0}, 'Eunuch': {'Varys': 1}}

The first thing I tried was writing code to count the names:

counts = {}
for key in Persons: 
    counts[key] = 0 

for key in lst:
    counts[key] += 1

My dictionary counts now contains the counts, but how do I combine them with the genders?

Names = Persons.keys()
Gender = Persons.values()
Counts = counts.values()
Names = counts.keys()

If Varys isn't mentioned, the gender 'Eunuch' shouldn't be in the output. I've tried different things, but I get stuck when I try to connect the pieces. When I switch keys with values, only one name per gender comes up.

I hope it's clear what I want to do :)

Edit: If Sansa isn't mentioned but other females are, her value should be 0. I also want to be able to manipulate the numbers, for example to find what percentage of all male mentions are Hodor.

2 Answers

Consider

from collections import Counter

cnt = Counter(lst)
# persons is the name-to-gender dict from the question (named Persons there)
print({gender: {name: cnt[name] for name in persons if persons[name] == gender}
       for gender in set(persons.values())})

# {'Eunuch': {'Varys': 1}, 
# 'Male': {'Tyrion': 3, 'Hodor': 2, 'Theon': 0}, 
# 'Female': {'Daenerys': 2, 'Arya': 1, 'Sansa': 0}}

To calculate percentages let's add a helper function:

def percentage_dict(d):
    s = float(sum(d.values()))
    return {k: 100 * v / s for k, v in d.items()}

and then

print({gender: percentage_dict({name: cnt[name] for name in persons if persons[name] == gender})
       for gender in set(persons.values())})

# {'Eunuch': {'Varys': 100.0}, 'Male': {'Hodor': 40.0, 'Theon': 0.0, 'Tyrion': 60.0}, 'Female': {'Daenerys': 66.66666666666667, 'Arya': 33.333333333333336, 'Sansa': 0.0}}
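Note that `percentage_dict` divides by the group total, so a gender whose names are never mentioned would raise `ZeroDivisionError`. A minimal sketch of a guarded variant follows; the name `safe_percentage_dict` and the choice to report 0.0 for an empty group are assumptions for illustration, not part of the original answer.

```python
def safe_percentage_dict(d):
    """Return each value of d as a percentage of the sum of all values.

    Hypothetical variant of percentage_dict: if the values sum to zero,
    every entry is reported as 0.0 instead of dividing by zero.
    """
    total = float(sum(d.values()))
    if total == 0:
        # Assumption: no mentions at all means 0.0 percent for everyone.
        return {k: 0.0 for k in d}
    return {k: 100 * v / total for k, v in d.items()}


male_pct = safe_percentage_dict({'Hodor': 2, 'Tyrion': 3, 'Theon': 0})
empty_pct = safe_percentage_dict({'Varys': 0})
print(male_pct)
print(empty_pct)
```

Another reasonable choice would be to skip such groups entirely rather than report zeros.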

The comprehension above rescans all of persons once per gender. Grouping the names by gender first, with a helper function, avoids that:

def invert(d):
    """Turn {a:x, b:x} into {x:[a,b]}"""
    r = {}
    for k, v in d.items():
        r.setdefault(v, []).append(k)
    return r

and then

cnt = Counter(lst)
print({gender: {name: cnt[name] for name in names}
       for gender, names in invert(persons).items()})

To exclude subdicts that sum up to zero (genders with no mentions at all):

print({gender: {name: cnt[name] for name in names}
       for gender, names in invert(persons).items()
       if any(cnt[name] for name in names)})
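For reference, here is a self-contained sketch that puts the pieces of this answer together in Python 3 syntax, where `print` is a function. The data is the question's, except that 'Varys' is left out of `lst` (an assumption made here) to show the 'Eunuch' group being dropped; the variable name `result` is also introduced only for this example.

```python
from collections import Counter

persons = {
    'Hodor': 'Male', 'Tyrion': 'Male', 'Theon': 'Male',
    'Arya': 'Female', 'Daenerys': 'Female', 'Sansa': 'Female',
    'Varys': 'Eunuch',
}
# Same mentions as in the question, but without 'Varys'.
lst = ['Hodor', 'Hodor', 'Tyrion', 'Tyrion', 'Tyrion',
       'Arya', 'Daenerys', 'Daenerys']


def invert(d):
    """Turn {a: x, b: x} into {x: [a, b]}."""
    r = {}
    for k, v in d.items():
        r.setdefault(v, []).append(k)
    return r


cnt = Counter(lst)  # a missing name counts as 0 when looked up
result = {
    gender: {name: cnt[name] for name in names}
    for gender, names in invert(persons).items()
    if any(cnt[name] for name in names)  # skip genders with no mentions
}
print(result)
```

Unmentioned names such as 'Theon' and 'Sansa' still appear with a count of 0, because their gender has at least one mention.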
georg
  • +1 because it is short, and for a nested one-line dict comprehension it is quite readable – koffein Dec 04 '13 at 23:28
  • That is so beautiful. Is there an easy way in this code to manipulate the Counter if you want to know what percentage of all male mentions are 'Hodor', so that each name's count becomes a float percentage of its gender's total? – donstone666 Dec 04 '13 at 23:36
  • @thg435 Thanks a bunch. I'm so amazed by the responsiveness on this site. I hope in the future, that I can be just as helpful as you've been! – donstone666 Dec 04 '13 at 23:48
  • @donstone666: You're welcome... and welcome to SO! ;) – georg Dec 04 '13 at 23:52
  • @thg435 I've noticed a little thing. When I delete Varys from lst, the gender Eunuch pops up anyway, and that seems unnecessary. How do I deal with this? Edit: Because when no people of that gender are mentioned, that nice `percentage_dict(d)` is going to divide by zero :) – donstone666 Dec 05 '13 at 00:11
from collections import defaultdict

gender = {
    'Hodor' : 'Male',
    'Tyrion': 'Male',
    'Theon': 'Male',
    'Arya': 'Female',
    'Daenerys': 'Female',
    'Sansa': 'Female',
    'Varys': 'Eunuch'
}

lst = ['Hodor', 'Hodor', 'Tyrion', 'Tyrion', 'Tyrion', 'Arya', 'Daenerys', 'Daenerys', 'Varys']

result = defaultdict(lambda: defaultdict(int))
for name in lst:
    result[gender[name]][name] += 1
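
As written, `result` only contains names that actually occur in `lst`, so 'Theon' and 'Sansa' are absent rather than 0. One possible extension, which is an assumption here and not part of the original answer, fills in zeros for unmentioned names whose gender already appears and converts the nested defaultdicts to plain dicts. The name `plain` is introduced only for this sketch.

```python
from collections import defaultdict

gender = {
    'Hodor': 'Male', 'Tyrion': 'Male', 'Theon': 'Male',
    'Arya': 'Female', 'Daenerys': 'Female', 'Sansa': 'Female',
    'Varys': 'Eunuch',
}
lst = ['Hodor', 'Hodor', 'Tyrion', 'Tyrion', 'Tyrion',
       'Arya', 'Daenerys', 'Daenerys', 'Varys']

# Count mentions per gender and name, as in the answer.
result = defaultdict(lambda: defaultdict(int))
for name in lst:
    result[gender[name]][name] += 1

# Add a zero entry for each unmentioned name, but only for genders
# that already have at least one mention ("g in result" does not
# create a new key on a defaultdict).
for name, g in gender.items():
    if g in result:
        result[g].setdefault(name, 0)

# Convert to ordinary dicts for printing and comparison.
plain = {g: dict(names) for g, names in result.items()}
print(plain)
```

Because genders are only created when one of their names is counted, a gender with no mentions stays out of the output without any extra filtering.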
Hugh Bothwell