
I am trying to find the averages for the values of a dictionary by city. For the purposes of this exercise I cannot use numpy or pandas.

Here is some example data:

d = {
('Chicago', 2006): 23.4,
('Chicago', 2007): 73.4,
('Dallas', 2008): 70.8,
('Paris', 2010): 5.6,
('Paris', 2011): 63.3
}

Here is the ideal output:

city_averages = {
    'Chicago': 48.4,
    'Dallas': 70.8,
    'Paris': 34.45
    }

Here is the code I tried.

city_averages = {}


total = 0
for k,v in d.items():
    total += float(v) 
    city_averages[k[0]] = total 
     
    
mk2080
  • You could use `itertools.groupby` from the standard library. – NicholasM Sep 24 '20 at 02:13
  • This [answer](https://stackoverflow.com/a/34140809/7675174) might provide enough for you to develop your own solution. It uses `collections.Counter()`. – import random Sep 24 '20 at 02:18
  • Why don't you just create a new dictionary and add the values from the current one into it? That would be the simplest way, since you seem to be new to dicts (I assume so because you are not allowed to use pandas or numpy). – Joe Ferndz Sep 24 '20 at 02:25

3 Answers


There is a very similar question on here.

In your case, the code is as follows:

from collections import defaultdict
import statistics

d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3
}

grouper = defaultdict(list)

for k, v in d.items():
    grouper[k[0]].append(v)

city_averages = {k: statistics.mean(v) for k,v in grouper.items()}
print(city_averages)
Joona Yoon

You can do something simpler, like this:

d = {
('Chicago', 2006): 23.4,
('Chicago', 2007): 73.4,
('Dallas', 2008): 70.8,
('Paris', 2010): 5.6,
('Paris', 2011): 63.3,
('Paris', 2012): 100.4  # distinct year: a repeated key would silently overwrite the 63.3 entry
}

dnew = {}
for k,v in d.items():
    if k[0] in dnew:
        dnew[k[0]] += v 
    else:
        dnew[k[0]] = v

print (dnew)

You will get output along these lines (shown rounded):

{'Chicago': 96.8, 'Dallas': 70.8, 'Paris': 169.3}

You will need to format the values before printing them, because the raw sums can carry floating-point noise.
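One possible way to do that formatting is sketched below. It assumes the sample data from this answer with distinct keys (the year 2012 for the third Paris entry is an assumption) and rounds only for display. The names `totals` and `rounded_totals` are illustrative.

```python
# Sample data assumed from this answer; 2012 is a hypothetical distinct year.
d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
    ('Paris', 2012): 100.4,
}

# Sum the values per city.
totals = {}
for (city, _year), value in d.items():
    totals[city] = totals.get(city, 0) + value

# Round only for display; keep the unrounded totals for further arithmetic.
rounded_totals = {city: round(total, 2) for city, total in totals.items()}
print(rounded_totals)
```

Rounding at the end, rather than while accumulating, avoids compounding rounding errors in the sums.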

I will leave you to figure out the logic for finding the average. This should help you get closer to the full answer.

Answer with average calculation:

Here is code that includes the average calculation. It does not use any complicated logic.

dnew = {}
dcnt = {}

for k,v in d.items():
    dnew[k[0]] = dnew.get(k[0], 0) + v
    dcnt[k[0]] = dcnt.get(k[0], 0) + 1

for k,v in dnew.items():
    dnew[k] /= dcnt[k]

print (dnew)

The output will be as follows:

{'Chicago': 48.400000000000006, 'Dallas': 70.8, 'Paris': 56.43333333333334}
Joe Ferndz
  • If you are having trouble with the average, let me know and I will provide additional code. Hint: you can store the sum in one dict and the count in another, then use both to find the average. – Joe Ferndz Sep 24 '20 at 02:48
  • Instead of `if k[0] in dnew` with separate if/else bodies, you can use a single statement: `dnew[k[0]] = dnew.get(k[0], 0) + v`. If you like my suggestion, [here's your full shortened code](https://tio.run/##bY5BCoMwEEX3c4rZqTSkUVsVwVV7gO7FRVCrQasSxVKKZ7cJEUqlWWXef/yZ4TXVfedHg1zXAhN8g21dapHzqrcIeowFToyeT09kH4QqCLfgytuWj4ZHmjMaaX7jUhjsMoXPNNhRV9HApz6BBaDoyqc@YYEi7ybzg3svsSEzig4LKqbyMdpODKie1tMmZVmmVD3QqpxsDQgyBw84G011fTU17DX3d4ku@rMnw2OydWUAgxTqQltHDqzrBw "Python 3.8 (pre-release) – Try It Online") – Arty Sep 24 '20 at 04:22
  • Unlike dict's `d[key]` operator, which raises `KeyError` if the key is missing, the method `d.get(key, default_value)` always returns something: the value for `key` if it exists, otherwise `default_value`. If `default_value` is not given (i.e. `d.get(key)`), `None` is returned. You may [read here](https://docs.python.org/3/library/stdtypes.html#typesmapping) about `.get(...)`. – Arty Sep 24 '20 at 05:03
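The difference between indexing and `.get(...)` described in the comment above can be seen in a small sketch. The dictionary `counts` and the variable names are made up for illustration.

```python
counts = {'Chicago': 2}  # hypothetical data for illustration

# Indexing a missing key raises KeyError.
missing_raised = False
try:
    counts['Dallas']
except KeyError:
    missing_raised = True

# get() returns the stored value, the supplied default, or None.
chicago = counts.get('Chicago', 0)        # key exists: stored value
dallas_default = counts.get('Dallas', 0)  # key missing: the default
dallas_none = counts.get('Dallas')        # key missing, no default: None
print(missing_raised, chicago, dallas_default, dallas_none)
```

This is why `dnew.get(k[0], 0) + v` works on the first occurrence of a city: the missing key contributes 0 instead of raising.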

Below are two one-liner versions: a simple one using itertools.groupby, and a more complex one that uses no extra modules.


import itertools

d = {
('Chicago', 2006): 23.4,
('Chicago', 2007): 73.4,
('Dallas', 2008): 70.8,
('Paris', 2010): 5.6,
('Paris', 2011): 63.3,
}

print({k : sum(e[1] for e in lg) / len(lg) for k, g in itertools.groupby(sorted(d.items()), lambda e: e[0][0]) for lg in (list(g),)})
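For readability, the same groupby idea can be unrolled into an ordinary loop. This is only a sketch of an equivalent form, not the one-liner itself. The names `city`, `group`, and `values` are illustrative.

```python
import itertools

d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
}

city_averages = {}
# groupby only merges adjacent items, so sort the items by key first.
for city, group in itertools.groupby(sorted(d.items()), key=lambda item: item[0][0]):
    # Materialize the group so it can be both summed and counted.
    values = [value for _key, value in group]
    city_averages[city] = sum(values) / len(values)

print(city_averages)
```

Building the `values` list plays the same role as the `for lg in (list(g),)` clause in the one-liner: the group iterator can be consumed only once.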

Next is a fancier one-liner that uses no modules (such as itertools), just plain Python. It has the same time complexity as the itertools.groupby code above. It is meant for recreational purposes, or for when you really need a one-liner without any modules:


d = {
('Chicago', 2006): 23.4,
('Chicago', 2007): 73.4,
('Dallas', 2008): 70.8,
('Paris', 2010): 5.6,
('Paris', 2011): 63.3,
}

print({k : sm / cnt  for sd in (sorted(d.items()),) for i in range(len(sd)) for k, cnt, sm in ((sd[i][0][0] if i + 1 >= len(sd) or sd[i][0][0] != sd[i + 1][0][0] else None,) + ((1, sd[i][1]) if i == 0 or sd[i - 1][0][0] != sd[i][0][0] else (cnt + 1, sm + sd[i][1])),) if k is not None})
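The one-liner is dense, so here is a hedged unrolling of what it appears to do: walk the sorted items, restart the running count and sum whenever the city changes, and record the average at the last item of each run. The names `items`, `count`, and `total` are illustrative.

```python
d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
}

items = sorted(d.items())
city_averages = {}
count, total = 0, 0.0

for i, ((city, _year), value) in enumerate(items):
    # Restart the running count and sum at the first item of each city's run.
    if i == 0 or items[i - 1][0][0] != city:
        count, total = 1, value
    else:
        count, total = count + 1, total + value
    # Record the average at the last item of the run.
    if i + 1 >= len(items) or items[i + 1][0][0] != city:
        city_averages[city] = total / count

print(city_averages)
```

As in the one-liner, the sort dominates the cost, and the pass over the items is linear.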
Arty
  • @Suparshva `sorted` can't be removed, because groupby groups only consecutive entries, and the entries of a dict are not sorted by key in general. BTW, how do you want to use `statistics.mean`? You may post an answer with your solution if it is different from mine. – Arty Sep 24 '20 at 03:04
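
The point about `sorted` in the comment above can be checked with a small sketch. The list `records` is hypothetical and deliberately out of order.

```python
import itertools

# Hypothetical records, deliberately not sorted by city.
records = [('Paris', 5.6), ('Chicago', 23.4), ('Paris', 63.3)]

def group_values(rows):
    """Group consecutive rows by city and collect their values."""
    return [
        (city, [value for _city, value in group])
        for city, group in itertools.groupby(rows, key=lambda row: row[0])
    ]

unsorted_groups = group_values(records)        # 'Paris' shows up as two runs
sorted_groups = group_values(sorted(records))  # one run per city
print(unsorted_groups)
print(sorted_groups)
```

Without sorting, the two Paris entries land in separate groups, so a dict built from them would keep only the last run.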