
I'm trying to find a non-pandas solution to summarising by category.

I have this lookup table as a list of dicts:

lookup_table = [
{"urban_rural": "urban", "technology": "FTTC", "speed": 50},
{"urban_rural": "rural", "technology": "FTTC", "speed": 10},
{"urban_rural": "urban", "technology": "FTTC", "speed": 30}
]

I want to find the mean of 'speed' for each combination of 'urban_rural' and 'technology', so I end up with this:

lookup_table_mean_values = [
{"urban_rural": "urban", "technology": "FTTC", "speed": 40},
{"urban_rural": "rural", "technology": "FTTC", "speed": 10}
]

Edit (add current code):

I didn't want to muddy the waters, but as @Patrick Artner has requested, here's where I'm at. This question provides a suggested answer for a dict, with both a simple loop and an iteritems option, but so far I have not been able to adapt either to a list of dicts.

I would be quite happy using something like this:

lookup_table_mean_values = [float(sum(values)) / len(values) for key, values in lookup_table.iteritems()]
Thirst for Knowledge
  • .. and where is the code that you wrote to solve this problem of yours? That's how SO works: you have code, you have a problem, we help fix it. If you have a specific problem, consider studying [how-to-ask](https://stackoverflow.com/help/how-to-ask) and [on topic](https://stackoverflow.com/help/on-topic), provide code respecting [How to create a Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve) together with the exception you get or the expectation your code does not meet, and I am sure SO will help you out. We are not a coding service that delivers according to your specs... – Patrick Artner Apr 23 '18 at 10:32
  • Thanks for your comment, Patrick. As a user of SO answers, I was hesitant to provide additional lines when I knew they were incorrect. – Thirst for Knowledge Apr 23 '18 at 11:03
  • The idea of posting what you have is that we see what you tried, that you tried, and where you are stuck. – Patrick Artner Apr 23 '18 at 11:17

2 Answers

lookup_table = [
  {"urban_rural": "urban", "technology": "FTTC", "speed": 50},
  {"urban_rural": "rural", "technology": "FTTC", "speed": 10},
  {"urban_rural": "urban", "technology": "FTTC", "speed": 30}
]

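dic = {}
for d in lookup_table:
    # Group the speeds under their (urban_rural, technology) pair
    key = d['urban_rural'], d['technology']
    if key not in dic:
        dic[key] = []
    dic[key].append(d['speed'])

# Build one summary dict per group, holding the mean speed
mean = [{"urban_rural": key[0], "technology": key[1], "speed": sum(val) / len(val)}
        for key, val in dic.items()]
print(mean)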

Output:

[{'urban_rural': 'urban', 'technology': 'FTTC', 'speed': 40.0}, {'urban_rural': 'rural', 'technology': 'FTTC', 'speed': 10.0}]
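
The same grouping idea can also be written with the standard library's collections.defaultdict and statistics.mean. This is only a sketch: the helper name mean_by_category and its parameters (keys, value_field) are made up for illustration and are not part of the question.

```python
from collections import defaultdict
from statistics import mean


def mean_by_category(rows, keys, value_field):
    """Group rows by the given key fields and average value_field per group."""
    groups = defaultdict(list)
    for row in rows:
        # The tuple of category values identifies the group
        groups[tuple(row[k] for k in keys)].append(row[value_field])

    result = []
    for group_key, values in groups.items():
        summary = dict(zip(keys, group_key))  # restore the category fields
        summary[value_field] = mean(values)
        result.append(summary)
    return result


lookup_table = [
    {"urban_rural": "urban", "technology": "FTTC", "speed": 50},
    {"urban_rural": "rural", "technology": "FTTC", "speed": 10},
    {"urban_rural": "urban", "technology": "FTTC", "speed": 30},
]

result = mean_by_category(lookup_table, ("urban_rural", "technology"), "speed")
print(result)
```

Because dicts preserve insertion order in Python 3.7+, the groups come out in the order their categories first appear in the input.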
sciroccorics

A loop-based approach that splits the rows on one key and averages each of the two groups:

lookup_table = [
{"urban_rural": "urban", "technology": "FTTC", "speed": 50},
{"urban_rural": "rural", "technology": "FTTC", "speed": 10},
{"urban_rural": "urban", "technology": "FTTC", "speed": 30},
]


def get_mean(rows, by_mean):
    # Arithmetic mean of the by_mean field across all rows
    return sum(row[by_mean] for row in rows) / len(rows)


def foo(rows, key, value, by_mean):
    # Split the rows into those where row[key] == value and the rest.
    # This handles only a two-way split on a single key.
    temp1 = []
    temp2 = []
    for row in rows:
        if row[key] == value:
            temp1.append(row)
        else:
            temp2.append(row)

    # Copy the first row of each group so the input list is not mutated
    res = [dict(temp1[0]), dict(temp2[0])]
    res[0][by_mean] = get_mean(temp1, by_mean)
    res[1][by_mean] = get_mean(temp2, by_mean)

    return res


print(foo(lookup_table, 'urban_rural', 'urban', 'speed'))

Out:

[{'urban_rural': 'urban', 'technology': 'FTTC', 'speed': 40.0}, {'urban_rural': 'rural', 'technology': 'FTTC', 'speed': 10.0}]
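
The function above splits on a single key and a single value. If there may be more than two categories, or the grouping should use both 'urban_rural' and 'technology', one possible generalisation is to sort the rows and use itertools.groupby. This is a sketch under those assumptions; the name grouped_means is hypothetical.

```python
from itertools import groupby


def grouped_means(rows, keys, value_field):
    """Average value_field for every distinct combination of the key fields."""
    def key_func(row):
        return tuple(row[k] for k in keys)

    result = []
    # groupby only merges adjacent items, so the rows must be sorted first
    for group_key, group in groupby(sorted(rows, key=key_func), key=key_func):
        values = [row[value_field] for row in group]
        summary = dict(zip(keys, group_key))
        summary[value_field] = sum(values) / len(values)
        result.append(summary)
    return result


lookup_table = [
    {"urban_rural": "urban", "technology": "FTTC", "speed": 50},
    {"urban_rural": "rural", "technology": "FTTC", "speed": 10},
    {"urban_rural": "urban", "technology": "FTTC", "speed": 30},
]

means = grouped_means(lookup_table, ("urban_rural", "technology"), "speed")
print(means)
```

Note that the result is ordered by the sorted category values rather than by first appearance, so 'rural' comes before 'urban' here.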
Benyamin Jafari