I am a Python 3 beginner and I am trying to practice some examples. I have dictionary data (year, month) formatted as follows:

year month yearmonth x1 x2 x3 ...
1999  1     199901   10 20 30 ...
1999  3     199903   10 20 30 ...
2000  4     200004   10 20 30 ...
2000  9     200009   10 20 30 ...
2000  10    200010   10 20 30 ...
.................................
.................................

I would like to compute a subtotal per year for certain keys, e.g. a subtotal for variable x2 only. The expected result is the following:

year  totalx2 
1999     40
2000     60
.........

Of course, my data contains more years and months than shown here. If a month is missing for a given year, I will assume its value is 0 when summing the 12 months into the yearly subtotal.

Any help would be great! Thanks for your patience with a beginner. Joe

apogalacticon
Joe

2 Answers

I'd suggest using the pandas library for this. A similar question is answered here:

Pandas sum by groupby, but exclude certain columns
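As a rough illustration of the pandas approach, here is a minimal sketch. It assumes the data is available as a list of row dictionaries (the question does not show the exact structure) and uses only the sample rows from the question; the name `subtotals` and the output column `totalx2` are chosen here to match the expected result.

```python
import pandas as pd

# Sample rows copied from the question; the list-of-dicts layout is an assumption.
rows = [
    {"year": 1999, "month": 1, "yearmonth": 199901, "x1": 10, "x2": 20, "x3": 30},
    {"year": 1999, "month": 3, "yearmonth": 199903, "x1": 10, "x2": 20, "x3": 30},
    {"year": 2000, "month": 4, "yearmonth": 200004, "x1": 10, "x2": 20, "x3": 30},
    {"year": 2000, "month": 9, "yearmonth": 200009, "x1": 10, "x2": 20, "x3": 30},
    {"year": 2000, "month": 10, "yearmonth": 200010, "x1": 10, "x2": 20, "x3": 30},
]

df = pd.DataFrame(rows)

# Group by year, sum only the x2 column, and rename it to match the desired output.
subtotals = (
    df.groupby("year", as_index=False)["x2"]
    .sum()
    .rename(columns={"x2": "totalx2"})
)

print(subtotals)
```

Months that are absent from the data simply contribute nothing to the sum, which is equivalent to treating them as 0.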

eatmeimadanish

You could try grouping the values by year in a new dict, which makes them easy to work with, and then compute the subtotal from there. It would look something like this:

from collections import defaultdict

def group_by_year(rows, key):
    # Return one {"year": ..., "total_<key>": ...} dict per year.
    totals = defaultdict(int)  # a year not seen yet starts at 0
    for row in rows:
        # Add this row's value to the running subtotal for its year.
        totals[row["year"]] += row[key]
    total_key = "total_" + key
    return [{"year": year, total_key: value} for year, value in totals.items()]
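For reference, here is a self-contained sketch of how such a helper could be called on the sample rows from the question. It assumes each row is a dictionary with a "year" key, which the question does not state explicitly, and it repeats a compact version of the helper so that the snippet runs on its own.

```python
from collections import defaultdict


def group_by_year(rows, key):
    """Sum the values stored under `key` for each year."""
    totals = defaultdict(int)  # a year not seen yet starts at 0
    for row in rows:
        totals[row["year"]] += row[key]
    return [{"year": year, "total_" + key: value} for year, value in totals.items()]


# Sample rows copied from the question; the list-of-dicts layout is an assumption.
rows = [
    {"year": 1999, "month": 1, "x1": 10, "x2": 20, "x3": 30},
    {"year": 1999, "month": 3, "x1": 10, "x2": 20, "x3": 30},
    {"year": 2000, "month": 4, "x1": 10, "x2": 20, "x3": 30},
    {"year": 2000, "month": 9, "x1": 10, "x2": 20, "x3": 30},
    {"year": 2000, "month": 10, "x1": 10, "x2": 20, "x3": 30},
]

result = group_by_year(rows, "x2")
print(result)
# [{'year': 1999, 'total_x2': 40}, {'year': 2000, 'total_x2': 60}]
```

Because missing months have no row at all, they add nothing to the subtotal, which matches the "treat missing months as 0" rule from the question.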

Ostoja