
I'm confused about how to write the final part of my function.

The function below takes the data, a start year, and an end year.

Consider the following:

def summary_statistics(data, year_start, year_end):
    earthquake_count_by_year = []
    total_damages_by_year = []
    casualties_by_year = []
    year_count = {}
    dmg_sum = {}
    casualties = {}
    
    years = []
    year_start = int(year_start)
    year_end = int(year_end)
    
    if year_end >= year_start:
        
        #store year range into list
        years = list(range(year_start, year_end+1))
        
        for index, tuple in enumerate(data):
            #values to be summed for each year in dmg_sum
            yr = tuple[0]
            dmgs = tuple[9]
            deaths = tuple[6]
            missing = tuple[7]
            injured = tuple[8]
            
            #if year in range of years (year_start - year_end)
            if tuple[0] in years:
                #if year does not exist in dict, set it to 0
                if tuple[0] not in year_count:
                    year_count[tuple[0]] = 0
                #count number of occurences for each year
                year_count[tuple[0]] += 1
                
                if tuple[0] not in dmg_sum:
                    dmg_sum[tuple[0]] = dmgs
                else:
                    dmg_sum[tuple[0]] += dmgs
                    
                if tuple[0] not in casualties:
                    casualties[tuple[0]] = list(deaths + ',' + missing + ',' + injured)
                else:
                    casualties[tuple[0]] += list(deaths + ',' + missing + ',' + injured)

        earthquake_count_by_year = list(year_count.items())
        total_damages_by_year = list(dmg_sum.items())
        casualties_by_year = list(casualties.items())

    
    L = [[earthquake_count_by_year], [total_damages_by_year], [casualties_by_year]]
    print(L)
    return L

The part I am having trouble with is:

                if tuple[0] not in casualties:
                    casualties[tuple[0]] = list(deaths + ',' + missing + ',' + injured)
                else:
                    casualties[tuple[0]] += list(deaths + ',' + missing + ',' + injured)

This is the expected output I need for casualties_by_year[]:

[(2020, (deaths, missing, injured)), (2019, (deaths, missing, injured)), (2018, (deaths, missing, injured))]

So what I'm trying to do is construct a list of tuples, arranged as shown above, for the final list going into L. Each row has tuple[0] for the year, tuple[6] for deaths, tuple[7] for missing, and tuple[8] for injured.

How can I correct that final if/else to get the list of tuples I'm looking for?

List comprehension?

EDIT: This is what "data" looks like when it's passed into the function:

[(2020, 1, 6.0, 'CHINA:  XINJIANG PROVINCE', 39.831, 77.106, 1, 0, 2, 0), (2020, 1, 6.7, 'TURKEY:  ELAZIG AND MALATYA PROVINCES', 38.39, 39.081, 41, 0, 1600, 0), (2020, 1, 7.7, 'CUBA: GRANMA;  CAYMAN IS;  JAMAICA', 19.44, -78.755, 0, 0, 0, 0), (2020, 2, 6.0, 'TURKEY: VAN;  IRAN', 38.482, 44.367, 10, 0, 60, 0), (2020, 3, 5.4, 'BALKANS NW:  CROATIA:  ZAGREB', 45.897, 15.966, 1, 0, 27, 6000.0)]
boog
    Do you have a test case that produces the wrong answer? [mcve] – Kenny Ostrom Nov 03 '20 at 22:24
    If you just put in the effort to write the data, I can help more, but since you didn't, I'm just going to tell you the short answer here in a comment: You need to make a tuple with normal python syntax: tuple(stuff) ... except you can't because you masked the type name with your own variable named "tuple" Don't do that. Rename your variable. If you give me a data dict with valid syntax I'll fill out the answer a little more. – Kenny Ostrom Nov 03 '20 at 22:37
  • @kennyOstrom my apologies, Can't believe I forgot the data input. The data is input as a list of tuples, it's not a dict. I just updated my question with the input. – boog Nov 04 '20 at 01:44

1 Answer

If you want a tuple, make a tuple. It's that simple. You can still use parentheses notation normally, but sometimes you need the constructor.

But first, never give a variable the same name as a built-in Python type. You want to create an instance of type tuple, but you have a variable named tuple, so "tuple" now refers to that local variable and you can't reach the tuple constructor, because you've hidden its name. I renamed it to "year_data".
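A minimal sketch of the shadowing problem, for illustration only. The function names `shadowed` and `renamed` and the one-row `rows` list are made up for this example and are not part of the question's code.

```python
def shadowed(rows):
    # Naming the loop variable "tuple" hides the built-in type in this function.
    for tuple in rows:
        try:
            # This tries to call the current row, not the tuple constructor.
            return tuple((1, 2, 3))
        except TypeError as exc:
            return f"TypeError: {exc}"


def renamed(rows):
    # With a different variable name, tuple() is the built-in constructor again.
    for year_data in rows:
        return tuple(year_data[:2])


rows = [(2020, 1, 6.0)]  # hypothetical one-row input
shadow_result = shadowed(rows)
renamed_result = renamed(rows)
print(shadow_result)
print(renamed_result)
```

The first call reports that a tuple object is not callable, while the second builds a new tuple as intended.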

Adding the tuples together is slightly more interesting, because + concatenates tuples instead of summing them element by element. See "Python element-wise tuple operations like sum".
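To make the difference concrete, here is a small sketch comparing concatenation with element-wise addition. The variable names and the two sample tuples are illustrative assumptions; the values are the casualty fields of the first two rows of the question's data.

```python
import operator

running = (1, 0, 2)        # (deaths, missing, injured) so far
new_row = (41, 0, 1600)    # the next row's casualties

# "+" joins the two tuples end to end, giving six elements.
concatenated = running + new_row

# map(operator.add, a, b) pairs the elements up and adds each pair.
summed_map = tuple(map(operator.add, running, new_row))

# An equivalent spelling with zip and a generator expression.
summed_zip = tuple(a + b for a, b in zip(running, new_row))

print(concatenated)  # (1, 0, 2, 41, 0, 1600)
print(summed_map)    # (42, 0, 1602)
print(summed_zip)    # (42, 0, 1602)
```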

The rest is already there. You just have to use it. I deleted some irrelevant stuff.

import operator

data = [
    (2020, 1, 6.0, 'CHINA:  XINJIANG PROVINCE', 39.831, 77.106, 1, 0, 2, 0), 
    (2020, 1, 6.7, 'TURKEY:  ELAZIG AND MALATYA PROVINCES', 38.39, 39.081, 41, 0, 1600, 0), 
    (2020, 1, 7.7, 'CUBA: GRANMA;  CAYMAN IS;  JAMAICA', 19.44, -78.755, 0, 0, 0, 0), 
    (2020, 2, 6.0, 'TURKEY: VAN;  IRAN', 38.482, 44.367, 10, 0, 60, 0), 
    (2020, 3, 5.4, 'BALKANS NW:  CROATIA:  ZAGREB', 45.897, 15.966, 1, 0, 27, 6000.0)
]

def summary_statistics(data):
    casualties = {}
    for year_data in data:
        year = year_data[0]
        deaths = year_data[6]
        missing = year_data[7]
        injured = year_data[8]
        casualties_data = (deaths, missing, injured)

        if year not in casualties:
            # first row for this year: store the tuple as-is
            casualties[year] = casualties_data
        else:
            # element-wise sum of the running totals and this row
            # https://stackoverflow.com/a/497894/1766544
            casualties[year] = tuple(map(operator.add, casualties[year], casualties_data))

    casualties_by_year = list(casualties.items())
    return casualties_by_year
    
result = summary_statistics(data)
print(result)

[(2020, (53, 0, 1689))]
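If it helps, here is one way the pieces removed above (the year range, the counts, and the damages) might be put back around the same idea. This is only a sketch: the name `summarize_by_year` is made up, the column positions are the ones given in the question, and it assumes a flat list of three lists is an acceptable return shape, which differs from the nested L in the question.

```python
import operator


def summarize_by_year(data, year_start, year_end):
    """Return [counts, damages, casualties], each a list of (year, value) pairs."""
    year_start, year_end = int(year_start), int(year_end)
    counts, damages, casualties = {}, {}, {}

    for row in data:
        year = row[0]
        # Skip rows outside the requested range; an inverted range matches nothing.
        if not (year_start <= year <= year_end):
            continue

        counts[year] = counts.get(year, 0) + 1
        damages[year] = damages.get(year, 0) + row[9]

        # Element-wise sum of (deaths, missing, injured), starting from zeros.
        row_casualties = (row[6], row[7], row[8])
        previous = casualties.get(year, (0, 0, 0))
        casualties[year] = tuple(map(operator.add, previous, row_casualties))

    return [list(counts.items()), list(damages.items()), list(casualties.items())]


# Two rows taken from the question's data.
sample = [
    (2020, 1, 6.0, 'CHINA:  XINJIANG PROVINCE', 39.831, 77.106, 1, 0, 2, 0),
    (2020, 3, 5.4, 'BALKANS NW:  CROATIA:  ZAGREB', 45.897, 15.966, 1, 0, 27, 6000.0),
]
summary = summarize_by_year(sample, 2019, 2020)
print(summary)
```

Using dict.get with a default removes the separate "not in" branches, and comparing against the two bounds avoids building a list of years just to test membership.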

Kenny Ostrom