My program needs a function that reads data from a CSV file ("all.csv"), extracts every row for a given state on a specific date (each row that contains both the state name and the date), and writes the extracted rows to another CSV file named state + ".csv".
While the rows are being written, the cases and deaths for that state on that date are totaled. The function then returns the totals as a tuple (cases, deaths).
Example: state = 'California', date = '2020-03-09'
The error I get is that '0.0' and 'deaths' cannot be converted to an int. The first row is the header, which is where the 'deaths' value comes from. So I have two questions:
- How can I skip the header 'deaths' (last column) and move on to the rest of the data?
- How can I convert the rest of the data (a string in decimal format) to an int?
Note: When I saved the linked data to 'all.csv', the deaths column was converted to decimal format (0.0).
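For the second question, a minimal sketch of one possible conversion, assuming the decimal strings always hold whole numbers such as '0.0' or '12.0'. The helper name `to_int` is hypothetical and not part of my program. `int()` rejects a string with a decimal point, so the string is parsed with `float()` first:

```python
def to_int(text):
    """Convert a numeric string such as '0.0' or '12' to an int."""
    # int('0.0') raises ValueError, so parse as a float first.
    # int() then truncates toward zero, which is lossless for whole numbers.
    return int(float(text))


print(to_int('0.0'))
print(to_int('12.0'))
```

An empty string or non-numeric text such as 'deaths' would still raise ValueError here, so the header has to be handled separately.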
Here are the contents of 'all.csv': https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv
This is a snippet of 'all.csv':
Note that 'all.csv' has 7 columns, as opposed to the 6 columns in the CSV file at the link above.
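Because the column count differs between the two files, looking columns up by name rather than by position may be more robust. The sketch below uses `csv.DictReader`, which also consumes the header line itself. The sample rows are made up, and the layout (an unnamed leading column followed by date, county, state, fips, cases, deaths) is an assumption about how 'all.csv' was saved:

```python
import csv
import io

# Made-up sample in the assumed 7-column layout (extra unnamed first column).
sample = io.StringIO(
    ",date,county,state,fips,cases,deaths\n"
    "0,2020-03-09,Alameda,California,6001.0,2,0.0\n"
    "1,2020-03-09,King,Washington,53033.0,83,17.0\n"
)

# DictReader uses the first line as field names, so the header
# is never yielded as a data row.
rows = list(csv.DictReader(sample))

# Columns are addressed by name, so an extra column does not shift anything.
california = [r for r in rows if r["state"] == "California"]
print(california[0]["deaths"])
```

With a real file, the `io.StringIO` object would be replaced by a file opened with `open('all.csv', newline='')`.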
Here is the program I have written:
import csv

input_file = 'all.csv'
state = input()
date = input()  # date format m/d/yyyy
output_file = state + '.csv'


def number_of_cases_deaths_by_date(input_file, output_file, state, date):
    with open(input_file, 'r') as infile:  # read every line of the input file
        contents = infile.readlines()
    with open(output_file, 'w') as outfile:
        writer = csv.writer(outfile)
        for row in range(len(contents)):  # save data in list
            contents[row] = contents[row].split(',')  # split elements
            contents[row][6] = contents[row][6].strip('\n')  # strip \n from last column
        print(contents[3:5])
        cases = 0
        deaths = 0
        for row in range(len(contents)):
            # if row has desired state and date, write it to new file
            if contents[row][3] == state and contents[row][1] == date:
                writer.writerow(contents[row])
                int_cases = int(contents[row][5])
                cases = cases + int_cases
                int_deaths = int(contents[row][6])  # ValueError here: '0.0' is not a valid int literal
                deaths = deaths + int_deaths  # add once per row (was: deaths += deaths + int_deaths)
    return (cases, deaths)


data = number_of_cases_deaths_by_date(input_file, output_file, state, date)
print(data)
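For comparison, here is a rough sketch of how the same function might look with `csv.reader` doing the splitting, `next()` skipping the header, and `int(float(...))` handling the decimal strings. It is only a sketch under assumptions: the function name `total_cases_deaths` is hypothetical, the column positions (date at index 1, state at 3, cases at 5, deaths at 6) are taken from the program above, and treating an empty cell as 0 is my own guess at the desired behavior.

```python
import csv


def total_cases_deaths(input_file, output_file, state, date):
    """Copy rows matching state and date to output_file; return (cases, deaths)."""
    cases = 0
    deaths = 0
    # newline='' is what the csv module documentation recommends for csv files.
    with open(input_file, newline='') as infile, \
            open(output_file, 'w', newline='') as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        next(reader, None)  # consume the header row so it is never parsed as data
        for row in reader:
            if len(row) < 7:
                continue  # skip blank or short lines (assumed 7-column layout)
            if row[3] == state and row[1] == date:
                writer.writerow(row)
                # Assumption: an empty cell counts as 0.
                cases += int(float(row[5] or 0))
                deaths += int(float(row[6] or 0))
    return cases, deaths
```

It would be called the same way as the original, for example `total_cases_deaths('all.csv', state + '.csv', state, date)`, with the date typed in whatever format the saved file actually uses.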