I found most of the solution to my problem in this thread: "Use Python to select rows with a particular range of values in one column".
But when I implement the code, I get an error that I cannot figure out. I'm trying to extract only the rows for subscribers from the Citi Bike data (info here: http://www.citibikenyc.com/system-data).
So here is the code:
    import csv
    with open("E:/Dropbox/PPS/CitiBikeData/2014_Data.csv") as input, open("E:/Dropbox/PPS/CitiBikeData/subscribers.csv", "w") as output:
        reader = csv.DictReader(input, dialect="excel-tab")
        fieldnames = reader.fieldnames
        writer_output = csv.DictWriter(output, fieldnames, dialect="excel-tab")
        writer_output.writeheader()
        for row in reader:
            if int(row['gender']) > 0:
                writer_output.writerow(row)
And here is the error I'm getting:
    C:\Python34\python.exe E:/Dropbox/PPS/CitiBikeData/csvfilter_2.py
    Traceback (most recent call last):
      File "E:/Dropbox/PPS/CitiBikeData/csvfilter_2.py", line 9, in <module>
        if int(row['gender']) > 0:
    KeyError: 'gender'

    Process finished with exit code 1
I understand what a KeyError is (from reading this https://wiki.python.org/moin/KeyError), but I can't figure out why I'm getting the error, or how to fix it.
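One check that might narrow this down is to print the field names that DictReader actually parsed from the header row. The sketch below is only an illustration: the sample text and the `parsed_fieldnames` helper are hypothetical, and I am assuming (not confirming) that the real file could be comma-separated rather than tab-separated. If that assumption holds, the "excel-tab" dialect would read the whole header line as a single field name, so 'gender' would never appear as a key.

```python
import csv
import io

# Hypothetical sample standing in for the first lines of 2014_Data.csv.
# The real file's columns and delimiter are assumptions, not confirmed.
sample = "tripduration,usertype,gender\n634,Subscriber,1\n1547,Customer,0\n"


def parsed_fieldnames(text, dialect):
    """Return the header names DictReader extracts under the given dialect."""
    return csv.DictReader(io.StringIO(text), dialect=dialect).fieldnames


# Same text, parsed once as tab-separated and once as comma-separated.
tab_fields = parsed_fieldnames(sample, "excel-tab")
comma_fields = parsed_fieldnames(sample, "excel")

print(tab_fields)    # a single long name if the data is not tab-separated
print(comma_fields)  # one name per column if the data is comma-separated
print("gender" in tab_fields, "gender" in comma_fields)
```

Running the same check on the real file (printing `reader.fieldnames` right after creating the reader) should show whether 'gender' is among the parsed names, or whether the names differ in delimiter, quoting, or spelling.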