I am working on a coding project to determine whether bodies of water are polluted. For one type of pollution, a water body is considered polluted if more than 10% of the samples in a 5-year window fall outside given criteria. To address this, I wrote the following code:
def testLocationForConv(overDict):
    impairedList=[]
    for pollutant in overDict:
        # list of (date, isOver) tuples for this pollutant
        dateList=overDict[pollutant]
        for date in dateList:
            total=0
            over=0
            for compDate in dateList:
                # count every sample within 1825 days (5 years) of this date
                if int(date[0])+1825>int(compDate[0]) and int(date[0])-1825<int(compDate[0]):
                    total=total+1
                    if date[1]:
                        over=over+1
            if total!=0:
                if over/total>=.1:
                    if pollutant not in impairedList:
                        impairedList.append(pollutant)
    return impairedList
The code takes a dictionary and produces a list of the pollutants that exceed the limit for a water body. The keys of the dictionary are strings with the names of pollutants, and each value is dateList, a list of tuples. In each tuple, the first item is the date of a test (a day number stored as a string) and the second is a boolean that indicates whether the value measured on that day is over the acceptable value.
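To sanity-check an input of this shape, a small helper can report the time span and the overall fraction of samples over the limit. This is only a sketch: the name summarize is hypothetical, and it assumes every date string converts cleanly to an integer day number.

```python
def summarize(date_list):
    """Return (span_days, total, over, fraction_over) for one pollutant's samples."""
    # Dates arrive as numeric strings, so convert before doing arithmetic.
    days = [int(day) for day, _ in date_list]
    total = len(date_list)
    over = sum(1 for _, is_over in date_list if is_over)
    span_days = max(days) - min(days) if days else 0
    fraction_over = over / total if total else 0.0
    return span_days, total, over, fraction_over


# First six tuples of the Escherichia coli example in this post.
excerpt = [('40283', False), ('40317', False), ('40350', False),
           ('40374', False), ('40408', True), ('40437', True)]
print(summarize(excerpt))
```

A span under 1825 days means all samples fit in one 5-year window, so the overall fraction is the number to compare against 10%.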
Here is an example "overDict" that the code would take as an input:
{'Escherichia coli': [('40283', False), ('40317', False), ('40350', False), ('40374', False), ('40408', True), ('40437', True), ('40465', False), ('40505', False), ('40521', False), ('40569', False), ('40597', False), ('40619', False), ('40647', False), ('40681', False), ('40710', False), ('40738', False), ('40772', False), ('40801', True), ('40822', False), ('40980', False), ('41011', False), ('41045', False), ('41067', False), ('41228', False), ('41388', False), ('41409', False), ('41438', False), ('41466', False), ('41557', False), ('41592', False), ('41710', False), ('41743', False), ('41773', False), ('41802', False), ('41834', False)]}
For this example, the code reports an exceedance, but it should not: fewer than 10% of the tests are True (3 of 35), and all tests fall within a single 5-year period. What is incorrect here?
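One possible cause, offered as a guess rather than a confirmed diagnosis: the inner loop tests date[1], the flag of the outer sample, instead of compDate[1], the flag of the sample being counted. If that is the case, over equals total whenever the outer sample is True, so a single True sample is enough to flag the pollutant. The sketch below reproduces that counting on a hypothetical 12-sample list; posted_counts, fixed_counts and samples are illustrative names that are not in the code above.

```python
# Hypothetical data: 12 samples 30 days apart, only the fifth is over the limit.
samples = [(str(40000 + 30 * i), i == 4) for i in range(12)]


def posted_counts(date, date_list, half_width=1825):
    """Count as the posted inner loop does, testing the outer sample's flag."""
    total = over = 0
    for comp in date_list:
        if int(date[0]) + half_width > int(comp[0]) > int(date[0]) - half_width:
            total += 1
            if date[1]:  # same value for every comp in this pass
                over += 1
    return total, over


def fixed_counts(date, date_list, half_width=1825):
    """Same window, but test the flag of the sample being counted."""
    total = over = 0
    for comp in date_list:
        if int(date[0]) + half_width > int(comp[0]) > int(date[0]) - half_width:
            total += 1
            if comp[1]:
                over += 1
    return total, over


print(posted_counts(samples[4], samples))  # (12, 12): looks like 100% over
print(fixed_counts(samples[4], samples))   # (12, 1): about 8% over
```

Note that the window test in both versions reaches 1825 days on either side of the date, which covers up to 10 years rather than 5.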
Update: When I use this dictionary as the overDict, the code reports no exceedance, even though in the window that starts at 40745, 2 out of 11 values are over the limit:
{'copper': [('38834', False), ('38867', False), ('38897', False),
('40745', False), ('40764', False), ('40799', False), ('41024', True),
('41047', False), ('41072', True), ('41200', False), ('41411', False),
('41442', False), ('41477', False), ('41502', False)]}
To troubleshoot, I printed sliding_windows under the "for tuple" and "for window" lines of code. Instead of a list in which each distinct start date appears once, I got this:
[[38834, 0, 1]]
[[38834, 0, 1]]
[[38834, 0, 1]]
[[38834, 0, 1]]
[[38834, 0, 1]]
[[38834, 0, 1]]
[[38834, 0, 1]]
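For comparison, here is a minimal sketch of one way to sweep a forward-looking window over the copper data from the update. It rests on several assumptions: each window starts at a sample date and covers the following 1825 days (start inclusive, end exclusive), the threshold is strictly greater than 10% as in the rule stated at the top, and the names impaired_pollutants and WINDOW_DAYS are hypothetical.

```python
WINDOW_DAYS = 1825  # 5 years of 365 days, the figure used in the posted code


def impaired_pollutants(over_dict, threshold=0.1, window_days=WINDOW_DAYS):
    """Return pollutants with any forward window whose over-fraction exceeds threshold."""
    impaired = []
    for pollutant, date_list in over_dict.items():
        # Convert the string dates once and sort them chronologically.
        samples = sorted((int(day), bool(is_over)) for day, is_over in date_list)
        for start, _ in samples:
            # Flags of every sample from start up to, but not including, start + window_days.
            in_window = [is_over for day, is_over in samples
                         if start <= day < start + window_days]
            # in_window always contains the start sample, so it is never empty.
            if sum(in_window) / len(in_window) > threshold:
                impaired.append(pollutant)
                break  # one exceeding window is enough for this pollutant
    return impaired


copper = {'copper': [('38834', False), ('38867', False), ('38897', False),
                     ('40745', False), ('40764', False), ('40799', False),
                     ('41024', True), ('41047', False), ('41072', True),
                     ('41200', False), ('41411', False), ('41442', False),
                     ('41477', False), ('41502', False)]}
print(impaired_pollutants(copper))  # ['copper']
```

With these assumptions the window starting at 40745 holds 11 samples, 2 of them over the limit, which is above 10%. Windows that start near the end of the record contain very few samples, so whether to require a minimum sample count per window is a policy decision this sketch leaves open.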