I am a relatively new Python user and am trying to write a function that returns the latitude and longitude for a city and country using the "geopy" module. I was getting errors when a city was misspelled, which I have managed to catch. The trouble I am having now is a timeout error. I have read the question "Geopy: catch timeout error" and adjusted my timeout parameter accordingly. However, the script now runs for varying lengths of time before raising a timeout error. Running it over faster networks helps to some degree, but I need to do this for 100k rows, and the most rows it has processed before timing out is 20k. Any help or advice on how to solve this problem is greatly appreciated.
import os
from geopy.geocoders import Nominatim
os.getcwd() #check current working directory
os.chdir(r"C:\Users\Philip\Documents\HDSDA1\Project\Global Terrorism Database")
#import file as a csv
import csv
gtd=open("gtd_original.csv","r")
csv_f=csv.reader(gtd)
outf=open("r_ready.csv","wb")
writer=csv.writer(outf,dialect='excel')
for row in csv_f:
    if row[13] in ("", "NA") or row[14] in ("", "NA"):
        lookup = row[12] + "," + row[8]  # creates a city,country
        geolocator = Nominatim()
        location = geolocator.geocode(lookup, timeout=None)  # looks up the city/country on maps
        try:
            location.latitude
        except:
            lookup = row[8]
            location = geolocator.geocode(lookup)
        row[13] = location.latitude
        row[14] = location.longitude
    writer.writerow(row)
gtd.close()
outf.close()
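
One direction that might help with a run of this size, offered as a rough sketch and not a confirmed fix: retry a lookup when it times out, and cache results so that each distinct city,country string is only sent to the geocoder once. The helper names `geocode_with_retry` and `make_cached`, and the parameters `retries`, `delay`, `retry_on`, and `sleep`, are hypothetical and not part of geopy. The helpers take any callable, so with geopy one would presumably pass `geolocator.geocode` and set `retry_on=(GeocoderTimedOut,)` (from `geopy.exc`).

```python
import time


def geocode_with_retry(geocode, query, retries=3, delay=1.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call geocode(query), retrying when one of retry_on is raised.

    Returns the geocoder's result, or None if every attempt failed.
    All names here are illustrative, not part of geopy.
    """
    for attempt in range(retries):
        try:
            return geocode(query)
        except retry_on:
            # Wait a little longer after each failure (linear backoff),
            # but do not wait after the final attempt.
            if attempt < retries - 1:
                sleep(delay * (attempt + 1))
    return None


def make_cached(lookup):
    """Wrap lookup so each distinct query is only looked up once."""
    cache = {}

    def cached(query):
        if query not in cache:
            cache[query] = lookup(query)
        return cache[query]

    return cached


# Hypothetical wiring with geopy (assumed, not executed here):
#   from geopy.exc import GeocoderTimedOut
#   geolocator = Nominatim()
#   lookup = make_cached(lambda q: geocode_with_retry(
#       geolocator.geocode, q, retry_on=(GeocoderTimedOut,)))
#   location = lookup(row[12] + "," + row[8])
```

If many of the 100k rows share the same city and country, the cache alone may cut the number of requests considerably. Note that `lookup` can still return `None` (city not found, or all retries failed), so the caller would need to check for that before reading `.latitude`.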