I want to convert county and state names to coordinates.
The DataFrame county looks like this:
 fips  state_fips  county_fips  state       county
 1000           1            0  Alabama     Alabama
 1005           1            5  Alabama     Barbour County
 1007           1            7  Alabama     Bibb County
 1009           1            9  Alabama     Blount County
 1011           1           11  Alabama     Bullock County
 6085           6           85  California  Santa Clara County
 6089           6           89  California  Shasta County
 6091           6           91  California  Sierra County
32021          32           21  Nevada      Mineral County
32023          32           23  Nevada      Nye County
32027          32           27  Nevada      Pershing County
32029          32           29  Nevada      Storey County
I want to use the Python geopy package.
from geopy.geocoders import Nominatim
import pandas as pd
import numpy as np
import time

geolocator = Nominatim(timeout=None)
fobj_out = open('county_coordinate.txt', 'a')
i = 0
for row in county.itertuples(index=True, name='Pandas'):
    location = geolocator.geocode(getattr(row, "county"))
    #print(location.address)
    #print((location.latitude, location.longitude))
    cty = getattr(row, "county")
    #lat = location.latitude
    #long = location.longitude
    fobj_out.write(str(i))
    i = i + 1
    fobj_out.write(",")
    fobj_out.write(cty)
    fobj_out.write(",")
    fobj_out.write(str(location.latitude))
    fobj_out.write(",")
    fobj_out.write(str(location.longitude))
    fobj_out.write("\n")
    time.sleep(0.5)  # wait 0.5 seconds (500 milliseconds) between requests
fobj_out.close()
I got the error:
TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
urllib.error.URLError: <urlopen error [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>
geopy.exc.GeocoderServiceError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
First, I hoped that time.sleep would keep me from sending too many requests per second. However, the program failed after 90 requests. How can I make the program run to the end of my dataset of 4000 observations?
Second, when I closed the output file to flush the cached results to it, I saw that itertuples does not seem to go through the rows in order. I expected the program to process the rows one by one, so that even if it stops, I could continue with the unfinished part starting from i+1. However, itertuples jumps into the middle of my dataset, which leaves the index i meaningless. How can I make itertuples follow the order of the DataFrame index?
Thanks.
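For illustration, here is a minimal sketch of the retry-and-resume behaviour described in the two questions above. It is not taken from geopy; the helper names geocode_with_retry, load_done_keys and geocode_rows are hypothetical, and the geocoding function is passed in as a parameter so the logic can be tried without network access. It assumes each row has a unique key (for example the fips column) that is written as the first field of the output file, so that a restarted run can skip what is already there instead of relying on a counter.

```python
import csv
import os
import time


def geocode_with_retry(geocode_fn, query, retries=3, delay=1.0,
                       retry_on=(Exception,)):
    """Call geocode_fn(query), pausing longer after each failed attempt."""
    for attempt in range(retries):
        try:
            return geocode_fn(query)
        except retry_on:
            if attempt == retries - 1:
                raise  # give up after the last attempt
            time.sleep(delay * (attempt + 1))


def load_done_keys(out_path):
    """Return the keys (first CSV column) already present in out_path."""
    if not os.path.exists(out_path):
        return set()
    with open(out_path, newline="") as f:
        return {row[0] for row in csv.reader(f) if row}


def geocode_rows(rows, geocode_fn, out_path, pause=1.0, retries=3, delay=1.0,
                 retry_on=(Exception,)):
    """Append key,query,lat,lon for each (key, query) pair not yet written."""
    done = load_done_keys(out_path)
    with open(out_path, "a", newline="") as f:
        writer = csv.writer(f)
        for key, query in rows:
            if str(key) in done:
                continue  # already geocoded in an earlier run
            location = geocode_with_retry(
                geocode_fn, query, retries=retries, delay=delay,
                retry_on=retry_on)
            if location is None:
                lat, lon = "", ""  # the geocoder found no match
            else:
                lat, lon = location.latitude, location.longitude
            writer.writerow([key, query, lat, lon])
            f.flush()  # hand the row to the OS before the next request
            time.sleep(pause)
```

With the code from the question, this could be called as geocode_rows(((r.fips, r.county + ", " + r.state) for r in county.itertuples()), geolocator.geocode, 'county_coordinate.txt', retry_on=(geopy.exc.GeocoderServiceError,)). The default pause of one second reflects the public Nominatim usage policy of at most one request per second; the retry count and delay are arbitrary starting values.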