
I want to convert county and state names to coordinates.

county:

fips   state_fips  county_fips  state       county
1000   1           0            Alabama     Alabama
1005   1           5            Alabama     Barbour County
1007   1           7            Alabama     Bibb County
1009   1           9            Alabama     Blount County
1011   1           11           Alabama     Bullock County
6085   6           85           California  Santa Clara County
6089   6           89           California  Shasta County
6091   6           91           California  Sierra County
32021  32          21           Nevada      Mineral County
32023  32          23           Nevada      Nye County
32027  32          27           Nevada      Pershing County
32029  32          29           Nevada      Storey County

I want to use the Python geopy package.

from geopy.geocoders import Nominatim
import pandas as pd
import numpy as np
import time

geolocator = Nominatim(timeout=None)
fobj_out = open('county_coordinate.txt', 'a')
i=0
for row in county.itertuples(index=True, name='Pandas'):
    location = geolocator.geocode(getattr(row, "county"))
    #print(location.address)
    #print((location.latitude, location.longitude))
    cty=getattr(row, "county")
    #lat = location.latitude
    #long = location.longitude
    fobj_out.write(str(i))
    i=i+1
    fobj_out.write(",")
    fobj_out.write(cty)
    fobj_out.write(",")
    fobj_out.write(str(location.latitude))
    fobj_out.write(",")
    fobj_out.write(str(location.longitude))
    fobj_out.write("\n")
    time.sleep(0.5) # delay 0.5 seconds (500 milliseconds) between requests
fobj_out.close()

I got the error:

TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
urllib.error.URLError: <urlopen error [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>
geopy.exc.GeocoderServiceError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

First, I hoped that time.sleep would help me avoid sending too many requests per second. However, the program failed after 90 requests. How can I make the program run to the end of my dataset of 4000 observations?

Second, when I closed the output file so that the buffered results were written to it, I saw that itertuples does not seem to process the rows in order. I expected the program to go through the rows one by one, so that even if it stopped I could resume the unfinished part from row i+1. However, itertuples appears to jump to the middle of my dataset, which makes the index i meaningless. How can I make itertuples follow the order of the DataFrame index?

Thanks.

zilong
  • Isn't this a duplicate of your own question? https://stackoverflow.com/questions/51393049/convert-county-state-names-to-coordinates If so, consider deleting this one and editing the first one instead. – KostyaEsmukov Jul 19 '18 at 08:49
  • True. The two ways do not work well because they all stop after 80 requests and couldn't complete. – zilong Jul 20 '18 at 08:08
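
A possible approach to the two issues raised in the question, offered as a rough sketch rather than a confirmed fix. The helper name geocode_rows and its parameters are made up for illustration; it accepts any geocoding callable, so it can be tried without network access. It assumes the timeouts come from the service throttling requests (the public Nominatim usage policy asks for at most one request per second), so it waits one second between calls and retries after failures. It also writes and flushes one row at a time, keyed by FIPS code, so a rerun skips rows that are already in the file. As far as I know, itertuples yields rows in DataFrame order; output that looks out of order is more likely caused by the file being opened in append mode across several runs, or by buffered writes, but that is a guess.

```python
import csv
import os
import time


def geocode_rows(rows, geocode, out_path, delay=1.0, max_retries=3, sleep=time.sleep):
    """Geocode (key, query) pairs in order, appending results to a CSV file.

    rows:    iterable of (key, query), e.g. (1005, "Barbour County, Alabama").
    geocode: any callable returning an object with .latitude and .longitude,
             or None when nothing is found (geopy geocoders behave this way).
    Keys already present in out_path are skipped, so a rerun resumes the job.
    """
    done = set()
    if os.path.exists(out_path):
        with open(out_path, newline="") as f:
            done = {rec[0] for rec in csv.reader(f) if rec}

    with open(out_path, "a", newline="") as f:
        writer = csv.writer(f)
        for key, query in rows:
            key = str(key)
            if key in done:
                continue
            for attempt in range(max_retries):
                try:
                    location = geocode(query)
                    break
                except Exception:  # e.g. geopy.exc.GeocoderServiceError
                    sleep(delay * (attempt + 2))  # wait longer after each failure
            else:
                continue  # every attempt failed; leave this key for the next run
            if location is None:
                writer.writerow([key, query, "", ""])  # no match found
            else:
                writer.writerow([key, query, location.latitude, location.longitude])
            f.flush()  # push the row to disk before the next request
            sleep(delay)


# Hypothetical use with geopy (assumes geopy is installed and `county` is the
# DataFrame shown in the question; the user_agent string is a placeholder):
#   from geopy.geocoders import Nominatim
#   geolocator = Nominatim(user_agent="county-geocoder-example", timeout=10)
#   rows = ((r.fips, f"{r.county}, {r.state}") for r in county.itertuples(index=False))
#   geocode_rows(rows, geolocator.geocode, "county_coordinate.csv")
```

Querying "county, state" instead of the county name alone should also reduce ambiguous matches, since many county names repeat across states. geopy also ships geopy.extra.rate_limiter.RateLimiter, which wraps a geocode function with a minimum delay and retries, and may be a simpler option if resuming is not needed.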
