I am using geopy to get latitude/longitude pairs for city names. For single queries, this works fine. What I am trying to do now is iterate through a big list of city names (46,000) and get the geocode for each city. Afterwards, I run each result through a check loop that sorts the city (if it is in the US) into the correct state. My problem is that I get "GeocoderTimedOut('Service timed out')" all the time, everything is pretty slow, and I'm not sure whether that is my fault or just geopy's nature. Here is the responsible code snippet:
for tweetcount in range(number_of_tweets):
    # Get the city name from the tweet
    city = data_dict[0]['tweetList'][tweetcount]['user']['location']
    # Sort out useless tweets (check for None first, since len(None) raises a TypeError)
    if city is not None and len(city) > 3:
        # THE RESPONSIBLE LINE, here the error occurs
        location = geolocator.geocode(city)
        # Here the sorting into the state takes place
        if location is not None:
            for statecount in range(len(data)):
                if point_in_poly(location.longitude, location.latitude, data[statecount]['geometry']):
                    state_tweets[statecount] += 1
                    break
Somehow, this one line times out on every second or third call. The city string has the form "Manchester", "New York, New York", or something similar. I already had try/except blocks around everything, but that doesn't really change anything about the problem, so I removed them for now. Any ideas would be great!
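
To make the try/except idea concrete, here is a minimal sketch of a retry wrapper of the kind I have in mind. It is not the code I removed: the helper name geocode_with_retry, the number of attempts, and the delay are assumptions chosen for illustration. It takes the geocoding function as an argument, so it can be tried without a network connection.

```python
import time


def geocode_with_retry(geocode, query, retry_on=(Exception,), max_attempts=3, delay=1.0):
    """Call geocode(query), retrying when one of the `retry_on` exceptions is raised.

    Hypothetical helper, not part of geopy. Returns the geocoder's result,
    or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return geocode(query)
        except retry_on:
            if attempt < max_attempts:
                # Wait a little longer after each failure before retrying.
                time.sleep(delay * attempt)
    return None


# Intended use with geopy (not executed here):
#   from geopy.exc import GeocoderTimedOut
#   location = geocode_with_retry(geolocator.geocode, city,
#                                 retry_on=(GeocoderTimedOut,))
```

In the loop above, this would stand in for the direct geolocator.geocode(city) call. A city that still fails after the last attempt comes back as None and is skipped by the existing "if location is not None" check. This only retries failed calls; it would not make the individual requests any faster.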