
I get errors for some addresses when geocoding with geopy (using Nominatim). I cannot see a pattern in which addresses fail and which do not: simply changing the house number can make the difference.

When I make the API request mentioned by the error message via urllib3 it works, so I suppose the error is caused by geopy, but I am not sure.

Minimal reproducible example

from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="my-test-app")
geolocator.geocode({'country': 'DE', 'city': 'Erlangen', 'postalcode': '91052',
                                'street': 'Nürnberger Straße 6'}) # working

>>> Location(Nürnberger Straße, Sebaldussiedlung, Erlangen, Bayern, 91052, Deutschland, (49.5772384, 11.015895, 0.0))

geolocator.geocode({'country': 'DE', 'city': 'Erlangen', 'postalcode': '91052',
                                'street': 'Nürnberger Straße 7'}) # error

Error message

---------------------------------------------------------------------------
timeout                                   Traceback (most recent call last)
C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\urllib3\connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    425                     # Otherwise it looks like a bug in the code.
--> 426                     six.raise_from(e, None)
    427         except (SocketTimeout, BaseSSLError, SocketError) as e:

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\urllib3\packages\six.py in raise_from(value, from_value)

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\urllib3\connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    420                 try:
--> 421                     httplib_response = conn.getresponse()
    422                 except BaseException as e:

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\http\client.py in getresponse(self)
   1353             try:
-> 1354                 response.begin()
   1355             except ConnectionError:

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\http\client.py in begin(self)
    305         while True:
--> 306             version, status, reason = self._read_status()
    307             if status != CONTINUE:

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\http\client.py in _read_status(self)
    266     def _read_status(self):
--> 267         line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
    268         if len(line) > _MAXLINE:

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\socket.py in readinto(self, b)
    588             try:
--> 589                 return self._sock.recv_into(b)
    590             except timeout:

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\ssl.py in recv_into(self, buffer, nbytes, flags)
   1070                   self.__class__)
-> 1071             return self.read(nbytes, buffer)
   1072         else:

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\ssl.py in read(self, len, buffer)
    928             if buffer is not None:
--> 929                 return self._sslobj.read(len, buffer)
    930             else:

timeout: The read operation timed out

During handling of the above exception, another exception occurred:

ReadTimeoutError                          Traceback (most recent call last)
C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\urllib3\connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    676                 headers=headers,
--> 677                 chunked=chunked,
    678             )

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\urllib3\connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    427         except (SocketTimeout, BaseSSLError, SocketError) as e:
--> 428             self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
    429             raise

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\urllib3\connectionpool.py in _raise_timeout(self, err, url, timeout_value)
    335             raise ReadTimeoutError(
--> 336                 self, url, "Read timed out. (read timeout=%s)" % timeout_value
    337             )

ReadTimeoutError: HTTPSConnectionPool(host='nominatim.openstreetmap.org', port=443): Read timed out. (read timeout=1)

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\requests\adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    448                     retries=self.max_retries,
--> 449                     timeout=timeout
    450                 )

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\urllib3\connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    766                 body_pos=body_pos,
--> 767                 **response_kw
    768             )

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\urllib3\connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    766                 body_pos=body_pos,
--> 767                 **response_kw
    768             )

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\urllib3\connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    726             retries = retries.increment(
--> 727                 method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
    728             )

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\urllib3\util\retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    445         if new_retry.is_exhausted():
--> 446             raise MaxRetryError(_pool, url, error or ResponseError(cause))
    447 

MaxRetryError: HTTPSConnectionPool(host='nominatim.openstreetmap.org', port=443): Max retries exceeded with url: /search?country=DE&city=Erlangen&postalcode=91052&street=N%C3%BCrnberger+Stra%C3%9Fe+7&format=json&limit=1 (Caused by ReadTimeoutError("HTTPSConnectionPool(host='nominatim.openstreetmap.org', port=443): Read timed out. (read timeout=1)"))

During handling of the above exception, another exception occurred:

ConnectionError                           Traceback (most recent call last)
C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\geopy\adapters.py in _request(self, url, timeout, headers)
    382         try:
--> 383             resp = self.session.get(url, timeout=timeout, headers=headers)
    384         except Exception as error:

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\requests\sessions.py in get(self, url, **kwargs)
    554         kwargs.setdefault('allow_redirects', True)
--> 555         return self.request('GET', url, **kwargs)
    556 

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\requests\sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    541         send_kwargs.update(settings)
--> 542         resp = self.send(prep, **send_kwargs)
    543 

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\requests\sessions.py in send(self, request, **kwargs)
    654         # Send the request
--> 655         r = adapter.send(request, **kwargs)
    656 

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\requests\adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    515 
--> 516             raise ConnectionError(e, request=request)
    517 

ConnectionError: HTTPSConnectionPool(host='nominatim.openstreetmap.org', port=443): Max retries exceeded with url: /search?country=DE&city=Erlangen&postalcode=91052&street=N%C3%BCrnberger+Stra%C3%9Fe+7&format=json&limit=1 (Caused by ReadTimeoutError("HTTPSConnectionPool(host='nominatim.openstreetmap.org', port=443): Read timed out. (read timeout=1)"))

During handling of the above exception, another exception occurred:

GeocoderUnavailable                       Traceback (most recent call last)
<ipython-input-4-aa66519ee9b9> in <module>()
----> 1 geolocator.geocode({'country': 'DE', 'city': 'Erlangen', 'postalcode': '91052', 'street': 'Nürnberger Straße 7'})

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\geopy\geocoders\nominatim.py in geocode(self, query, exactly_one, timeout, limit, addressdetails, language, geometry, extratags, country_codes, viewbox, bounded, featuretype, namedetails)
    292         logger.debug("%s.geocode: %s", self.__class__.__name__, url)
    293         callback = partial(self._parse_json, exactly_one=exactly_one)
--> 294         return self._call_geocoder(url, callback, timeout=timeout)
    295 
    296     def reverse(

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\geopy\geocoders\base.py in _call_geocoder(self, url, callback, timeout, is_json, headers)
    358         try:
    359             if is_json:
--> 360                 result = self.adapter.get_json(url, timeout=timeout, headers=req_headers)
    361             else:
    362                 result = self.adapter.get_text(url, timeout=timeout, headers=req_headers)

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\geopy\adapters.py in get_json(self, url, timeout, headers)
    371 
    372     def get_json(self, url, *, timeout, headers):
--> 373         resp = self._request(url, timeout=timeout, headers=headers)
    374         try:
    375             return resp.json()

C:\Users\USERNAME\Anaconda3\envs\crm_templates\lib\site-packages\geopy\adapters.py in _request(self, url, timeout, headers)
    393                     raise GeocoderServiceError(message)
    394                 else:
--> 395                     raise GeocoderUnavailable(message)
    396             elif isinstance(error, requests.Timeout):
    397                 raise GeocoderTimedOut("Service timed out")

GeocoderUnavailable: HTTPSConnectionPool(host='nominatim.openstreetmap.org', port=443): Max retries exceeded with url: /search?country=DE&city=Erlangen&postalcode=91052&street=N%C3%BCrnberger+Stra%C3%9Fe+7&format=json&limit=1 (Caused by ReadTimeoutError("HTTPSConnectionPool(host='nominatim.openstreetmap.org', port=443): Read timed out. (read timeout=1)"))

Working example using urllib3

import json
import urllib3

http = urllib3.PoolManager(1, headers={'user-agent': 'my-test-app'})

url = 'https://nominatim.openstreetmap.org/search?country=DE&city=Erlangen&postalcode=91052&street=N%C3%BCrnberger+Stra%C3%9Fe+7&format=json&limit=1'

resp = http.request('GET', url)

json.loads(resp.data.decode())

  
>>> [{'place_id': 17025708,
>>>   'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright',
>>>   'osm_type': 'node',
>>>   'osm_id': 1641967158,
>>>   'boundingbox': ['49.5924431', '49.5925431', '11.0043901', '11.0044901'],
>>>   'lat': '49.5924931',
>>>   'lon': '11.0044401',
>>>   'display_name': 'Postbank, 7, Nürnberger Straße, Am Anger, Erlangen, Bayern, 91052, Deutschland',
>>>   'class': 'amenity',
>>>   'type': 'bank',
>>>   'importance': 0.6309999999999999,
>>>   'icon': 'https://nominatim.openstreetmap.org/ui/mapicons//money_bank2.p.20.png'}]
BodoB

2 Answers


It seems these services are hosted on donated servers, so Nominatim asks users to avoid heavy use. That is the only reason I can think of for the failure: previously I was extracting data for a few hundred locations, and later I started doing it for thousands. See the Nominatim Usage Policy.
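If request volume is indeed the cause, spacing out the calls may help. Below is a minimal sketch of a throttle, assuming one request per second is an acceptable pace (the usage policy is the authority on the actual limit). The `Throttle` class and its parameters are made up for illustration; geopy ships its own `geopy.extra.rate_limiter.RateLimiter`, which is probably the better ready-made option.

```python
import time


class Throttle:
    """Wrap a callable so it runs at most once every `min_delay` seconds.

    Hypothetical helper for illustration; `clock` and `sleep` are
    injectable so the behaviour can be checked without real waiting.
    """

    def __init__(self, func, min_delay=1.0, clock=time.monotonic, sleep=time.sleep):
        self.func = func
        self.min_delay = min_delay
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # time of the previous call, if any

    def __call__(self, *args, **kwargs):
        if self._last_call is not None:
            # Wait only for the part of the delay that has not yet elapsed.
            wait = self.min_delay - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        return self.func(*args, **kwargs)
```

Wrapping the geocoding call once, for example `geocode = Throttle(geolocator.geocode, min_delay=1.0)`, and then calling `geocode(...)` in the loop would pace the requests.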

I resorted to an alternative solution because fixing this error was proving very difficult. The alternative uses Requests: my script reads the latitude and longitude from Excel and fires an HTTP request at the "reverse" REST API. I was able to fire 1000 requests in each iteration. The extracted information is written to the Excel output file. Here is my code snippet:

import xml.etree.ElementTree as ET

import requests
from openpyxl import load_workbook

filepath = "/xxx/100KV-BusLoc.xlsx"
wb = load_workbook(filepath)
sheet = wb["Envision"]
#wb["Elec_Sub"]

# Process one batch of rows per run; the result goes into column 20.
for i1 in range(5999, 7000):
    # Skip rows whose first column is empty.
    if sheet.cell(row=i1, column=1).value is not None:
        print(i1)
        cell = sheet.cell(i1, 20)
        # Latitude is read from column 8, longitude from column 9.
        latt = str(sheet.cell(row=i1, column=8).value)
        long = str(sheet.cell(row=i1, column=9).value)
        url = 'https://nominatim.openstreetmap.org/reverse?lat=' + latt + '&lon=' + long
        resp = requests.get(url)
        if resp.status_code != 404:
            # The response body is parsed as XML.
            root = ET.fromstring(resp.text)
            for child in root.findall('addressparts'):
                # Join the available address parts with '|'.
                locDet = ""
                if child.find('road') is not None:
                    locDet = child.find('road').text
                if child.find('municipality') is not None:
                    locDet = locDet + '|' + child.find('municipality').text
                if child.find('county') is not None:
                    locDet = locDet + '|' + child.find('county').text
                if child.find('state') is not None:
                    locDet = locDet + '|' + child.find('state').text
                if child.find('postcode') is not None:
                    locDet = locDet + '|' + child.find('postcode').text
                if child.find('country') is not None:
                    locDet = locDet + '|' + child.find('country').text
                if locDet != "":
                    print(locDet)
                    cell.value = locDet
                    # Save after every hit so progress survives a crash.
                    wb.save(filepath)
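The URL building and the XML handling can also be pulled out into two small functions, which makes them easy to check without touching the network. This is only a sketch: `reverse_url`, `format_address` and `SAMPLE_XML` are hypothetical names, and the sample document is hand-written to mimic the `addressparts` layout the snippet above relies on; it is not a real Nominatim response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Address fields in the order they are joined in the output.
ADDRESS_FIELDS = ("road", "municipality", "county", "state", "postcode", "country")


def reverse_url(lat, lon):
    """Build the reverse-geocoding URL with properly encoded parameters."""
    query = urlencode({"lat": lat, "lon": lon})
    return "https://nominatim.openstreetmap.org/reverse?" + query


def format_address(xml_text):
    """Return the available address parts joined with '|' ('' if none)."""
    root = ET.fromstring(xml_text)
    parts = root.find("addressparts")
    if parts is None:
        return ""
    values = [parts.findtext(field) for field in ADDRESS_FIELDS]
    # Missing or empty fields are skipped, so no stray separators appear.
    return "|".join(value for value in values if value)


# Hand-written sample for illustration only, not real service output.
SAMPLE_XML = """<reversegeocode>
  <result>Example Road, Example State, 12345, Exampleland</result>
  <addressparts>
    <road>Example Road</road>
    <state>Example State</state>
    <postcode>12345</postcode>
    <country>Exampleland</country>
  </addressparts>
</reversegeocode>"""

print(format_address(SAMPLE_XML))
```

Unlike the string concatenation above, joining a filtered list does not leave a leading '|' when the road is missing.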
Ashwin Kumar
  • All code, data, error messages, etc must be embedded as text. Text in images is not copy-pasteable or searchable. It is also difficult to read, especially on mobile devices. Please edit your answer to avoid removal by SO. Also, quality answers generally include an explanation to highlight key elements of your code & how/why they address the OP's issue. You can read more guidance in the help section. Most upvotes are over time, when future visitors learn something from your post that they can apply to their own issues. – SherylHohman Feb 02 '21 at 18:46
  • Thanks for your suggestion. I have modified it accordingly. Feel free to let me know if you see any further changes needed. – Ashwin Kumar Feb 10 '21 at 16:52

My minimal reproducible example no longer triggers the error, neither in an outdated environment with an old version of geopy (1.22) nor in an environment with the latest version of geopy (2.1.0), so I am not really sure what changed.
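One detail in the traceback may be relevant: it reports `read timeout=1`, so the request was abandoned after one second, whereas the urllib3 call in the question sets no timeout at all. A lookup that is slow for one particular address could then fail in geopy and still succeed via urllib3. This is an assumption, not a confirmed diagnosis. A minimal sketch of retrying with a growing delay follows; `call_with_retries` and its parameters are hypothetical names, and the geopy usage is left as a comment because it needs geopy installed and network access.

```python
import time


def call_with_retries(func, *args, retries=3, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying on the given exception types.

    The delay doubles after each failed attempt (base_delay, 2*base_delay, ...).
    After `retries` retries the last exception is re-raised.
    """
    for attempt in range(retries + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)


# Possible usage with geopy (not executed here):
#
# from geopy.exc import GeocoderTimedOut, GeocoderUnavailable
# from geopy.geocoders import Nominatim
#
# geolocator = Nominatim(user_agent="my-test-app", timeout=10)
# location = call_with_retries(
#     geolocator.geocode,
#     {'country': 'DE', 'city': 'Erlangen', 'postalcode': '91052',
#      'street': 'Nürnberger Straße 7'},
#     retry_on=(GeocoderTimedOut, GeocoderUnavailable),
# )
```

Passing `timeout=10` to `Nominatim(...)` or to `geocode(...)` raises the limit for the request itself; the retry wrapper only covers failures that remain.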

BodoB
  • Did it start working after some time? Then, is it because of the limited usage of the Nominatim servers per IP address? – Varad Pimpalkhute Mar 03 '21 at 08:18
  • @VaradPimpalkhute: The error was not caused by the Nominatim rate limit, the problem was caused by specific addresses. But I cannot reproduce it anymore, so it is speculative to pinpoint the error now. – BodoB Mar 05 '21 at 09:37
  • I see, I had the same error, but it got resolved after waiting for 5 minutes. Thanks for the reply. – Varad Pimpalkhute Mar 06 '21 at 10:56
  • I am seeing similar errors when getting the location of certain entities using Nominatim. It has nothing to do with the rate limiting :/ – lmurdock12 Feb 23 '22 at 18:17
  • Hey, got the same error regardless of the input (even the sample ones on the geopy documentation page). However when I looked for the coordinates through the website I didn't have any issues. Weird. – Adorable Sep 19 '22 at 14:39