While using the Python requests module to check for broken links (expecting error status codes such as 404 for unavailable URLs), I came across an old link on a page that points to a domain which no longer exists. Instead of a status code, the request raised an exception caused by a DNS error.

<snip>
http://listen.grooveshark.com/s/Folk+Song/2Np5i?src=5
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 171, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 56, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/socket.py", line 748, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1275, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1224, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1016, in _send_output
    self.send(msg)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 956, in send
    self.connect()
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 196, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 180, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x1088ff748>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 445, in send
    timeout=timeout
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 398, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='listen.grooveshark.com', port=80): Max retries exceeded with url: /s/Folk+Song/2Np5i?src=5 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x1088ff748>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./four_finder.py", line 64, in <module>
    do_main()
  File "./four_finder.py", line 48, in do_main
    resp = requests.get(link)
  File "/usr/local/lib/python3.7/site-packages/requests/api.py", line 72, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 512, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 622, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 513, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='listen.grooveshark.com', port=80): Max retries exceeded with url: /s/Folk+Song/2Np5i?src=5 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x1088ff748>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
[user@host:~/dev|22:13:13]
$ nslookup listen.grooveshark.com
(...)

** server can't find listen.grooveshark.com: SERVFAIL

[user@host:~/dev|22:28:05]

What is the best way to guard against this error? I can't be confident that the domains in future input URLs will still exist (the one in this example does not).

eyllanesc
user3.1415927

1 Answer

This is solved by wrapping the request in a try/except block and catching requests.ConnectionError. I hadn't searched SE for the right terms; a useful answer is here.

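        try:
            resp = requests.get(link)
        except requests.ConnectionError as e:
            # Raised when no connection can be made, which includes DNS
            # lookup failures such as the one in the question.
            # TODO: Properly handle this DNS error!
            print('DNS ERROR: %s' % link)
            continue  # skip to the next link in the enclosing loop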
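For a link checker it may help to fold the status-code check and the exception handling into one helper, so the calling loop never sees a traceback. The following is only a sketch: the function name `check_link`, the `(ok, detail)` return shape, and the 10-second default timeout are assumptions for illustration and are not part of the original script.

```python
import requests


def check_link(url, timeout=10):
    """Return (ok, detail) for one URL without raising on network errors.

    Hypothetical helper: the name, return shape, and default timeout are
    illustrative choices, not taken from the original script.
    """
    try:
        resp = requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Caught first, because a connect timeout is also a ConnectionError.
        return False, "timed out"
    except requests.exceptions.ConnectionError as exc:
        # Covers DNS lookup failures, refused connections, and similar.
        return False, "connection failed: %s" % exc
    except requests.exceptions.RequestException as exc:
        # Anything else requests can raise, e.g. a malformed URL.
        return False, "request failed: %s" % exc
    if resp.status_code >= 400:
        return False, "HTTP %d" % resp.status_code
    return True, "HTTP %d" % resp.status_code
```

In the loop this would replace the bare `requests.get(link)` call, for example `ok, detail = check_link(link)` followed by printing the link and `detail` whenever `ok` is false.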

Also of interest is this answer.
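If the goal is to detect a dead domain before making the HTTP request at all, one option is to try resolving the hostname first with the standard library. This is a sketch under assumptions: `host_resolves` is a made-up name, and a successful lookup does not guarantee that the later request will succeed, so the try/except around `requests.get` is still needed.

```python
import socket
from urllib.parse import urlsplit


def host_resolves(url):
    """Return True if the URL's hostname can be resolved, else False.

    Hypothetical helper for illustration only.
    """
    host = urlsplit(url).hostname
    if not host:
        return False  # nothing to look up, e.g. a relative link
    try:
        socket.getaddrinfo(host, None)
    except (socket.gaierror, UnicodeError):
        # gaierror: the lookup failed; UnicodeError: the hostname could
        # not be encoded for the lookup.
        return False
    return True
```

A link for which `host_resolves(link)` is false could then be reported as a dead domain and skipped, while the remaining links go through the normal request.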

user3.1415927