I'm running a scraper that's going through a few domains and it's getting hung up on http://www.1000markets.com/

Checking from several different sources, the site appears to be down. That's totally fine, but I'm getting the error mentioned in the title (full traceback below).

How can I catch this? I'm handling `HTTPError` and `URLError`, but the scraper still gets hung up on it.

Any help on this would be great.

import urllib2
from urllib2 import URLError, HTTPError

def get_html(link):
    try:
        res = urllib2.urlopen(link)
        html = res.read()
        return html
    except HTTPError as e:
        # HTTPError subclasses URLError, so it has to be caught first
        return link
    except URLError as e:
        return link

Edit: here is the traceback for the error:

File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 127, in urlopen
  return _opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 404, in open
  response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 422, in _open
  '_open', req)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 382, in _call_chain
  result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1214, in http_open
  return self.do_open(httplib.HTTPConnection, req)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1187, in do_open
  r = h.getresponse(buffering=True)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1045, in getresponse
  response.begin()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 409, in begin
  version, status, reason = self._read_status()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 365, in _read_status
  line = self.fp.readline(_MAXLINE + 1)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 476, in readline
  data = self._sock.recv(self._rbufsize)
socket.error: [Errno 54] Connection reset by peer
shartshooter
  • You asked the same question twice: http://stackoverflow.com/questions/21334966/python-urllib2-errno-54-connection-reset-by-peer . The comment there is probably the solution to your problem. – sudip Jul 27 '14 at 21:41
  • That URL is not reachable. One alternative would be to use the `requests` module and catch `requests.exceptions.ConnectionError`. The 1000markets domain is not reachable, hence the issue. – karthikr Jul 27 '14 at 21:54
  • Possible duplicate of [Python handling socket.error: \[Errno 104\] Connection reset by peer](http://stackoverflow.com/questions/20568216/python-handling-socket-error-errno-104-connection-reset-by-peer) – bummi Feb 27 '16 at 07:34
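
Following up on the comments: the traceback above ends in `socket.error`, raised inside `getresponse()` while the response is being read, and it reaches the caller without being wrapped in `URLError`. That would explain why neither `except` clause in the question fires. Below is a minimal sketch of one way to handle it, assuming that returning `None` on failure is acceptable. The `timeout` default and the `opener` parameter are hypothetical additions (the latter only so the function can be exercised without a network), and the `try`/`except ImportError` lets the same code run on Python 3.

```python
import socket

try:  # Python 2, as in the question
    from urllib2 import urlopen, URLError
except ImportError:  # Python 3 equivalent
    from urllib.request import urlopen
    from urllib.error import URLError


def get_html(link, timeout=10, opener=urlopen):
    """Return the page body, or None if the request fails.

    `timeout` and `opener` are illustrative parameters, not part of
    the original function.
    """
    try:
        res = opener(link, timeout=timeout)
        return res.read()
    except URLError:
        # Also covers HTTPError, which subclasses URLError.
        return None
    except socket.error:
        # Low-level failures such as
        # "[Errno 54] Connection reset by peer".
        return None
```

The scraper can then skip any domain for which `get_html` returns `None` and move on to the next one.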

0 Answers