
Hi, I have a program that simply connects to the web to read a website's data. But when I run the program, I get a complicated error related to the web connection. Please see the program below:

from urllib.request import urlopen
html = urlopen("http://www.pythonscraping.com/pages/page1.html")
print(html.read())

The error message is shown below:

Traceback (most recent call last):
  File "C:\Python34\lib\urllib\request.py", line 1174, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "C:\Python34\lib\http\client.py", line 1090, in request
    self._send_request(method, url, body, headers)
  File "C:\Python34\lib\http\client.py", line 1128, in _send_request
    self.endheaders(body)
  File "C:\Python34\lib\http\client.py", line 1086, in endheaders
    self._send_output(message_body)
  File "C:\Python34\lib\http\client.py", line 924, in _send_output
    self.send(msg)
  File "C:\Python34\lib\http\client.py", line 859, in send
    self.connect()
  File "C:\Python34\lib\http\client.py", line 836, in connect
    self.timeout, self.source_address)
  File "C:\Python34\lib\socket.py", line 491, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "C:\Python34\lib\socket.py", line 530, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11002] getaddrinfo failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python34\scrapetest.py", line 2, in <module>
    html = urlopen("http://www.pythonscraping.com/pages/page1.html")
  File "C:\Python34\lib\urllib\request.py", line 153, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python34\lib\urllib\request.py", line 455, in open
    response = self._open(req, data)
  File "C:\Python34\lib\urllib\request.py", line 473, in _open
    '_open', req)
  File "C:\Python34\lib\urllib\request.py", line 433, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 1202, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "C:\Python34\lib\urllib\request.py", line 1176, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 11002] getaddrinfo failed>

Please help me with this.

  • The code looks fine to me and runs as expected when I try it locally. Where do you run it? Looks to me like a connection problem. – robsn Mar 30 '16 at 13:35
  • I ran it on my office desktop, and the connection seems okay, since I am able to browse. They also use a proxy server here; could that be the cause of this? – Cookie byte Mar 30 '16 at 15:42
  • That's very well possible. I have no experience using a proxy but [this answer](http://stackoverflow.com/a/3168244/279610) might help. – robsn Mar 30 '16 at 15:49
  • This can be made to work by using requests and passing the proxies. – Cookie byte May 23 '16 at 11:01
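
For anyone landing here with the same `getaddrinfo failed` error behind a proxy, here is a minimal sketch of the fix described in the comments. It uses only the standard library. The proxy address `http://proxy.example.com:8080` and the helper names `make_proxies` and `fetch_via_proxy` are assumptions made up for illustration; substitute your own proxy's host and port.

```python
from urllib.request import ProxyHandler, build_opener

# Hypothetical proxy address (an assumption); replace it with the
# host and port of the proxy server your network actually uses.
PROXY_URL = "http://proxy.example.com:8080"


def make_proxies(proxy_url):
    """Return a scheme-to-proxy mapping for HTTP and HTTPS traffic."""
    return {"http": proxy_url, "https": proxy_url}


def fetch_via_proxy(url, proxy_url=PROXY_URL, timeout=10):
    """Fetch url through the given proxy and return the raw response bytes."""
    # ProxyHandler routes requests for the listed schemes through the proxy,
    # so the proxy resolves the target host name instead of the local machine.
    opener = build_opener(ProxyHandler(make_proxies(proxy_url)))
    with opener.open(url, timeout=timeout) as response:
        return response.read()


# Example call (needs a reachable proxy, so it is left commented out):
# print(fetch_via_proxy("http://www.pythonscraping.com/pages/page1.html"))
```

The same mapping should also work with the requests library mentioned above, along the lines of `requests.get(url, proxies=make_proxies(PROXY_URL))`. If the proxy requires authentication, the credentials usually go into the proxy URL, but that depends on how the proxy is set up.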
