
I'm getting an SSLError when I try to scrape a website with BeautifulSoup4, although I'm only using urllib to open the link. I got it to work in Java with JSoup (a scraper for Java) by generating the certificate for the website I'm trying to scrape (from here). Is it possible to use that certificate in Python? The generated file was called jssecacerts, and I renamed it to cacerts.

This is the code I'm using:

import http.cookiejar
import urllib.request
from urllib.parse import urlencode

def open(url, postdata=None):
    if postdata is not None:
        postdata = urlencode(postdata)
        postdata = postdata.encode('utf-8')
    return browser.open(url, postdata).read()

def login(i):
    cj = http.cookiejar.CookieJar()
    global browser
    browser = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))

    POST = {'timezoneOffset': TIMEZONEOFFSET,
            'userid': USERID,
            'pwd': PWD,
            }

    open(LOGIN_URL, POST)

The error:

Process Process-1:
Traceback (most recent call last):
Process Process-2:
Traceback (most recent call last):
  File "C:\Python34\lib\urllib\request.py", line 1182, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "C:\Python34\lib\urllib\request.py", line 1182, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "C:\Python34\lib\http\client.py", line 1088, in request
    self._send_request(method, url, body, headers)
  File "C:\Python34\lib\http\client.py", line 1088, in request
    self._send_request(method, url, body, headers)
  File "C:\Python34\lib\http\client.py", line 1126, in _send_request
    self.endheaders(body)
  File "C:\Python34\lib\http\client.py", line 1126, in _send_request
    self.endheaders(body)
  File "C:\Python34\lib\http\client.py", line 1084, in endheaders
    self._send_output(message_body)
  File "C:\Python34\lib\http\client.py", line 922, in _send_output
    self.send(msg)
  File "C:\Python34\lib\http\client.py", line 1084, in endheaders
    self._send_output(message_body)
  File "C:\Python34\lib\http\client.py", line 857, in send
    self.connect()
  File "C:\Python34\lib\http\client.py", line 1231, in connect
    server_hostname=server_hostname)
  File "C:\Python34\lib\http\client.py", line 922, in _send_output
    self.send(msg)
  File "C:\Python34\lib\http\client.py", line 857, in send
    self.connect()
  File "C:\Python34\lib\http\client.py", line 1231, in connect
    server_hostname=server_hostname)
  File "C:\Python34\lib\ssl.py", line 365, in wrap_socket
    _context=self)
  File "C:\Python34\lib\ssl.py", line 583, in __init__
    self.do_handshake()
  File "C:\Python34\lib\ssl.py", line 810, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed     (_ssl.c:600)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python34\lib\ssl.py", line 365, in wrap_socket
    _context=self)
  File "C:\Python34\lib\ssl.py", line 583, in __init__
    self.do_handshake()
  File "C:\Python34\lib\ssl.py", line 810, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:600)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python34\lib\multiprocessing\process.py", line 254, in _bootstrap
    self.run()
  File "C:\Python34\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Python34\lib\multiprocessing\process.py", line 254, in _bootstrap
    self.run()
  File "C:\Python34\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "E:\ENG_CFR_PS_Map\eng_cfr_map.py", line 159, in main
    login(i)
  File "E:\ENG_CFR_PS_Map\eng_cfr_map.py", line 159, in main
    login(i)
  File "E:\ENG_CFR_PS_Map\eng_cfr_map.py", line 109, in login
    open(LOGIN_URL, POST)
  File "E:\ENG_CFR_PS_Map\eng_cfr_map.py", line 109, in login
    open(LOGIN_URL, POST)
  File "E:\ENG_CFR_PS_Map\eng_cfr_map.py", line 33, in open
    return browser.open(url, postdata).read()
  File "C:\Python34\lib\urllib\request.py", line 463, in open
    response = self._open(req, data)
  File "C:\Python34\lib\urllib\request.py", line 481, in _open
    '_open', req)
  File "E:\ENG_CFR_PS_Map\eng_cfr_map.py", line 33, in open
    return browser.open(url, postdata).read()
  File "C:\Python34\lib\urllib\request.py", line 441, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 463, in open
    response = self._open(req, data)
  File "C:\Python34\lib\urllib\request.py", line 1225, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "C:\Python34\lib\urllib\request.py", line 1184, in do_open
    raise URLError(err)
  File "C:\Python34\lib\urllib\request.py", line 481, in _open
    '_open', req)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED]     certificate verify failed (_ssl.c:600)>
  File "C:\Python34\lib\urllib\request.py", line 441, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 1225, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "C:\Python34\lib\urllib\request.py", line 1184, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED]     certificate verify failed (_ssl.c:600)>
silverAndroid
  • A stacktrace or the actual error message would be helpful. And just as a side note: it is generally considered bad practice to assign to method parameters inside said method. – Dave Aug 07 '15 at 20:17
  • Try following this: http://stackoverflow.com/a/28052583/1102395 – Samar Aug 07 '15 at 22:30

1 Answer


This can be proxy related. If you're behind a proxy, try setting the HTTP_PROXY and HTTPS_PROXY system environment variables. As you suggested, cacerts should also be modified as part of this.

The proxy value usually looks something like HTTP_PROXY=http://proxy.mycompany.com:80, and similarly for HTTPS_PROXY.
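On the Python side, a rough sketch of how this could look with urllib is below. It assumes the certificate has been exported to a PEM file first: as far as I know, Java's cacerts is a Java keystore that Python's ssl module can't read directly, so something like keytool -exportcert -rfc would be needed to get a PEM out of it. The names build_browser and detected_proxies, and the cafile path in the usage note, are made up for illustration.

```python
import http.cookiejar
import ssl
import urllib.request


def build_browser(cafile=None):
    """Return an opener that keeps cookies and verifies HTTPS against cafile.

    cafile is a hypothetical path to a PEM bundle exported from the Java
    keystore; None falls back to the system's default trust store.
    """
    context = ssl.create_default_context(cafile=cafile)
    cookie_jar = http.cookiejar.CookieJar()
    # build_opener also adds a default ProxyHandler, which reads
    # HTTP_PROXY / HTTPS_PROXY from the environment.
    return urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar),
        urllib.request.HTTPSHandler(context=context),
    )


def detected_proxies():
    """Return the proxy mapping urllib picked up from the system/environment."""
    return urllib.request.getproxies()
```

In the question's login(), browser = build_browser(cafile="cacerts.pem") would then replace the plain build_opener call, and printing detected_proxies() is a quick way to check whether the proxy variables are actually being seen.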

NikT