I'm trying to scrape data from crunchbase.com using the Python requests library.
Whenever I try to get the page source with either requests or urllib, I get the SSL error below.
import requests
requests.get('https://www.crunchbase.com/#/home/index')
Traceback (most recent call last):
File "C:\Program Files (x86)\Python36-32\lib\site-packages\urllib3\connectionpool.py", line 599, in urlopen
body=body, headers=headers)
File "C:\Program Files (x86)\Python36-32\lib\site-packages\urllib3\connectionpool.py", line 345, in _make_request
self._validate_conn(conn)
File "C:\Program Files (x86)\Python36-32\lib\site-packages\urllib3\connectionpool.py", line 843, in _validate_conn
conn.connect()
File "C:\Program Files (x86)\Python36-32\lib\site-packages\urllib3\connection.py", line 326, in connect
ssl_context=context)
File "C:\Program Files (x86)\Python36-32\lib\site-packages\urllib3\util\ssl_.py", line 325, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "C:\Program Files (x86)\Python36-32\lib\ssl.py", line 401, in wrap_socket
_context=self, _session=session)
File "C:\Program Files (x86)\Python36-32\lib\ssl.py", line 808, in __init__
self.do_handshake()
File "C:\Program Files (x86)\Python36-32\lib\ssl.py", line 1061, in do_handshake
self._sslobj.do_handshake()
File "C:\Program Files (x86)\Python36-32\lib\ssl.py", line 683, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:748)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
requests.get('https://www.crunchbase.com/#/home/index')
File "C:\Program Files (x86)\Python36-32\lib\site-packages\requests\api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "C:\Program Files (x86)\Python36-32\lib\site-packages\requests\api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Program Files (x86)\Python36-32\lib\site-packages\requests\sessions.py", line 502, in request
resp = self.send(prep, **send_kwargs)
File "C:\Program Files (x86)\Python36-32\lib\site-packages\requests\sessions.py", line 612, in send
r = adapter.send(request, **kwargs)
File "C:\Program Files (x86)\Python36-32\lib\site-packages\requests\adapters.py", line 440, in send
timeout=timeout
File "C:\Program Files (x86)\Python36-32\lib\site-packages\urllib3\connectionpool.py", line 624, in urlopen
except (BaseSSLError, CertificateError) as e:
NameError: name 'CertificateError' is not defined
I have also tried the options the requests library offers, such as verify=False and pointing verify at a cacert.pem file, along with several other approaches, and none of them worked.
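For reference, the two verification settings mentioned above can be reproduced with only the standard library, which helps separate a certificate problem from a problem in the installed requests/urllib3 packages. This is a minimal sketch, not the exact code that was run: `build_context` and `fetch` are hypothetical helper names, the `User-Agent` header is an assumption, and `ca_bundle` stands for whatever cacert.pem path is being tested.

```python
import ssl
import urllib.request


def build_context(ca_bundle=None, verify=True):
    """Return an SSLContext that mirrors the verify options tried with requests.

    build_context and its parameters are illustrative names, not a library API.
    """
    if not verify:
        # Roughly what verify=False does in requests: skip all certificate checks.
        # Useful only for debugging; it removes protection against interception.
        ctx = ssl.create_default_context()
        ctx.check_hostname = False  # must be disabled before setting CERT_NONE
        ctx.verify_mode = ssl.CERT_NONE
        return ctx
    # cafile=None loads the system trust store; a path loads that PEM bundle instead.
    return ssl.create_default_context(cafile=ca_bundle)


def fetch(url, ctx, timeout=10):
    """Fetch a URL with the given SSLContext and return (status, body bytes)."""
    # The User-Agent value is an assumption; some sites reject the default one.
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, context=ctx, timeout=timeout) as resp:
        return resp.status, resp.read()
```

Calling `fetch('https://www.crunchbase.com/', build_context())` and then the same call with `build_context(verify=False)` should show whether the failure depends on certificate verification at all. If the verified call fails here too, the trust store or something intercepting the connection is the more likely cause than requests itself.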
I also tried scraping the same page with Selenium, but it returns only metadata that contains no useful information.
Selenium also reports an insecure connection while scraping.
I'm new to scraping with Python. The API only exposes limited information, so any help would be appreciated.