
The problem is simple. I have this little code here:

import requests
from bs4 import BeautifulSoup

url = requests.get("https://www.docenti.unina.it/#!/professor/47494f434f4e44414d4f5343415249454c4c4f4d5343474e4435344c36354634383143/avvisi")
soup = BeautifulSoup(url.content, "html.parser")  # Parse the returned HTML with bs4

print(url)

but if I try to run this I get this error (I apologize if this is a bit long):

Traceback (most recent call last):
  File "/home/simon/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/home/simon/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 381, in _make_request
    self._validate_conn(conn)
  File "/home/simon/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 978, in _validate_conn
    conn.connect()
  File "/home/simon/.local/lib/python3.8/site-packages/urllib3/connection.py", line 362, in connect
    self.sock = ssl_wrap_socket(
  File "/home/simon/.local/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 384, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/usr/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/simon/.local/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/home/simon/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "/home/simon/.local/lib/python3.8/site-packages/urllib3/util/retry.py", line 439, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.docenti.unina.it', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/simon/Linux_Storage/Projects/test_file/main.py", line 5, in <module>
    url = requests.get("https://www.docenti.unina.it/#!/professor/47494f434f4e44414d4f5343415249454c4c4f4d5343474e4435344c36354634383143/avvisi")
  File "/home/simon/.local/lib/python3.8/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/home/simon/.local/lib/python3.8/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/simon/.local/lib/python3.8/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/simon/.local/lib/python3.8/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/home/simon/.local/lib/python3.8/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.docenti.unina.it', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)')))

The real problem is that it only happens with this website. Google, YouTube, and every other site I tried just work and give me a <Response [200]>, but not this one. I read somewhere that this is a problem with the site's HTTPS certificate, but I'm not sure. I have already looked at many "solutions" here on Stack Overflow, but I can't find one that works for me. Any ideas?

Simon
    Does this answer your question? ["SSL: certificate\_verify\_failed" error when scraping https://www.thenewboston.com/](https://stackoverflow.com/questions/34503206/ssl-certificate-verify-failed-error-when-scraping-https-www-thenewboston-co) – ParthS007 Aug 25 '20 at 15:26

1 Answer


To get past this error you can disable certificate verification by passing verify=False to requests.get(). Be aware that this skips TLS certificate validation entirely, so only use it for sites you trust.

For example:

from bs4 import BeautifulSoup
import requests

url = requests.get("https://www.docenti.unina.it/#!/professor/47494f434f4e44414d4f5343415249454c4c4f4d5343474e4435344c36354634383143/avvisi", verify=False)
soup = BeautifulSoup(url.content, "html.parser")  # Parse the returned HTML with bs4

print(url.status_code)

Result:

200
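One side note: with verify=False, requests emits an InsecureRequestWarning on every call, and the connection is not protected against man-in-the-middle attacks. A minimal sketch of how to silence the warning, plus the safer alternative of keeping verification on with your own CA bundle (the bundle filename below is a placeholder, not a real file):

```python
import urllib3

# verify=False disables certificate validation entirely, so requests prints
# an InsecureRequestWarning on every such call. If you accept the risk,
# silence the warning explicitly:
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# A safer alternative is to keep verification enabled and pass a CA bundle
# containing the site's certificate chain, exported from your browser
# (the filename here is a placeholder):
# requests.get("https://www.docenti.unina.it/", verify="unina_chain.pem")
```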
Humayun Ahmad Rajib