I spent a day understanding and fixing this issue, so I thought it would be nice to share my findings with everybody :-) Here are my results:
It is a common flaw in SSL server configurations to provide an incomplete chain of certificates, often omitting intermediate certificates. For instance, a site I was working with did not include the common DigiCert "intermediate" certificate "DigiCert TLS RSA SHA256 2020 CA1" in the server's response.
Because this configuration flaw is common, most but not all modern browsers implement a technique called "AIA Fetching" to fix this on the fly (see e.g. https://www.thesslstore.com/blog/aia-fetching/).
Python's ssl module does not support AIA Fetching and depends on the server sending a complete chain of certificates; otherwise it raises an exception like this:
SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1124)')))
There is an ongoing discussion about whether AIA Fetching should be added to Python, e.g. in this thread: https://bugs.python.org/issue18617#msg293894.
My impression is that this will remain an open issue for the foreseeable future.
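To tell this particular failure apart from other certificate problems (expired certificate, hostname mismatch), one option is to look at the OpenSSL verification code attached to the exception. The following is only a sketch: the helper name is_missing_issuer is made up for illustration, and it assumes Python 3.7 or later, where ssl.SSLCertVerificationError carries a verify_code attribute. OpenSSL uses code 20 for "unable to get local issuer certificate"; note that the same code also appears when the root certificate itself is missing from the trust store, so it is a hint rather than proof of a missing intermediate.

```python
import ssl

# OpenSSL's X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT_LOCALLY
UNABLE_TO_GET_ISSUER_CERT_LOCALLY = 20


def is_missing_issuer(exc):
    """Return True if exc looks like the 'unable to get local issuer certificate' failure."""
    if not isinstance(exc, ssl.SSLCertVerificationError):
        return False
    # verify_code is attached by the ssl module when verification fails;
    # getattr() keeps this safe for exceptions that lack the attribute.
    return getattr(exc, "verify_code", None) == UNABLE_TO_GET_ISSUER_CERT_LOCALLY
```

With urllib.request.urlopen(), the SSL error arrives wrapped in a urllib.error.URLError, so you would pass the URLError's reason attribute to the helper.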
Now, how can we fix that?
- Install certifi if you have not done so already, or update it:
pip install certifi
or
pip install certifi --upgrade
Many (but not all) Python modules can use the certificates from certifi, and certifi takes them from the Mozilla CA Certificate initiative (https://wiki.mozilla.org/CA). Basically, certifi creates a clean *.pem file from the Mozilla site and provides a lightweight Python interface for accessing that file.
- Download the missing certificate as a file in PEM format, e.g. from https://www.digicert.com/kb/digicert-root-certificates.htm, or from a trusted browser.
- Locate the certifi *.pem certificate file with
import certifi
print(certifi.where())
Note: I recommend first activating the virtual environment you want to use the certificate with (e.g. conda activate <envname>), because the file path differs from one environment to the next. If you apply this to your base environment, any flawed certificate will put the SSL mechanism for all your code at risk.
Example:
/Users/username/anaconda3/envs/environment_name/lib/python3.8/site-packages/certifi/cacert.pem
- Open that file in a plain text editor and insert the missing certificate at the beginning, right after the header, like so:
##
## Bundle of CA Root Certificates
##
...
-----BEGIN CERTIFICATE-----
+I2tIJLYrVJmuzHZ9bjPvXj1hJeRPG/cUJ9WIQDgLGB
Afr5yjK7tI4nhyfFK3TUqNaX3sNk+crOU6J
---> This is the additional certificate.
+I2tIJLYrVJmuzHZ9bjPvXj1hJeRPG/cUJ9WIQDgLGB
Afr5yjK7tI4nhyfFK3TUqNaX3sNk+crOU6J
-----END CERTIFICATE-----
It is important to include the begin and end markers.
- Save the file and you should be all set!
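If you would rather not edit the bundle by hand, the same step can be scripted. This is a minimal sketch, not part of certifi: the function name append_certificate is invented here, and unlike the manual steps above it appends the certificate at the end of the bundle instead of inserting it after the header (the position should not matter for verification). It checks for the begin and end markers and skips the write if the certificate is already in the file.

```python
from pathlib import Path

BEGIN = "-----BEGIN CERTIFICATE-----"
END = "-----END CERTIFICATE-----"


def append_certificate(bundle_path, cert_path):
    """Append the PEM certificate in cert_path to the bundle at bundle_path.

    Returns True if the bundle was changed, False if the certificate
    was already present.
    """
    cert = Path(cert_path).read_text(encoding="utf-8").strip()
    # Require exactly one certificate, including both markers.
    if not (cert.startswith(BEGIN) and cert.endswith(END) and cert.count(BEGIN) == 1):
        raise ValueError("expected a single PEM certificate with begin and end markers")

    bundle = Path(bundle_path)
    existing = bundle.read_text(encoding="utf-8")
    if cert in existing:
        return False  # already there, nothing to do

    # Make sure the new certificate starts on its own line.
    separator = "" if existing.endswith("\n") else "\n"
    bundle.write_text(existing + separator + cert + "\n", encoding="utf-8")
    return True


# Hypothetical usage (the PEM file name is only an example):
# import certifi
# append_certificate(certifi.where(), "intermediate.pem")
```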
You can test that it works with the following few lines:
# Python 3
import urllib.request
import certifi
import requests
URL = 'https://www.the_url_that_caused_the_trouble.org'
print('Trying urllib.request.urlopen().')
r = urllib.request.urlopen(URL)
print(f'urllib.request.urlopen\n================\n {r.read()[:80]}')
print('Trying requests.get().')
r = requests.get(URL)
print(f'requests.get()\n================\n {r.text[:80]}')
Note: The general SSL certificates used by OpenSSL itself (and therefore by code that does not go through certifi) might be located elsewhere, so you may have to apply the same approach there, e.g.:
/Users/username/anaconda3/envs/environment_name/ssl
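To find out where that other location is in your environment, you can ask the standard library's ssl module. The sketch below only wraps ssl.get_default_verify_paths(); the function name openssl_ca_locations is my own choice for illustration.

```python
import ssl


def openssl_ca_locations():
    """Return the CA file and directory that Python's ssl module uses by default."""
    paths = ssl.get_default_verify_paths()
    return {
        # Resolved locations (these honour SSL_CERT_FILE / SSL_CERT_DIR);
        # None if the file or directory does not exist.
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Default locations compiled into the OpenSSL build.
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
    }


if __name__ == "__main__":
    for name, value in openssl_ca_locations().items():
        print(f"{name}: {value}")
```

Run it inside the activated environment, since the result depends on which Python and OpenSSL build is active.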
Voila!
Notes:
- When you update certifi or create a new virtual environment, the changes will likely be lost, but I think that is actually good design, because it does not perpetuate a temporary security tweak to your entire system.
- Naturally, the process of downloading the certificate is a potential security risk - if that download is compromised, your entire SSL chain might be, too.
- The maintenance of certifi lags behind the Mozilla releases of certificates. If you want to use the most current version of the Mozilla CA bundles with certifi, you can use my script from https://github.com/mfhepp/update_certifi_certificates.
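Regarding the download risk mentioned in the notes above: one way to reduce it is to compare the SHA-256 fingerprint of the downloaded certificate with the fingerprint the CA publishes, ideally obtained through a second channel. The following is a rough sketch using only the standard library; the function names are invented for this example, and it assumes the file contains exactly one PEM certificate.

```python
import hashlib
import ssl


def pem_sha256_fingerprint(pem_text):
    """Return the SHA-256 fingerprint (lowercase hex) of a single PEM certificate."""
    # Fingerprints are computed over the DER (binary) form of the certificate.
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    return hashlib.sha256(der).hexdigest()


def matches_published(pem_text, published):
    """Compare against a published fingerprint, ignoring colons, spaces and case."""
    normalized = published.replace(":", "").replace(" ", "").lower()
    return pem_sha256_fingerprint(pem_text) == normalized


# Hypothetical usage (file name and fingerprint are placeholders):
# with open("intermediate.pem", encoding="ascii") as f:
#     print(matches_published(f.read(), "AA:BB:..."))
```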