0

I'm trying to use urllib.request.urlopen on a website starting with "https". The error output is: ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed

There are many great threads which cover this error, including this one, which mentions the SSL Labs rating. I am able to use urllib.request.urlopen on every other "https" site I have tested.

SSL Labs shows the following output:

Key                        RSA 2048 bits (e 65537)
Issuer                     Let's Encrypt Authority X3
AIA:                       http://cert.int-x3.letsencrypt.org/
Signature algorithm        SHA256withRSA
Extended Validation        No
Certificate Transparency   No
OCSP Must Staple           No
Revocation information     OCSP 
Revocation status          Good (not revoked)
DNS CAA                    No
Trusted                    Yes

To clarify, my question is: is there a solution for completing the handshake, that doesn't include bypassing the certificate verification? And if there is a solution, can it be solved entirely inside a python script on linux, macOS and Windows?

jww
Bob
  • Please provide the server name or URL you are using. Why would you omit critical information? This is probably your duplicate: [Add SSL CA File Using urllib2](https://stackoverflow.com/q/30903544/608639). This is how to troubleshoot it: [CERTIFICATE_VERIFY_FAILED when using urllib to connect to almerys.com](https://stackoverflow.com/a/45339280/608639). – jww Jul 27 '17 at 20:38
  • If the browser (client library, etc) already has the missing piece it can finish the handshake, but that means you're reducing the number of clients who can talk to your server. Anyhow -- are you asking about working around this from the client side (as by adding the missing intermediate CA cert to the client's store), or about fixing the server (as you really should, if it's under your / your organization's control)? – Charles Duffy Jul 27 '17 at 20:48
  • @jww I omitted that _critical information_ because I didn't think it was critical. I figured this was a generic issue that others might have. I'm not sure how my reasoning and thought-process is relevant to solving the issue, though. The first sentence in your comment stands well on its own without the second sentence. Thank you for the second link you provided, though. I was able to download the .pem file from Let's Encrypt and that solved the issue. I have detailed the steps in [this solution](https://stackoverflow.com/a/45362591/7707659) – Bob Jul 27 '17 at 23:48
  • (I'd argue the primary importance of a URL, btw, is that it allows folks to actually test whether solutions work in your real-world use case; without a test case, an attempted solution is guesswork -- neither the person writing it nor another site user trying to decide whether to vote it up or down can determine with 100% certainty whether it's *actually correct* for the real-world scenario encountered). – Charles Duffy Jul 28 '17 at 02:45
  • (...the "missing intermediate certificate" interpretation of the problem is definitely the most obvious thing that fits with the information given, but I'm not going to assert that it's the only possible cause of the error in question). – Charles Duffy Jul 28 '17 at 02:46

3 Answers

2

I cannot answer this question for urllib, but I was able to overcome this problem using Python's requests library instead. Note that this will only work if there is a trusted certificate chain for the website in question but the server is failing to send part of it, typically an intermediate certificate.

Run the SSL Labs server test (linked here) and scroll down to "Certification Paths". If there are trusted certification paths but the server is for some reason not providing the complete chain, you can download the full trusted path here as text. Copy the full certificate chain, save it as a .pem file, and pass the path of this file to the requests function:

import requests
r = requests.get(url, verify="path/to/chain.pem")

The requests module can throw all sorts of SSL-related certificate failures, many of which are server-side problems, and you really want to avoid disabling SSL verification. This solution is only for the somewhat rare case where a full certificate chain exists but the issuer or server in question has sloppily omitted either a root or intermediate cert.
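
Before passing the saved file to requests, it can help to confirm that it really contains more than one certificate. The following is only a minimal sketch using the standard library; the helper names `split_pem_chain` and `count_certs_in_file` are hypothetical and not part of requests or ssl. It checks only the PEM framing and does not validate the certificates themselves.

```python
import re

# Matches one PEM-encoded certificate block, including its BEGIN/END lines.
_PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_chain(pem_text):
    """Return the certificate blocks found in pem_text, in file order."""
    return _PEM_CERT.findall(pem_text)


def count_certs_in_file(path):
    """Count the PEM certificate blocks in the file at path."""
    with open(path, encoding="ascii") as handle:
        return len(split_pem_chain(handle.read()))
```

A chain file that was copied correctly would be expected to give a count of two or more (the site certificate plus at least one issuer); a count of one suggests that only the site certificate was saved.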

1

You can work around this by adding your missing intermediate certificate to the certificates trusted by the SSL context you pass to urlopen:

import ssl
import urllib.request

# fill this out depending on which specific intermediate cert you're missing
cert_text = '''
-----BEGIN CERTIFICATE-----
...put your actual certificate text here...
-----END CERTIFICATE-----
'''

context = ssl.create_default_context()           # load default trusted certificates
context.load_verify_locations(cadata=cert_text)  # add your missing cert (PEM text) to them

urllib.request.urlopen(site, context=context)

Note that if you only need to talk to the one server for which you're doing this, you could just pass an appropriate cafile or capath argument to create_default_context().
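
A minimal sketch of that single-server variant follows, assuming the chain has been saved to a PEM file; the function names `context_from_cafile` and `fetch` are hypothetical. When `cafile` is given, `ssl.create_default_context()` does not load the system's default CA certificates, so the resulting context trusts only what is in that file. Depending on the Python and OpenSSL versions, the file may need to contain the root certificate as well as the intermediate.

```python
import ssl
import urllib.request


def context_from_cafile(cafile):
    """Build a verifying context that trusts only the CAs in cafile."""
    # Raises FileNotFoundError if cafile does not exist.
    return ssl.create_default_context(cafile=cafile)


def fetch(url, cafile):
    """Fetch url, verifying the server against the CAs in cafile."""
    context = context_from_cafile(cafile)
    with urllib.request.urlopen(url, context=context) as response:
        return response.read()
```

Certificate verification and hostname checking both stay enabled here; only the set of trusted CAs changes.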

Charles Duffy
  • @Charles Duffy I just read your response to the quasi-solution I posted. I'm quite new to this so I don't quite understand what you have posted. I'm very interested in your solution. What should replace the `...` in `missing_cert = OpenSSL.crypto.load_publickey(...)`? – Bob Jul 28 '17 at 02:37
  • That depends on how the copy of the key you downloaded and are bundling into the source is encoded. If it's PEM (which is usual), your first argument would be `OpenSSL.crypto.FILETYPE_PEM`; regardless, the second argument would be the buffer where you have the (in that case, PEM-encoded) certificate. – Charles Duffy Jul 28 '17 at 02:40
  • @Charles Duffy I think I understand. I would download the .pem from [https://letsencrypt.org/certs/lets-encrypt-x3-cross-signed.pem](https://letsencrypt.org/certs/lets-encrypt-x3-cross-signed.pem). Then include this file in the GitHub repo root directory. Then, in the Python script, I would include `missing_cert = OpenSSL.crypto.load_certificate(OpenSSL.crypto.FILETYPE_PEM, PATH_TO_PEM_FILE_HERE)` followed by the additional 4 lines you wrote? – Bob Jul 28 '17 at 02:50
  • Not the path to the PEM file, but the text of the PEM file -- you could put the PEM file's contents straight in your Python source as a string, as in the example above (`BEGIN CERTIFICATE` and `END CERTIFICATE` are both things you can find in the PEM file itself). – Charles Duffy Jul 28 '17 at 02:55
  • @Charles Duffy I added the line `import ssl, OpenSSL`. But now I am receiving the error `AttributeError: 'SSLContext' object has no attribute 'get_cert_store'` at line `store = context.get_cert_store()`. I googled the error, but nothing is coming up. Any ideas? – Bob Jul 28 '17 at 03:26
  • I found something that is working, but I'm unsure if it will have any unforeseen consequences. I downloaded the .pem file to my working directory. The only 3 lines are `context = ssl.create_default_context()` followed by `context.load_verify_locations(cafile='/path/to/filename.pem')` and `page = urllib.request.urlopen(site_url_here, context=context).read()` – Bob Jul 28 '17 at 06:29
  • That'll be fine so long as you don't need to connect to any servers *not* using the intermediate certificate in question. – Charles Duffy Jul 28 '17 at 16:56
0

I was able to solve this issue on a Debian-based system (I'm running Debian 9). I still need to test solutions on macOS and Windows.

On the SSL Labs report, under the "Certification Paths" header, it showed:

1   Sent by server  www.exampleSITE.com

BLAH BLAH BLAH

2   Extra download  Let's Encrypt Authority X3

BLAH BLAH BLAH

3   In trust store  DST Root CA X3   Self-signed

BLAH BLAH BLAH

I navigated to /etc/ssl/certs/ and noticed there were no Let's Encrypt certificates present. I then downloaded the .pem file and rehashed the directory.

cd /etc/ssl/certs
sudo wget https://letsencrypt.org/certs/lets-encrypt-x3-cross-signed.pem
sudo c_rehash

Then I tested the Python line that was giving me an error earlier:

page = urllib.request.urlopen('https://www.exampleSITE.com').read()

and it successfully retrieved the page.
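
To check which CA file and directory a given Python build consults by default (rather than assuming /etc/ssl/certs), the ssl module can report its default verify paths. This is just a diagnostic sketch; `describe_ca_locations` is a hypothetical helper name, and on Windows the reported paths may not be meaningful.

```python
import ssl


def describe_ca_locations():
    """Return the default CA file and directory paths known to the ssl module."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,                  # resolved CA bundle, or None if missing
        "capath": paths.capath,                  # resolved CA directory, or None if missing
        "openssl_cafile": paths.openssl_cafile,  # compiled-in default file path
        "openssl_capath": paths.openssl_capath,  # compiled-in default directory path
    }


if __name__ == "__main__":
    for name, value in describe_ca_locations().items():
        print(f"{name}: {value}")
```

If `capath` points somewhere other than /etc/ssl/certs, the downloaded .pem file would presumably need to go in that directory instead before rehashing.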

Bob
  • @Charles Duffy To clarify: I am writing a python script that opens and reads from a website (not mine) and parses some information to be used later in the script. I could access that website from firefox, but not when using urllib in my python script. I can ask users that run my script to download the Let's Encrypt .pem file before running it. – Bob Jul 28 '17 at 00:20
  • Gotcha. If you used my answer, you wouldn't need to ask your users to modify anything in `/etc`, since you'd be embedding the missing certificate directly inside the script that needs it. – Charles Duffy Jul 28 '17 at 02:26