I'm trying to download a PDF from the following address: https://www.sce.com/NR/sc3/tm2/pdf/ce12-12.pdf

I've been using the requests library like this:

import requests

url = 'https://www.sce.com/NR/sc3/tm2/pdf/ce12-12.pdf'
site = requests.get(url)
with open('outfile.pdf', 'wb') as outfile:
    outfile.write(site.content)

This week, I started receiving the following SSL error:

HTTPSConnectionPool(host='www1.sce.com', port=443): Max retries exceeded with url: /NR/sc3/tm2/pdf/ce12-12.pdf (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)'),))

I've noticed that when I download the PDF in Chrome (via Dev Tools network monitoring), the GET request in the web browser returns code 302 and then makes a different GET request to the following url:

https://www1.sce.com/NR/sc3/tm2/pdf/ce12-12.pdf

If I tweak the code above and make a GET request with that URL, I receive the same SSL errors.

I've used this same code to download different PDFs from a different site, without errors:

url2 = 'https://www.pge.com/tariffs/assets/pdf/tariffbook/ELEC_SCHEDS_A-1.pdf'
site2 = requests.get(url2)
with open('outfile2.pdf','wb') as outfile:
    outfile.write(site2.content)

I updated SSL on my computer and recompiled Python to use the updated SSL configuration. I get the following output when I run this in Python:

import ssl
print(ssl.OPENSSL_VERSION)
>>>OpenSSL 1.0.2h  3 May 2016

I can get the first URL to download if I specify "verify=False" in the requests command, but I'd prefer not to do that.

Does anyone have any insight as to what's going on? I think my SSL is working correctly in general, because the second HTTPS request goes through without an issue. I think there is something else going on specific to the way the first URL deals with requests, but I'm not exactly sure. The only discernible difference I've seen between url and url2 is that url has a cookie in the request header. There are no cookies in the request header for url2.

I'm running Python 3.5.2 and version 2.20.1 of the requests library.

Thanks in advance.

  • The server [does not provide](https://www.ssllabs.com/ssltest/analyze.html?d=www1.sce.com) a proper certificate chain. You will have to add the intermediate certificate(s) to your trust store. How that happens depends on your OS. – Klaus D. Nov 20 '18 at 02:38
  • The server is broken - incomplete trust chain. You can get the missing chain certificate from [here](https://censys.io/certificates/154c433c491929c5ef686e838e323664a00e6a0d822ccc958fb4dab03e49a08f). Download it as PEM and add it to your trust store as described in the answers to the other questions. – Steffen Ullrich Nov 20 '18 at 05:23
  • Thanks for the link Steffen! I downloaded the pem certificate and added it to the end of the certificates file that the certifi Python library is using. That fixed my problem! – RainbowSchubert Nov 20 '18 at 19:09
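
The fix described in the last comment (appending the downloaded PEM to the bundle that certifi ships) can be sketched roughly as below. This is only an illustration, not the exact steps the commenter ran: the helper name `build_ca_bundle` and the file names used later are hypothetical, and the sketch writes a separate combined bundle rather than editing certifi's file in place.

```python
from pathlib import Path

PEM_MARKER = b"-----BEGIN CERTIFICATE-----"


def build_ca_bundle(base_bundle: Path, extra_pem: Path, out_path: Path) -> Path:
    """Write a CA bundle made of base_bundle followed by extra_pem.

    Hypothetical helper: base_bundle would typically be the file that
    certifi.where() points to, and extra_pem the missing intermediate
    certificate downloaded in PEM format.
    """
    base = base_bundle.read_bytes()
    extra = extra_pem.read_bytes()

    # Refuse input that is clearly not PEM (for example a DER download).
    if PEM_MARKER not in extra:
        raise ValueError(f"{extra_pem} does not look like a PEM certificate")

    # Make sure each part ends with a newline so PEM blocks stay separate.
    if not base.endswith(b"\n"):
        base += b"\n"
    if not extra.endswith(b"\n"):
        extra += b"\n"

    out_path.write_bytes(base + extra)
    return out_path
```

Assuming certifi and requests are installed and the intermediate certificate was saved as `intermediate.pem` (a made-up name), this could be called as `build_ca_bundle(Path(certifi.where()), Path('intermediate.pem'), Path('combined.pem'))`, and the result passed to requests with `requests.get(url, verify='combined.pem')`. Keeping the combined bundle in a separate file means it is not overwritten when certifi is upgraded, and verification stays enabled.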
