
I am attempting to scrape text from websites using the requests module.

With the following code (using Facebook as an example):

requests.get('http://facebook.com')

I receive back the following error:

SSLError: HTTPSConnectionPool(host='facebook.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])")))

I have tried the following with no luck:

pip install certifi
pip install certifi_win32

Any help would be greatly appreciated! Thank you!

3 Answers


You can try disabling certificate verification:

import requests
from urllib3.exceptions import InsecureRequestWarning
from urllib3 import disable_warnings

disable_warnings(InsecureRequestWarning)

page = requests.get('http://facebook.com', verify=False)

print(page.content)

Note that verify=False turns off TLS certificate verification entirely, so only use it when you don't need the connection to be trustworthy.

The problem probably stems from an overly aggressive security measure that you might be able to fix in two steps:

  1. Download the raw CA Bundle from https://certifiio.readthedocs.io/en/latest/
  2. In requests.get use verify=[path to raw CA Bundle]
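The two steps above can be sketched as follows, assuming the certifi package is installed (it ships the same Mozilla-derived CA bundle that the certifiio site documents):

```python
import certifi
import requests

# certifi.where() returns the path to its bundled CA file (cacert.pem).
# You could equally pass the path of a bundle you downloaded yourself.
ca_bundle = certifi.where()
print(ca_bundle)

# Passing the bundle explicitly means requests no longer depends on a
# broken system-level certificate store:
# page = requests.get('https://facebook.com', verify=ca_bundle)
```
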

I had the same problem while scraping a website. I tried setting verify to False and I also tried a CA certificate; neither worked. Reading through the docs, I found Session Objects.

I was making several requests to the same host, which is why I was getting "Max retries exceeded with url".

In your case, you can try this:

s = requests.Session()
response = s.get('http://facebook.com')
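If the "Max retries exceeded" part of the error persists, you can also give the session an explicit retry policy. This is a sketch using urllib3's Retry helper; the retry counts and status codes here are illustrative choices, not part of the original answer:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()

# Retry transient failures up to 3 times with exponential backoff,
# and also retry on common transient server error codes.
retry = Retry(total=3, backoff_factor=0.5,
              status_forcelist=[500, 502, 503, 504])
adapter = HTTPAdapter(max_retries=retry)
session.mount('https://', adapter)
session.mount('http://', adapter)

# response = session.get('https://facebook.com')
```

Reusing one session also keeps the underlying TCP connection alive between requests, which is what avoids exhausting the connection pool against a single host.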