
I'm using Requests to scrape webpages and have encountered, in a couple of instances, issues with the website's SSL certificate. I would like to implement logic whereby the first request is made with verify=True, but if there is an SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] then it retries with verify=False.

Here is my initial code; what I'm struggling with is catching the error and passing it to the retry function.

#IMPORTS NEEDED FOR THE RETRY LOGIC
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

#RETRY FUNCTION
def requests_retry_session(
        retries=5,
        backoff_factor=10,
        status_forcelist=(500, 502, 504),
        session=None,
):
    session = session or requests.Session()
    retry = Retry(
        total=retries,
        read=retries,
        connect=retries,
        backoff_factor=backoff_factor,
        status_forcelist=status_forcelist,
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    #IF SSLERROR set verify to false (currently unconditional; this is the part I can't get right)
    session.verify = False
    return session

#MAKE FIRST REQUEST (the function must be defined before this call)
r = requests_retry_session().get(url, headers=headers, timeout=10)
Piripozzo
  • Wrap the code inside a ``try:.. except:..`` block and retry by setting verify to ``False`` on error (sketched after these comments). – sushanth Jun 08 '20 at 08:34
  • From a security point of view it does not make any difference whether you fall back to an unverified connection or allow unverified connections in the first place. – Klaus D. Jun 08 '20 at 08:35
  • @KlausD. That's very interesting. I've been reading that you should use an unverified connection only if really necessary, so I was thinking of using it as a fallback instead of as the default. Why are they saying that if there is no difference? Thanks – Daniela Jun 08 '20 at 08:38
  • If the connection is attacked an SSLError will be the result. Falling back to an unverified connection then will allow the attack to be successful. – Klaus D. Jun 08 '20 at 08:40
  • Makes sense. Thanks @KlausD. – Daniela Jun 08 '20 at 08:43
  • Does this answer your question? [Max retries exceeded with URL in requests](https://stackoverflow.com/questions/23013220/max-retries-exceeded-with-url-in-requests) – rofrol Dec 05 '21 at 19:24
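Putting the two comment threads together, here is a minimal sketch of that fallback (the helper name get_with_ssl_fallback is mine; it assumes the question's requests_retry_session function with the unconditional session.verify = False line removed, so that the first attempt verifies the certificate):

import logging
from requests.exceptions import SSLError

log = logging.getLogger(__name__)

def get_with_ssl_fallback(session, url, **kwargs):
    try:
        # First attempt with certificate verification (requests' default)
        return session.get(url, **kwargs)
    except SSLError:
        # Per the security comment above: an attacker can force this
        # exact path, so leave a loud trace before retrying unverified
        log.warning("Certificate verification failed for %s; retrying with verify=False", url)
        return session.get(url, verify=False, **kwargs)

r = get_with_ssl_fallback(requests_retry_session(), url, headers=headers, timeout=10)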

1 Answer


Peter Bengtsson answered this in a comment on his site here:

from requests.exceptions import SSLError

# Uses the requests_retry_session() helper from the question
# (minus its unconditional session.verify = False line)
session = requests_retry_session()
try:
    # First attempt with certificate verification (requests' default)
    r = session.get(url)
except SSLError:
    # Verification failed, so retry the same request unverified
    r = session.get(url, verify=False)
Simon
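One follow-up worth noting: every request made with verify=False emits an InsecureRequestWarning. If you adopt the fallback anyway, the warning can be silenced through urllib3's disable_warnings helper (a sketch; whether to silence it is a judgment call, since the warning is the only remaining signal that verification was skipped):

import urllib3

# verify=False disables certificate checking entirely, so only
# silence this warning once the fallback is a conscious decision
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)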