I am scraping an Amazon page with requests. Even though I pass both timeout_connect and timeout_response, some of the threads occasionally get stuck for hours. The snippet is below:
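import requests

# url and headers are defined elsewhere in the script
timeout_connect = 10   # seconds allowed to establish the connection
timeout_response = 20  # seconds allowed for the server to respond
r = requests.get(url,
                 headers=headers,
                 timeout=(timeout_connect, timeout_response),
                 verify=False)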
The call does time out for some requests when it reaches this threshold and cannot connect. For a very small number of requests, however, the thread gets stuck because the call takes hours to return.
Am I missing something here? And if this can happen, is there a way to keep track of the requests call in that thread so that, if it runs longer than a set time, I can move on instead of waiting for it?
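To make the second question concrete, here is a minimal sketch of the kind of overall deadline I have in mind. As far as I understand, the read timeout in requests limits the wait for data on the socket rather than the total time of the call, so a server that keeps trickling bytes could keep a request alive well past it. The names call_with_deadline and DeadlineExceeded are hypothetical helpers of my own, not part of requests, and fast_call and slow_call are stand-ins for real requests so the example runs without a network.

```python
import threading
import time


class DeadlineExceeded(Exception):
    """Raised when the wrapped call does not finish within the deadline."""


def call_with_deadline(fn, deadline_seconds, *args, **kwargs):
    """Run fn(*args, **kwargs) in a daemon thread, waiting at most deadline_seconds.

    Hypothetical helper, not part of requests. On timeout the worker thread is
    abandoned, not killed: it keeps running in the background until fn returns.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = fn(*args, **kwargs)
        except BaseException as exc:  # hand any failure back to the caller
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(deadline_seconds)  # wait for the call, but only this long

    if thread.is_alive():
        raise DeadlineExceeded(f"call exceeded {deadline_seconds} s")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


# Stand-ins for a fast request and a stuck one (assumptions for the demo).
def fast_call():
    return "ok"


def slow_call():
    time.sleep(2)
    return "too late"


print(call_with_deadline(fast_call, 1.0))
try:
    call_with_deadline(slow_call, 0.2)
except DeadlineExceeded as exc:
    print("gave up:", exc)
```

With the real call, this would presumably look like call_with_deadline(requests.get, 60, url, headers=headers, timeout=(timeout_connect, timeout_response), verify=False), where 60 is an arbitrary overall limit. The obvious drawback is that the abandoned thread and its socket stay alive until the request eventually returns, so this lets the caller move on but does not actually cancel the request.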