
I'm trying to connect to a website with the code below. It works without problems on Heroku, but I get an error on DigitalOcean.

Code:

import requests

headers = {"User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"}

def web():
    # Fetch the page and print the response object, e.g. <Response [200]>
    rq = requests.get("https://myurl.com/gts?search=word", headers=headers)
    print(rq)

The error I get on DigitalOcean:

HTTPSConnectionPool(host='myurl.com', port=443): Max retries exceeded with url: /gts?search=word (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fa364a89690>, 'Connection to myurl.com timed out. (connect timeout=None)'))

What I've tried without success:

I added `verify=False` to the request and disabled the firewall from the console. I also tried to access the site from the console with `curl -I myurl.com`. All of these failed.

Thank you for your help.

Emir
  • Does this answer your question? [Max retries exceeded with URL in requests](https://stackoverflow.com/questions/23013220/max-retries-exceeded-with-url-in-requests) – sahasrara62 Feb 17 '23 at 20:33
  • Isn't this caused by the `myurl.com` web server's firewall? The DigitalOcean IP may simply be blocked by it. Without the exact URL, this issue is difficult to debug. Also, it's not clear whether `myurl.com` is your website or not. – FN_ Feb 17 '23 at 20:37
  • @sahasrara62 I saw this thread but it didn't help. – Emir Feb 17 '23 at 20:43
  • @FN_ This happens only on sozluk.gov.tr. Myurl.com is actually sozluk.gov.tr. – Emir Feb 17 '23 at 20:43
  • @Emir I tried running `curl` and your code snippet on a DigitalOcean-hosted machine. Both worked fine. Your code is valid, so it seems you had bad luck and your IP is blocked. – FN_ Feb 17 '23 at 21:17

1 Answer


It seems the issue is not caused by a bug in your code, but rather by a firewall (or other rules) on the remote server side (sozluk.gov.tr).

I ran `curl -I https://sozluk.gov.tr` and your snippet both locally and on a remote (DigitalOcean-hosted) VM; both worked fine.

You mentioned that the `curl` command from the console (I assume the remote VM console) did not work. This indicates that the issue is at the network level rather than in the code.
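To check the network level from Python without involving `requests` at all, you can try to open a plain TCP connection to port 443. The following is only a minimal sketch: the function name `can_connect` and the 5-second default timeout are my own choices, not anything from your code.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

Calling `can_connect("sozluk.gov.tr")` on the droplet and on a machine where the site works should give different results if the droplet's IP is being dropped before any HTTP traffic is exchanged.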


I recommend spinning up a new VM in a different region (to get an IP from a different IP pool) or using a proxy that is not blocked by the remote server. You can check available regions (datacenters) on this link.


Responses

user@do-server:~$ curl -I https://sozluk.gov.tr
HTTP/1.1 200 OK
Server: nginx
Date: Sat, 18 Feb 2023 11:18:37 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 108975
Last-Modified: Fri, 13 Jan 2023 12:38:15 GMT
Connection: keep-alive
Vary: Accept-Encoding
ETag: "63c150b7-1a9af"
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, OPTIONS
Accept-Ranges: bytes
user@do-server:~$ python3
Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> headers = {"User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"}
>>> requests.get("https://sozluk.gov.tr/gts?search=word", headers = headers)
<Response [200]>
FN_