I am using the Python requests get method to query the MediaWiki API, but the responses take a long time to arrive. The same requests return almost instantly in a web browser. I see the same problem when requesting google.com. Here is the sample code I am running in Python 3.5 on Windows 10:

import requests

response = requests.get("https://www.google.com")
response = requests.get("https://en.wikipedia.org/wiki/Main_Page")
response = requests.get("http://en.wikipedia.org/w/api.php?", params={'action': 'query', 'format': 'json', 'titles': 'Labor_mobility'})

However, I don't see this issue when retrieving other websites, such as:

response = requests.get("http://www.stackoverflow.com")
response = requests.get("https://www.python.org/")
1man
  • How long does the response take? – NendoTaka Jul 06 '16 at 01:27
  • Also python 3.5 is not officially supported by Python requests – NendoTaka Jul 06 '16 at 01:28
  • Did you try adding `stream=True` after the URL? Ex: `requests.get("https://www.google.com", stream=True)` – NendoTaka Jul 06 '16 at 01:30
  • @NendoTaka, Thank you for your responses. Usually they take less than 0.5 seconds, but with the issue I am facing now, each of them takes about 22 seconds! I also tried Python 2.7 and have the same issue. I also tried requests.get("https://www.google.com", stream=True). It did not change anything. – 1man Jul 06 '16 at 01:35
  • You could try `.head` instead of `.get` but it is not supported by all sites and it may not give you the information that you need. – NendoTaka Jul 06 '16 at 01:40
  • @NendoTaka, I don't think .head will help me, because I need the content of the json. – 1man Jul 06 '16 at 01:52
  • I don't know if it will help but try adding `verify=False` – NendoTaka Jul 06 '16 at 02:10
  • @NendoTaka, it did not help. Do you experience the same issue when requesting those URLs, or is it just me? – 1man Jul 06 '16 at 02:14
  • This is almost certainly a network issue unrelated to python or requests, also don't use verify=False – Padraic Cunningham Jul 06 '16 at 09:50
  • @Padraic Cunningham, Thank you for your response. If it is related to the network, how can I fix it? Why do I receive the responses immediately when I send the same requests via a browser? I actually ended up using a selenium WebDriver in the same Python script and it works very fast. – 1man Jul 06 '16 at 13:12
  • @user2521204 set the httplib debug flag and you will see exactly what is happening https://docs.python.org/2/library/httplib.html#httplib.HTTPConnection.set_debuglevel – Padraic Cunningham Jul 06 '16 at 18:54
  • I had the same problem; this worked for me: https://unix.stackexchange.com/questions/500286/ppa-addition-taking-too-long – Arthur G Aug 24 '19 at 14:47
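
The debug-flag suggestion in the comments links to the Python 2 httplib documentation; in Python 3 the same module is named http.client. The following is a minimal sketch of how that debugging could be switched on, not a definitive recipe. The helper name enable_http_debug is hypothetical, and the "urllib3" logger name assumes a requests version that uses the standalone urllib3 package.

```python
import http.client
import logging


def enable_http_debug():
    """Turn on verbose output for HTTP connections (hypothetical helper)."""
    # With debuglevel > 0, http.client prints the request and response
    # headers it sends and receives to stdout.
    http.client.HTTPConnection.debuglevel = 1

    # Assumption: requests delegates to the standalone urllib3 package,
    # which logs connection setup under the "urllib3" logger.
    logging.basicConfig(level=logging.DEBUG)
    logging.getLogger("urllib3").setLevel(logging.DEBUG)


# Usage (requires network access and the requests package):
#   import requests
#   enable_http_debug()
#   requests.get("https://en.wikipedia.org/wiki/Main_Page", timeout=30)
```

Watching where the output pauses (before the connection is opened, or after the request is sent) should help narrow down which stage accounts for the delay.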

2 Answers


This sounds like an issue with the underlying connection to those particular servers, because requests to other URLs work. A few possible causes come to mind:

  • The server might only allow specific user-agent strings

Try adding innocuous headers, e.g.: requests.get("https://www.example.com", headers={"User-Agent": "Mozilla/5.0 (X11; CrOS x86_64 12871.102.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.141 Safari/537.36"})

  • The server rate-limits you

Wait for a few minutes, then try again. If this solves your issue, you could slow down your code by adding time.sleep() to prevent being rate-limited again.
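
As a rough sketch of the time.sleep() idea, the helper below pauses between consecutive requests. The name fetch_all, the one-second default delay, and the injectable sleep parameter are all assumptions made for illustration; a suitable delay depends on the server's actual limits, which are not known here.

```python
import time


def fetch_all(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive calls.

    `fetch` is any callable taking a URL, e.g. requests.get.
    `sleep` is injectable so the pause can be replaced in tests.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results


# Usage (requires network access and the requests package):
#   import requests
#   pages = fetch_all(["https://www.example.com"], requests.get)
```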

  • IPv6 does not work, but IPv4 does

Verify by executing curl --ipv6 -v https://www.example.com. Then, compare to curl --ipv4 -v https://www.example.com. If the latter is significantly faster, you might have a problem with your IPv6 connection. Check here for possible solutions.
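
If curl is not available (for example on a stock Windows installation), a similar comparison can be approximated from Python with the standard socket module. This is only a sketch: time_connect is a hypothetical helper, it measures name resolution plus the TCP connect and nothing else (no TLS, no HTTP), and the 30-second default timeout is an arbitrary assumption.

```python
import socket
import time


def time_connect(host, port, family, timeout=30.0):
    """Resolve `host` for one address family and open a TCP connection.

    Returns (succeeded, elapsed_seconds). The elapsed time is reported
    for failures too, since a slow failure is what would cause a delay.
    """
    start = time.perf_counter()
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(infos[0][4])  # first resolved address
        succeeded = True
    except OSError:
        # Covers resolution errors, timeouts, and refused connections.
        succeeded = False
    return succeeded, time.perf_counter() - start


# Usage (requires network access):
#   print(time_connect("en.wikipedia.org", 443, socket.AF_INET))
#   print(time_connect("en.wikipedia.org", 443, socket.AF_INET6))
```

If the AF_INET6 call fails only after a long wait while the AF_INET call returns quickly, that would be consistent with a broken IPv6 route.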

Didn't solve your issue?

If that did not solve your issue, I have collected some other possible solutions here.

vauhochzett

Try these headers if the above didn't work:

requests.get("https://www.example.com", headers={"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"})
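
To check whether the header actually makes a difference, the call can be timed with and without it. The wrapper below is a small sketch; timed is a hypothetical helper, not part of requests, and it measures the wall-clock time of the whole call.

```python
import time


def timed(fetch, *args, **kwargs):
    """Call fetch(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fetch(*args, **kwargs)
    return result, time.perf_counter() - start


# Usage (requires network access and the requests package):
#   import requests
#   ua = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}  # shortened example value
#   _, plain = timed(requests.get, "https://www.example.com", timeout=60)
#   _, with_ua = timed(requests.get, "https://www.example.com", headers=ua, timeout=60)
#   print(plain, with_ua)
```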
abhijithvijayan