294

I'm trying to get the content of App Store > Business:

import requests
from lxml import html

page = requests.get("https://itunes.apple.com/in/genre/ios-business/id6000?mt=8")
tree = html.fromstring(page.text)

flist = []
plist = []
for i in range(0, 100):
    app = tree.xpath("//div[@class='column first']/ul/li/a/@href")
    ap = app[0]
    page1 = requests.get(ap)

When I try the range with (0, 2) it works, but when I put the range in the 100s it shows this error:

Traceback (most recent call last):
  File "/home/preetham/Desktop/eg.py", line 17, in <module>
    page1 = requests.get(ap)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 383, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 486, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 378, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='itunes.apple.com', port=443): Max retries exceeded with url: /in/app/adobe-reader/id469337564?mt=8 (Caused by <class 'socket.gaierror'>: [Errno -2] Name or service not known)
daaawx
user3446000
  • Shouldn't you use the `i` variable somewhere in the `for`? – Laurent S. Apr 11 '14 at 12:58
  • you are like requesting the same app a 100 times. what is that for ? – njzk2 Apr 11 '14 at 13:00
  • I am using i in the rest of the code. I have not posted the entire code – user3446000 Apr 11 '14 at 13:06
  • I am not requesting for the same app 100 times. I am requesting for 100 different apps under the same category. – user3446000 Apr 11 '14 at 13:07
  • Looks like your DNS resolver is unable to resolve `itunes.apple.com`. Can you run `dig itunes.apple.com` at your command line and post the results here? – Thomas Orozco Apr 11 '14 at 13:11
  • This isn't a `requests` problem; your error message states your DNS server isn't able to resolve the name `itunes.apple.com`. – Martijn Pieters Apr 11 '14 at 13:17
  • ; <<>> DiG 9.8.1-P1 <<>> itunes.apple.com ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59154 ;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 8, ADDITIONAL: 8 ;; QUESTION SECTION: ;itunes.apple.com. IN A ;; ANSWER SECTION: itunes.apple.com. 5 IN CNAME itunes-cdn.apple.com.akadns.net. itunes-cdn.apple.com.akadns.net. 5 IN CNAME itunes.apple.com.edgekey.net. itunes.apple.com.edgekey.net. 5 IN CNAME e673.e9.akamaiedge.net. e673.e9.akamaiedge.net. 5 IN A 23.58.18.217 – user3446000 Apr 11 '14 at 13:20
  • ;; AUTHORITY SECTION: e9.akamaiedge.net. 5 IN NS a1e9.akamaiedge.net. e9.akamaiedge.net. 5 IN NS n0e9.akamaiedge.net. e9.akamaiedge.net. 5 IN NS n1e9.akamaiedge.net. e9.akamaiedge.net. 5 IN NS n2e9.akamaiedge.net. e9.akamaiedge.net. 5 IN NS n3e9.akamaiedge.net. e9.akamaiedge.net. 5 IN NS n4e9.akamaiedge.net. e9.akamaiedge.net. 5 IN NS n5e9.akamaiedge.net. e9.akamaiedge.net. 5 IN NS a0e9.akamaiedge.net. – user3446000 Apr 11 '14 at 13:21
  • ;; ADDITIONAL SECTION: a0e9.akamaiedge.net. 5 IN AAAA 2a02:26f0:32:f000:f508:4182:8bda:dd8a a1e9.akamaiedge.net. 5 IN AAAA 2600:1417:11:f000:9207:4182:8bda:dd8a n0e9.akamaiedge.net. 5 IN A 88.221.81.194 n1e9.akamaiedge.net. 5 IN A 61.213.146.7 n2e9.akamaiedge.net. 5 IN A 61.213.146.9 n3e9.akamaiedge.net. 5 IN A 88.221.81.195 n4e9.akamaiedge.net. 5 IN A 88.221.81.192 n5e9.akamaiedge.net. 5 IN A 88.221.81.192 ;; Query time: 3594 msec ;; SERVER: 127.0.0.1#53(127.0.0.1) ;; WHEN: Fri Apr 11 18:47:21 2014 ;; MSG SIZE rcvd: 471 – user3446000 Apr 11 '14 at 13:21
  • when I try the range with (0,2) it works but when I put the range in 100's it shows this error. if the error was with resolving itunes.apple.com then it wouldn't work for range(0,2) – user3446000 Apr 11 '14 at 13:33
  • I had that error, I solved it by changing the 9150 port to 9050 – JinSnow Nov 11 '17 at 21:15

19 Answers

279

Just use requests features:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


session = requests.Session()
retry = Retry(connect=3, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)

session.get(url)

This will GET the URL and retry up to 3 times on requests.exceptions.ConnectionError. backoff_factor applies an increasing delay between attempts, which helps avoid failing again when the cause is a periodic request quota.

Take a look at urllib3.util.retry.Retry; it has many options to simplify retries.
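The Retry class takes more knobs than connect and backoff_factor; here is a sketch of the commonly used options (the values are illustrative, not recommendations):

```python
from urllib3.util.retry import Retry

retry = Retry(
    total=5,                                # overall cap across all retry types
    connect=3,                              # retries for connection errors
    read=2,                                 # retries after the request was sent
    backoff_factor=0.5,                     # grows the sleep between attempts
    status_forcelist=(500, 502, 503, 504),  # also retry on these HTTP statuses
)
print(retry.total, retry.connect)  # 5 3
```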

Nikola Kirincic
Zulu
  • For whatever reason, this doesn't work on windows 10. Started the shell with `python manage.py shell` and am using `session.get('http://localhost:8000/api/')`. Any help? @Zulu – MwamiTovi Nov 23 '19 at 10:49
  • got my issue sorted. Had forgotten to start the `dev-server` and keep it running first. – MwamiTovi Nov 23 '19 at 11:11
  • I tried this but it would not retry while I got requests.exceptions.ConnectionError Read timed out. but I set a timeout for the get request. – Zagfai Apr 24 '20 at 14:09
  • This has been attempting to run for over 5 minutes. Does it ever time out and report a failure? Or is the ```max_retries=retry``` code meaning never stop trying? – Nick Nov 13 '21 at 03:17
  • @Nick As specified in the doc, there is the arg `total` which specifies the number of attempts. – Zulu Nov 13 '21 at 15:37
  • Use the `Retry` from `requests` module, Have a look at this answer https://stackoverflow.com/a/35504626 – Akshay Chandran Aug 22 '22 at 05:15
210

What happened here is that the iTunes server refused your connection (you're sending too many requests from the same IP address in a short period of time):

Max retries exceeded with url: /in/app/adobe-reader/id469337564?mt=8

The error trace is misleading; it should say something like "No connection could be made because the target machine actively refused it".

There is an issue about the python.requests lib on GitHub, check it out here.

To overcome this issue (it's not so much an issue as a misleading debug trace), you should catch connection-related exceptions like so:

try:
    page1 = requests.get(ap)
except requests.exceptions.ConnectionError:
    r.status_code = "Connection refused"

Another way to overcome this problem is to leave a big enough time gap between requests to the server. This can be achieved with the sleep(timeinsec) function in Python (don't forget to import sleep):

from time import sleep
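Combining the two suggestions, catching the exception and sleeping before retrying, can be sketched as a small helper (the function name and defaults are my own, not from the original answer):

```python
from time import sleep

def get_with_retries(fetch, attempts=3, delay=1.0, exc=ConnectionError):
    """Call fetch() until it succeeds or the attempts run out.

    fetch is any zero-argument callable, e.g. lambda: requests.get(ap).
    In real use pass exc=requests.exceptions.ConnectionError, which is
    a different class from the built-in ConnectionError used here.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except exc:
            if attempt == attempts - 1:
                raise  # out of attempts: re-raise the last failure
            sleep(delay)

# Fake fetcher that fails twice, then succeeds (no network needed):
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("refused")
    return "ok"

print(get_with_retries(flaky, attempts=5, delay=0))  # ok
```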

All in all, requests is an awesome Python lib; hope that solves your problem.

djra
  • The sleep loop fixed my problem - a bit of a hack, but by looping a couple of times while handling the error response, I was able to brute force a solution. – elPastor Mar 29 '17 at 00:20
  • This answer is actually wrong. This is a resolver lookup issue, as indicated by the `(Caused by : [Errno -2] Name or service not known)` part. "gai" stands for `getaddrinfo`, and the probable related error is: **EAI_NONAME** The node or service is not known; or both node and service are NULL; or AI_NUMERICSERV was specified in hints.ai_flags and service was not a numeric port-number string. It probably looked like the sleep fixed it, but you probably just slept through a transient DNS resolver issue. – lingfish Jun 06 '17 at 23:13
  • This answer does not seem to make sense as in 'r' is the object that comes from requests.get() so with the exception this just leads to another error. – mikkokotila May 18 '18 at 08:19
  • This answer doesn't make sense. OP's error doesn't say "Connection refused", it says "Name or service not known". This answer seems to assume that all ConnectionError are due to "Connection refused". – erjiang May 08 '19 at 21:06
  • For me this has to be exactly right, a rate limit placed by the server. I can make 80 calls and then this message will appear for me. Then after a short time, the server is available for another 80 calls and the cycle repeats. it is too regular to be anything else. – demongolem Apr 15 '20 at 23:33
49

Just do this:

Paste the following code in place of page = requests.get(url):

import time

page = ''
while page == '':
    try:
        page = requests.get(url)
    except requests.exceptions.ConnectionError:
        print("Connection refused by the server..")
        print("Let me sleep for 5 seconds")
        print("ZZzzzz...")
        time.sleep(5)
        print("Was a nice sleep, now let me continue...")

You're welcome :)

Dobes Vandermeer
jatin
48

I had a similar problem, but the following code worked for me.

url = <some REST url>    
page = requests.get(url, verify=False)

verify=False disables SSL certificate verification (which is insecure for anything beyond testing). try/except can be added as usual.
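For intuition, what verify=False amounts to can be sketched with the stdlib ssl module; this makes no network call and is an analogy, not requests' internal code:

```python
import ssl

# Default context: certificates and hostnames are verified.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True

# What verify=False amounts to: no hostname check, no cert validation.
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
print(ctx.verify_mode == ssl.CERT_NONE)  # True
```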

Raj Stha
42

pip install pyopenssl seemed to solve it for me.

https://github.com/requests/requests/issues/4246

Akshar
11

Specifying the proxy in a corporate environment solved it for me.

page = requests.get("http://www.google.com:80", proxies={"http": "http://111.233.225.166:1234"})

The full error is:

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='www.google.com', port=80): Max retries exceeded with url: / (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))

Jeremy Thompson
  • Hello - where can i find my proxy? sorry i am new to all this proxy stuff. thanks – Zack Jan 06 '22 at 16:53
  • @Zach if you're in a Corporate Network then the IT support will know. Or you could look it up yourself: https://superuser.com/a/346376 – Jeremy Thompson Jan 06 '22 at 23:07
8

It is always good to implement exception handling. It not only helps to avoid an unexpected exit of the script but can also help to log errors and info notifications. When using Python requests I prefer to catch exceptions like this:

    # assumed to run inside a retry loop, hence the `continue` statements
    try:
        res = requests.get(adress,timeout=30)
    except requests.ConnectionError as e:
        print("OOPS!! Connection Error. Make sure you are connected to Internet. Technical Details given below.\n")
        print(str(e))            
        renewIPadress()
        continue
    except requests.Timeout as e:
        print("OOPS!! Timeout Error")
        print(str(e))
        renewIPadress()
        continue
    except requests.RequestException as e:
        print("OOPS!! General Error")
        print(str(e))
        renewIPadress()
        continue
    except KeyboardInterrupt:
        print("Someone closed the program")

Here renewIPadress() is a user-defined function which can change the IP address if it gets blocked. You can do without this function.
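The ordering of the except clauses above matters because requests' exception classes form a hierarchy; a quick check of that hierarchy:

```python
import requests

# ConnectionError and Timeout both derive from RequestException, so the
# general RequestException clause must come after the specific ones:
print(issubclass(requests.exceptions.ConnectionError,
                 requests.exceptions.RequestException))  # True
print(issubclass(requests.exceptions.Timeout,
                 requests.exceptions.RequestException))  # True
```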

Tanmoy Datta
  • your solution is nice but how to change `ip-adrress` in python, do you know something about it, then let me know – Haritsinh Gohil Aug 16 '19 at 12:00
  • I had used some VPN service IPVanish and Hide My Ass. They are configured using open-vpn and open-vpn have shell command row renewing the IP address. You can call shell or bash command from python. In this way, you can implement it. – Tanmoy Datta Sep 03 '19 at 14:16
4

Adding my own experience for those who are experiencing this in the future. My specific error was

Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'

It turns out that this was actually because I had reached the maximum number of open files on my system. It had nothing to do with failed connections, or even a DNS error as indicated.

Oded
  • Can you elaborate? What do you mean by open files? – akkhil Nov 10 '20 at 20:06
  • By open files I mean open file handles. How to fix it is OS-specific, so just search for "increase max open files" and the OS your system is running. – Oded Nov 14 '20 at 20:19
  • Exactly my issue when trying to submit hundreds of simultaneous synchronous jobs to an AWS Lambda Function using `requests.post`. For Linux and Max, I found this page useful in raising the OS limit on files: https://wilsonmar.github.io/maximum-limits/ – Iron Pillow Jul 29 '21 at 15:38
2

When I was writing a Selenium browser test script, I encountered this error when calling driver.quit() before a JS API call. Remember that quitting the webdriver is the last thing to do!

Saleh
1

I wasn't able to make it work on Windows even after installing pyopenssl and trying various Python versions (while it worked fine on Mac), so I switched to urllib and it works on Python 3.6 (from python.org) and 3.7 (Anaconda):

from urllib.request import urlopen

html = urlopen("http://pythonscraping.com/pages/page1.html")
contents = html.read()
print(contents)
alex
1

Just import time and add:

time.sleep(6)

somewhere in the for loop, to avoid sending too many requests to the server in a short time. The number 6 means 6 seconds. Keep testing numbers starting from 1 until you reach the minimum number of seconds that avoids the problem.
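Rather than hand-tuning a single constant, the delay can also be stepped up automatically after each failure; a small sketch (the function name and defaults are my own):

```python
def backoff_delays(base=1, factor=2, attempts=5):
    """Yield an increasing sequence of delays: base, base*factor, ..."""
    delay = base
    for _ in range(attempts):
        yield delay
        delay *= factor

print(list(backoff_delays()))  # [1, 2, 4, 8, 16]

# In the for loop, pair each delay with time.sleep(delay) after a
# failed request instead of a fixed time.sleep(6).
```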

hamza
1

It could also be a network configuration issue, in which case you need to reconfigure your network settings.

For Ubuntu: sudo vim /etc/network/interfaces

Add 8.8.8.8 as a dns-nameservers entry and save it.

Reset your network: /etc/init.d/networking restart

Now try again.

1

In my case, I was deploying some Docker containers inside a Python script and then calling one of the deployed services. The error was fixed when I added some delay before calling the service; I think it needs time to get ready to accept connections.

from time import sleep
#deploy containers
#get URL of the container
sleep(5)
response = requests.get(url,verify=False)
print(response.json())
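A fixed sleep works, but polling until the service actually answers is more robust; a sketch under the assumption that you have some cheap probe callable (e.g. a health-check request):

```python
import time

def wait_until_ready(probe, timeout=30, interval=0.5):
    """Poll probe() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval)
    return False

# Example: a probe that only succeeds on its third call.
state = {"calls": 0}
def probe():
    state["calls"] += 1
    return state["calls"] >= 3

print(wait_until_ready(probe, timeout=5, interval=0))  # True
```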
King Alawaka
0

Adding my own experience :

r = requests.get(download_url)

when I tried to download a file specified in the url.

The error was

HTTPSConnectionPool(host, port=443): Max retries exceeded with url (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])")))

I corrected it by adding verify=False to the call, as follows:

r = requests.get(download_url + filename, verify=False)
open(filename, 'wb').write(r.content)
Suraj
0

Check your network connection. I had this and the VM did not have a proper network connection.

Timothy C. Quinn
0

I had the same error when I ran the route in the browser, but in Postman it worked fine. The issue in my case was an extra / after the route, before the query string.

127.0.0.1:5000/api/v1/search/?location=Madina raised the error, and removing the / after search worked for me.
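The difference is visible with the stdlib URL parser: the two forms have different paths, so a strict router may match only one of them (the host and route below are the ones from this answer):

```python
from urllib.parse import urlsplit

bad = urlsplit("http://127.0.0.1:5000/api/v1/search/?location=Madina")
good = urlsplit("http://127.0.0.1:5000/api/v1/search?location=Madina")

print(bad.path)   # /api/v1/search/
print(good.path)  # /api/v1/search
print(bad.query == good.query)  # True
```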

0

This happens when you send too many requests to the public IP address of https://itunes.apple.com. As you can see, the cause is something that blocks access to the public IP address mapping of https://itunes.apple.com. One workaround is the following Python script, which resolves the public IP address of any domain and writes that mapping to the /etc/hosts file.

import re
import socket
import subprocess
from typing import Tuple

ENDPOINT = 'https://itunes.apple.com/'  # or any domain, e.g. 'https://anydomainname.example.com/'

def get_public_ip() -> Tuple[str, str, str]:
    """
    Command to get public_ip address of host machine and endpoint domain
    Returns
    -------
    my_public_ip : str
        Ip address string of host machine.
    end_point_ip_address : str
        Ip address of endpoint domain host.
    end_point_domain : str
        domain name of endpoint.

    """
    # bash_command = """host myip.opendns.com resolver1.opendns.com | \
    #     grep "myip.opendns.com has" | awk '{print $4}'"""
    # bash_command = """curl ifconfig.co"""
    # bash_command = """curl ifconfig.me"""
    bash_command = """ curl icanhazip.com"""
    my_public_ip = subprocess.getoutput(bash_command)
    my_public_ip = re.compile("[0-9.]{4,}").findall(my_public_ip)[0]
    end_point_domain = (
        ENDPOINT.replace("https://", "")
        .replace("http://", "")
        .replace("/", "")
    )
    end_point_ip_address = socket.gethostbyname(end_point_domain)
    return my_public_ip, end_point_ip_address, end_point_domain


def set_etc_host(ip_address: str, domain: str) -> str:
    """
    A function to write mapping of ip_address and domain name in /etc/hosts.
    Ref: https://stackoverflow.com/questions/38302867/how-to-update-etc-hosts-file-in-docker-image-during-docker-build

    Parameters
    ----------
    ip_address : str
        IP address of the domain.
    domain : str
        domain name of endpoint.

    Returns
    -------
    str
        Message to identify success or failure of the operation.

    """
    bash_command = """echo "{}    {}" >> /etc/hosts""".format(ip_address, domain)
    output = subprocess.getoutput(bash_command)
    return output


if __name__ == "__main__":
    my_public_ip, end_point_ip_address, end_point_domain = get_public_ip()
    output = set_etc_host(ip_address=end_point_ip_address, domain=end_point_domain)
    print("My public IP address:", my_public_ip)
    print("ENDPOINT public IP address:", end_point_ip_address)
    print("ENDPOINT Domain Name:", end_point_domain )
    print("Command output:", output)

You can call the above script before running your desired function :)

Vaibhav Hiwase
0

My situation was rather special. I tried the answers above and none of them worked. I suddenly wondered whether it had something to do with my Internet proxy. You know, I'm in mainland China, and I can't access sites like Google without an Internet proxy. Then I turned off my Internet proxy and the problem was solved.

-3

Add headers for this request.

headers = {
    'Referer': 'https://itunes.apple.com',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36'
}

requests.get(ap, headers=headers)
Jeremy Thompson
Michael Yang