18

I'm using the Python requests library with sessions:

def _get_session(self):
    if not self.session:
        self.session = requests.Session()
    return self.session

And sometimes I'm getting this warning in my logs:

[2014/May/12 14:40:04 WARNING ] HttpConnectionPool is full, discarding connection: www.ebi.ac.uk

My question is: why is this a warning and not an exception?

This is the code responsible for this (from http://pydoc.net/Python/requests/0.8.5/requests.packages.urllib3.connectionpool/):

def _put_conn(self, conn):
    try:
        self.pool.put(conn, block=False)
    except Full:
        # This should never happen if self.block == True
        log.warning("HttpConnectionPool is full, discarding connection: %s"
                    % self.host)

Why is this exception caught here? If it were re-raised, I could handle it in my code by creating a new session and deleting the old one.

If it's only a warning, does that mean it doesn't affect my results in any way? Can I ignore it? If not, how can I handle this situation?

mnowotka
  • 16,430
  • 18
  • 88
  • 134
  • 1
    did you try setting `self.block` to `True`? – roippi May 13 '14 at 13:44
  • 1
    Do I really want my requests to block? Maybe this warning will disappear but are there any other consequences? There is some reason this is not True by default, right? – mnowotka May 13 '14 at 13:54
  • `:param block: If set to True, no more than **maxsize** connections will be used at a time. When no free connections are available, the call will block until a connection has been released.` – roippi May 13 '14 at 13:57
  • 1
    So, when there are no free connections, you either have to have attempted connections block, or to discard them. Your choice. – roippi May 13 '14 at 13:59
  • 1
    But does it mean that if it's set to False, more than maxsize connections will be used, so I'm safe anyway? – mnowotka May 13 '14 at 13:59
  • No. If it's set to false, any attempted connections past maxsize are simply discarded on the spot (as shown in your log). – roippi May 13 '14 at 14:00
  • @roippi: **wrong**. If `block==False`, any attempted connections past maxsize **will be normally performed**. The only difference is that those extra connections will not be kept in the pool afterwards, hence "discarded". That's why this is just a warning, not an exception. All connections are made. – MestreLion Mar 17 '21 at 12:37

2 Answers

16

From the Requests docs at http://docs.python-requests.org/en/latest/api/:

 class requests.adapters.HTTPAdapter(pool_connections=10, pool_maxsize=10, max_retries=0, pool_block=False)

The built-in HTTP Adapter for urllib3.

Provides a general-case interface for Requests sessions to contact HTTP and HTTPS urls by implementing the Transport Adapter interface. This class will usually be created by the Session class under the covers.

Parameters:

  • pool_connections – The number of urllib3 connection pools to cache.
  • pool_maxsize – The maximum number of connections to save in the pool.
  • max_retries (int) – The maximum number of retries each connection should attempt. Note, this applies only to failed connections and timeouts, never to requests where the server returns a response.
  • pool_block – Whether the connection pool should block for connections.

and a little below comes an example:

import requests
s = requests.Session()
a = requests.adapters.HTTPAdapter(max_retries=3)
s.mount('http://', a)

Try this:

a = requests.adapters.HTTPAdapter(pool_connections=N, pool_maxsize=M)

where N and M are suitable for your program.
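A minimal sketch of how such an adapter can be mounted on the session from the question (the pool sizes and host are only illustrative, not recommendations); pool_block=True makes callers wait for a free connection instead of creating extras that are later discarded:

import requests

session = requests.Session()

# Illustrative sizes; pool_block=True means a request waits for a free
# connection rather than opening one that cannot be returned to the pool.
adapter = requests.adapters.HTTPAdapter(pool_connections=20,
                                        pool_maxsize=20,
                                        pool_block=True)
session.mount('http://', adapter)
session.mount('https://', adapter)

response = session.get('https://www.ebi.ac.uk')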

maxymoo
  • 35,286
  • 11
  • 92
  • 119
andmart
  • 554
  • 2
  • 10
  • In my case, the ideal value for M is infinity :) If I use any specific value of M, then I need to count how many requests have been made in this session and then discard it, right? – mnowotka May 13 '14 at 13:57
  • Are you firing connections in parallel or in sequence? How about explicitly calling close() on the Response objects? – andmart May 13 '14 at 14:15
  • Then you have to choose how to handle the situation when the pool limit is reached: either increase the limit or wait before making new requests. Another approach is to look for code that keeps unnecessary response objects around and close them explicitly. – andmart May 13 '14 at 14:20
  • I guess waiting makes more sense here. OK, thanks a lot! – mnowotka May 13 '14 at 14:31
  • I am wondering, is there any formula to calculate the M value? – jerryleooo Mar 04 '15 at 02:04
  • @jerryleooo: a good value to start is the number of worker threads you're using. – MestreLion Mar 17 '21 at 12:39
3

I'd like to clarify some stuff here.

What the pool_maxsize argument does is limit the number of TCP connections that can be stored in the connection pool at the same time. Normally, when you want to execute an HTTP request, requests will try to take a TCP connection from its connection pool. If no connection is available, requests will create a new TCP connection, and when it is done making the HTTP request, it will try to put the connection back in the pool (it does not remember whether the connection was originally taken from the pool or not).
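A small illustrative sketch of that behaviour (the host, pool size and worker count are arbitrary; the point is only that the workers outnumber the pool, so the warning appears while every request still succeeds):

import concurrent.futures
import requests

session = requests.Session()
# Keep at most 2 connections to each host in the pool; extra connections are
# still created on demand, they just cannot be returned to the full pool.
adapter = requests.adapters.HTTPAdapter(pool_maxsize=2)
session.mount('https://', adapter)

urls = ['https://www.ebi.ac.uk'] * 8

# 8 workers share a pool of 2, so some finished connections are discarded and
# the "pool is full, discarding connection" warning is logged.
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:
    statuses = list(executor.map(lambda url: session.get(url).status_code, urls))

print(statuses)  # all requests complete; only connection reuse is lost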

The HttpConnectionPool is full warning raised in the requests code is just an example of a common Python pattern, usually paraphrased as "it is easier to ask forgiveness than permission". It has nothing to do with dropping TCP connections.

MestreLion
  • 12,698
  • 8
  • 66
  • 57