68

Python's popular Requests library is said to be thread-safe on its home page, but no further details are given. If I call requests.session(), can I then safely pass this object to multiple threads like so:

import threading

import requests

session = requests.session()
for i in range(thread_count):
    threading.Thread(
        target=target,
        args=(session,),
        kwargs={}
    ).start()

and make requests using the same connection pool in multiple threads?

If so, is this the recommended approach, or should each thread be given its own connection pool? (Assuming the total size of all the individual connection pools summed to the size of what would be one big connection pool, like the one above.) What are the pros and cons of each approach?

DJG
  • Did you figure out which is better? I'm currently running into nearly the same question. I was thinking a new session for each thread so as to not bottleneck all requests in a single connection pool. – Marcel Wilson Nov 06 '13 at 19:36
  • @Marcel Wilson Not exactly. Although for one of my projects where I was using a session object to request the same URL over and over again, I sent the same session object to all of the threads. The application does seem to work, but I am still not sure what the better approach is. Note, though, that my problem was not with bottlenecking the connection pools, but was instead with opening too many connections and sending too many requests at a time. – DJG Nov 07 '13 at 12:25
  • Requests is built on top of urllib3. The thread-safety of requests is largely due to the thread-safety of urllib3, the documentation for which discusses thread safety in greater detail. – selllikesybok Nov 20 '13 at 14:06
  • @dg123 I ended up creating a session in the for loop. Each thread gets its own connection pool. – Marcel Wilson Dec 12 '13 at 21:52

3 Answers

33

After reviewing the source of requests.session, I'm going to say the session object might be thread-safe, depending on the implementation of CookieJar being used.

Session.prepare_request reads from self.cookies, and Session.send calls extract_cookies_to_jar(self.cookies, ...), and that calls jar.extract_cookies(...) (jar being self.cookies in this case).

The source for Python 2.7's cookielib acquires a lock (threading.RLock) while it updates the jar, so it appears to be thread-safe. On the other hand, the documentation for cookielib says nothing about thread-safety, so maybe this feature should not be depended on?
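As a rough illustration of the locking described above, here is a minimal sketch using Python 3's http.cookiejar (the successor of cookielib). The helper names make_cookie, fill_jar and run, and the example.com domain, are made up for this example. Several threads call set_cookie on one shared jar; a passing run is consistent with the lock doing its job but does not prove thread-safety in general.

```python
import threading
from http.cookiejar import Cookie, CookieJar


def make_cookie(name, value, domain="example.com"):
    """Build a minimal session cookie; the domain is a placeholder."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def fill_jar(jar, worker_id, count):
    # Each set_cookie() call updates the jar while holding its internal lock.
    for i in range(count):
        jar.set_cookie(make_cookie("w%d-c%d" % (worker_id, i), "x"))


def run(workers=8, per_worker=200):
    """Fill one shared jar from several threads and return it."""
    jar = CookieJar()
    threads = [
        threading.Thread(target=fill_jar, args=(jar, w, per_worker))
        for w in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return jar


if __name__ == "__main__":
    jar = run()
    # Every cookie has a unique name, so none should have been lost.
    print(len(jar))
```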

UPDATE

If your threads mutate any attribute of the session object (headers, proxies, stream, and so on), call its mount method, or use the session in a with statement, then sharing it is not thread-safe.

millerdev
33

https://github.com/psf/requests/issues/1871 implies that Session is not thread-safe, and that at least one maintainer recommends one Session per thread.

I just opened https://github.com/psf/requests/issues/2766 to clarify the documentation.
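One way to follow the one-Session-per-thread advice is to keep the session in thread-local storage. This is only a sketch: get_session and collect_sessions are names invented here, and nothing in it is taken from the linked issues.

```python
import threading

import requests

# Each thread sees its own attributes on this object.
_thread_local = threading.local()


def get_session():
    """Return the calling thread's Session, creating it on first use."""
    session = getattr(_thread_local, "session", None)
    if session is None:
        session = requests.Session()
        _thread_local.session = session
    return session


def collect_sessions(thread_count=4):
    """Start some threads and return the Session each one ended up with."""
    sessions = []
    lock = threading.Lock()

    def worker():
        # A real worker would call get_session().get(url) here.
        session = get_session()
        with lock:
            sessions.append(session)

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sessions
```

With this layout no Session object is ever touched by two threads, at the cost of one connection pool per thread.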

Jesse Aldridge
Greg Ward
  • 1
    It looks like this depends on `urllib3` being thread safe, which I don't believe it is based on https://github.com/urllib3/urllib3/issues/1252 – Charles L. Sep 04 '19 at 23:58
2

I faced the same question and went to the source code to find a solution that suited me. In my opinion, the Session class has several problems:

  1. It initializes a default HTTPAdapter in the constructor and leaks it if you mount another one to 'http' or 'https'.
  2. The HTTPAdapter implementation maintains the connection pool; I don't think that is something to create on every Session instantiation.
  3. Session closes its HTTPAdapter, so you can't reuse the connection pool between different Session instances.
  4. The Session class doesn't seem to be thread-safe, according to various discussions.
  5. HTTPAdapter internally uses urllib3.PoolManager. I didn't find any obvious thread-safety problem in its source code, so I would rather trust the documentation, which says that urllib3 is thread-safe.

In conclusion, I didn't find anything better than overriding the Session class:

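from collections import OrderedDict

from requests import Session
from requests.adapters import HTTPAdapter
from requests.cookies import cookiejar_from_dict
from requests.hooks import default_hooks
from requests.models import DEFAULT_REDIRECT_LIMIT
from requests.utils import default_headers


class HttpSession(Session):
    def __init__(self, adapter: HTTPAdapter):
        # Session.__init__ is deliberately not called, so no default
        # adapters are created; the usual attributes are set by hand.
        self.headers = default_headers()
        self.auth = None
        self.proxies = {}
        self.hooks = default_hooks()
        self.params = {}
        self.stream = False
        self.verify = True
        self.cert = None
        self.max_redirects = DEFAULT_REDIRECT_LIMIT
        self.trust_env = True
        self.cookies = cookiejar_from_dict({})
        self.adapters = OrderedDict()
        # Both schemes go through the shared adapter.
        self.mount('https://', adapter)
        self.mount('http://', adapter)

    def close(self) -> None:
        # The adapter is shared, so closing a session must not close it.
        pass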

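And creating a session factory like this:

from requests import Session
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# DEFAULT_CONNECTION_POOL_MAX_SIZE and DEFAULT_RETRY_POLICY are
# application-level constants defined elsewhere.


class HttpSessionFactory:
    def __init__(self,
                 pool_max_size: int = DEFAULT_CONNECTION_POOL_MAX_SIZE,
                 retry: Retry = DEFAULT_RETRY_POLICY):
        # One adapter, and so one connection pool, shared by every session.
        self.__http_adapter = HTTPAdapter(pool_maxsize=pool_max_size, max_retries=retry)

    def session(self) -> Session:
        return HttpSession(self.__http_adapter)

    def close(self):
        self.__http_adapter.close()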

Finally, somewhere in the code I can write:

with self.__session_factory.session() as session:
    response = session.get(request_url)

All my session instances then reuse the same connection pool, and when the application stops I can close the HttpSessionFactory. I hope this helps somebody.
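A lighter variant of the same idea, sketched here under assumptions, is to keep the stock Session class and mount one module-level HTTPAdapter on every session. SHARED_ADAPTER, new_session and the pool sizes are illustrative choices, not part of the code above. Whether sharing an adapter across threads is safe still depends on urllib3's own thread-safety.

```python
import requests
from requests.adapters import HTTPAdapter

# One adapter for the whole process; the pool sizes are arbitrary examples.
SHARED_ADAPTER = HTTPAdapter(pool_connections=10, pool_maxsize=10)


def new_session():
    """Create a Session whose HTTP and HTTPS traffic uses SHARED_ADAPTER."""
    session = requests.Session()
    # Mounting on these prefixes replaces the default adapters that
    # Session.__init__ created for them.
    session.mount("https://", SHARED_ADAPTER)
    session.mount("http://", SHARED_ADAPTER)
    return session
```

Each session keeps its own cookies and headers while connections come from the shared adapter. Session.close() closes every mounted adapter, so these sessions should not be used in a with statement or closed individually; call SHARED_ADAPTER.close() once at shutdown instead.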

vatuska
  • 2. I agree with you but I guess the requests library has to make some stricter assumptions about use cases beyond ours. 3. Only if you do close the Session, which is unnecessary precisely for reusable HTTP adapters. Do not use Sessions as context managers because of tradition! – N1ngu Jul 25 '22 at 12:26
  • Overall, your snippet does not address Session thread-unsafety or avoid thread-blocking, which would mainly come, AFAIU, from the cookie jar and the redirection cache. It is simply about reusing the connection pool cross-thread. This is nice but this is not about thread-safety. Also, you might be interested in this simpler approach to the same idea: global adapter instead of a session factory https://stackoverflow.com/a/73094972/11715259 – N1ngu Jul 25 '22 at 21:29
  • > 4. Session class doesn't seem to be thread safe according to various discussions. — which discussions? Could you link some? – greatvovan Aug 03 '22 at 18:40