25

I need an Object Pool, and rather than implement it myself, I thought I would look around for a ready-made and tested Python library.

What I found was plenty of other people looking, but not getting many straight answers, so I have brought it over here to Stack Overflow.

In my case, I have a large number of threads (using the threading module), which need to occasionally call a remote SOAP-based server. They could each establish their own connection to the server, but setting up a socket and completing the authentication process is expensive (it is throttled by the server), so I want to share a pool of connections, creating more only as needed.

If the items to pool were worker subprocesses, I might have chosen multiprocessing.pool, but they are not. If they were worker threads, I might have chosen this implementation, but they are not.

If they were MySQL connections, I might have chosen pysqlpool, but they are not. Similarly the SQLAlchemy Pool is out.

If there was one thread, using a variable number of connections/objects, I would consider this implementation, but I need it to be thread-safe.

I know I could implement this again fairly quickly, but given there are many people looking for it, I thought a canonical answer on Stack Overflow would be nice.

gerry3
  • 21,420
  • 9
  • 66
  • 74
Oddthinking
  • 24,359
  • 19
  • 83
  • 121

4 Answers4

29

It seems to me, from your description, that what you need is a pool of connections, not of objects. For simple thread-safety, just keep the reusable connections in a Queue.Queue instance, call it pool. When a thread instantiates a connection-wrapping object, the object gets its connection via pool.get() (which automaticaly enqueues it to wait if there are no connections currently availabe and dequeues it when a connection's ready for it); when the object's done using its connection, it puts it back in the pool via pool.put.

There's so little universally-required, general-purpose functionality in this, beyond what Queue.Queue already gives you, that it's not surprising no module providing it is well known or popular -- hard to make a module widespread when it has about 6 lines of functional code in all (e.g. to call a user-supplied connection factory to populate the queue either in advance or just-in-time up to some maximum number -- not a big added value generally, anyway). "Thick glue", thickly wrapping the underlying functionality from a standard library module without substantial added value, is an architectural minus, after all;-).

Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
  • Ah, right, waiting, if there are nothing more in the pool, that's what the list lacks. I thought I was clever with list instead of Queue, but too clever actually. :) – Lennart Regebro Oct 03 '09 at 19:22
  • @Lennart, also no _guarantee_ of thread safety, you may or may not run into problems depending on implementations -- with Queue.Queue, your thread safety is guaranteed. – Alex Martelli Oct 04 '09 at 01:14
  • Python has a thread-safe Queue already built-in? I didn't know that! Yes, that'll speed up the implementation (which I thought would be short, but mainly spent thinking through concurrency issues). Sorry, I didn't understand your distinction about a "pool of connections" versus "pool of objects". I said I wanted "to share a pool of connections", but each of those connections is wrapped up in an object, so it is indeed a pool of objects too. The distinction I was trying to make, though, is that the connection objects were NOT active (unlike multiprocessing.pool.) – Oddthinking Oct 05 '09 at 01:42
  • 2
    @Oddthinking, yep, the `Queue` module in Python's standard library is exactly that -- a threadsafe queue (the base one is LIFO, there are priority and FIFO variants too). As for "what to pool", my point is: pool connections which are as lightly-wrapped or unwrapped as you can, because making the connection is the costly part; wrapping a currently unused connection in a brand-new wrapping object that adds all the trimming you want for one transaction's duration should be cheap and fast in comparison, so, no need to pool the wrappers! – Alex Martelli Oct 05 '09 at 02:35
6

I had a similar problem and I must say Queue.Queue is quite good, however there is a missing piece of the puzzle. The following class helps deal with ensuring the object taken gets returned to the pool. Example is included.

I've allowed 2 ways to use this class, with keyword or encapsulating object with destructor. The with keyword is preferred but if you can't / don't want to use it for some reason (most common is the need for multiple objects from multiple queues) at least you have an option. Standard disclaimers about destructor not being called apply if you choose to use that method.

Hopes this helps someone with the same problem as the OP and myself.

class qObj():
  _q = None
  o = None

  def __init__(self, dQ, autoGet = False):
      self._q = dQ

      if autoGet == True:
          self.o = self._q.get()

  def __enter__(self):
      if self.o == None:
          self.o = self._q.get()
          return self.o
      else:
          return self.o 

  def __exit__(self, type, value, traceback):
      if self.o != None:
          self._q.put(self.o)
          self.o = None

  def __del__(self):
      if self.o != None:
          self._q.put(self.o)
          self.o = None


if __name__ == "__main__":
  import Queue

  def testObj(Q):
      someObj = qObj(Q, True)

      print 'Inside func: {0}'.format(someObj.o)

  aQ = Queue.Queue()

  aQ.put("yam")

  with qObj(aQ) as obj:
      print "Inside with: {0}".format(obj)

  print 'Outside with: {0}'.format(aQ.get())

  aQ.put("sam")

  testObj(aQ)

  print 'Outside func: {0}'.format(aQ.get())

  '''
  Expected Output:
  Inside with: yam
  Outside with: yam
  Inside func: sam
  Outside func: sam
  '''
David
  • 133
  • 3
  • 8
0

For simple use cases here is an example implementation of the object pool pattern based on a list:

Source:

https://sourcemaking.com/design_patterns/object_pool

https://sourcemaking.com/design_patterns/object_pool/python/1

"""
Offer a significant performance boost; it is most effective in
situations where the cost of initializing a class instance is high, the
rate of instantiation of a class is high, and the number of
instantiations in use at any one time is low.
"""


class ReusablePool:
    """
    Manage Reusable objects for use by Client objects.
    """

    def __init__(self, size):
        self._reusables = [Reusable() for _ in range(size)]

    def acquire(self):
        return self._reusables.pop()

    def release(self, reusable):
        self._reusables.append(reusable)


class Reusable:
    """
    Collaborate with other objects for a limited amount of time, then
    they are no longer needed for that collaboration.
    """

    pass


def main():
    reusable_pool = ReusablePool(10)
    reusable = reusable_pool.acquire()
    reusable_pool.release(reusable)


if __name__ == "__main__":
    main()
Stefan
  • 10,010
  • 7
  • 61
  • 117
  • Considering a Python Queue rather than a list in ReusablePool, as it is advertised as threadsafe. I am not sure Python lists are on all implementations. – Oddthinking Jan 21 '21 at 23:37
0

You can try one of my open source python object pools.

Pond is a high performance object-pooling library for Python, it has a smaller memory usage and a higher borrow hit rate.

https://github.com/T-baby/pondpond

Andy
  • 1
  • This does not provide an answer to the question. Once you have sufficient [reputation](https://stackoverflow.com/help/whats-reputation) you will be able to [comment on any post](https://stackoverflow.com/help/privileges/comment); instead, [provide answers that don't require clarification from the asker](https://meta.stackexchange.com/questions/214173/why-do-i-need-50-reputation-to-comment-what-can-i-do-instead). - [From Review](/review/late-answers/32867024) – t.o. Oct 10 '22 at 02:55