
The python-memcached memcache client is written so that each thread gets its own connection. This keeps the python-memcached code simple, which is nice, but presents a problem if your application has hundreds or thousands of threads (or if you run lots of applications), because you will quickly run out of available connections in memcache.

Typically this kind of problem is solved with a connection pool, and indeed the Java memcache libraries I have seen implement connection pooling. After reading the documentation for various Python memcache libraries, it seems the only one offering a connection pool is pylibmc, but it has two problems for me: it is not pure Python, and it does not seem to have a timeout for reserving a client from the pool. While not being pure Python is perhaps not a deal breaker, not having a timeout certainly is. It is also not clear how those pools would work with, for example, dogpile.cache.

Preferably I would like to find a pure Python memcache client with connection pooling that would work with dogpile.cache, but I am open to other suggestions as well. I'd rather avoid changing the application logic, though (like pushing all memcache operations into fewer background threads).
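For concreteness, here is a sketch of the kind of pool I am after, written with Python 3 naming (`queue` rather than `Queue`); the class and method names are purely illustrative, and the "clients" could be any objects:

```python
import queue  # "Queue" on Python 2


class TimedPool(object):
    """Illustrative pool: pre-created clients sit in a queue, and a
    checkout either returns one or fails within a bounded time."""

    def __init__(self, clients, timeout):
        self.timeout = timeout
        self._q = queue.Queue()
        for client in clients:
            self._q.put(client)

    def checkout(self):
        # Blocks at most `timeout` seconds; raises queue.Empty after that
        return self._q.get(timeout=self.timeout)

    def checkin(self, client):
        self._q.put(client)
```

The timeout on checkout is the key requirement: a caller that cannot get a client in time should find out quickly rather than block forever.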

Heikki Toivonen

2 Answers


A coworker came up with an idea that works well enough for our use case, so I am sharing it here. The basic idea: create the number of memcache clients you want up front, put them in a queue, and whenever you need a memcache client, pull one from the queue. Because Queue.Queue's get() method has an optional timeout parameter, you can also handle the case where you can't get a client in time. You also need to deal with the memcache client's use of threading.local.

Here is how it could work in code (note that I haven't actually run this exact version, so there might be some issues, but it should give you an idea if the textual description did not make sense to you):

import Queue
import memcache

# See http://stackoverflow.com/questions/9539052/python-dynamically-changing-base-classes-at-runtime-how-to 
# Don't inherit client from threading.local so that we can reuse clients in
# different threads
memcache.Client = type('Client', (object,), dict(memcache.Client.__dict__))
# Client.__init__ references local, so need to replace that, too
class Local(object): pass
memcache.local = Local

class PoolClient(object):
    '''Pool of memcache clients that has the same API as memcache.Client'''
    def __init__(self, pool_size, pool_timeout, *args, **kwargs):
        self.pool_timeout = pool_timeout
        self.queue = Queue.Queue()
        for _i in range(pool_size):
            self.queue.put(memcache.Client(*args, **kwargs))

    def __getattr__(self, name):
        return lambda *args, **kw: self._call_client_method(name, *args, **kw)

    def _call_client_method(self, name, *args, **kwargs):
        try:
            client = self.queue.get(timeout=self.pool_timeout)
        except Queue.Empty:
            # No client became available within the timeout
            return None

        try:
            return getattr(client, name)(*args, **kwargs)
        finally:
            self.queue.put(client)
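To see the delegation pattern in isolation, here is a self-contained variant you can run without a memcached server. It uses Python 3 naming (`queue` rather than `Queue`) and a hypothetical `StubClient` standing in for `memcache.Client`; everything else follows the same `__getattr__` dispatch as above:

```python
import queue  # "Queue" on Python 2


class StubClient(object):
    """Hypothetical stand-in for memcache.Client, so the pooling
    pattern can be exercised without a running memcached server."""

    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[key] = value
        return True

    def get(self, key):
        return self._store.get(key)


class StubPoolClient(object):
    """Same delegation idea as PoolClient above, against StubClient."""

    def __init__(self, pool_size, pool_timeout):
        self.pool_timeout = pool_timeout
        self.queue = queue.Queue()
        for _i in range(pool_size):
            self.queue.put(StubClient())

    def __getattr__(self, name):
        # Any unknown attribute becomes a method call on a pooled client
        return lambda *args, **kw: self._call_client_method(name, *args, **kw)

    def _call_client_method(self, name, *args, **kwargs):
        try:
            client = self.queue.get(timeout=self.pool_timeout)
        except queue.Empty:
            return None  # no free client within the timeout
        try:
            return getattr(client, name)(*args, **kwargs)
        finally:
            self.queue.put(client)
```

One caveat with the stub: real memcached stores data on the server, so it does not matter which pooled client handles a call, but the stub keeps data per client, so use pool_size=1 when experimenting with it.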
Heikki Toivonen
  • I do not know why, but my performance in terms of time decreases when using the pool instead of a single client – Saurabh Dec 17 '15 at 11:50

Many thanks to @Heikki Toivonen for providing ideas for the problem! However, I was not sure how exactly to call the get() method in order to use a memcache client from the PoolClient. Calling the get() method directly with an arbitrary name gives an AttributeError or a MemcachedKeyNoneError.

By combining @Heikki Toivonen's solution with pylibmc's pooling approach, I came up with the following code and am posting it here for the convenience of future users (I have debugged this code and it should be ready to run):

import Queue, memcache
from contextlib import contextmanager

memcache.Client = type('Client', (object,), dict(memcache.Client.__dict__))
# Client.__init__ references local, so need to replace that, too
class Local(object): pass
memcache.local = Local

class PoolClient(object):
    '''Pool of memcache clients that has the same API as memcache.Client'''
    def __init__(self, pool_size, pool_timeout, *args, **kwargs):
        self.pool_timeout = pool_timeout
        self.queue = Queue.Queue()

        for _i in range(pool_size):
            self.queue.put(memcache.Client(*args, **kwargs))

        print "pool_size:", pool_size, ", Queue_size:", self.queue.qsize()

    @contextmanager
    def reserve(self):
        '''Reference: http://sendapatch.se/projects/pylibmc/pooling.html#pylibmc.ClientPool'''
        client = self.queue.get(timeout=self.pool_timeout)
        try:
            yield client
        finally:
            self.queue.put(client)
            print "Queue_size:", self.queue.qsize()


# Instantiate a PoolClient
mc_client_pool = PoolClient(5, 0, ['127.0.0.1:11211'])

# Use a memcache client from the pool in your app
with mc_client_pool.reserve() as mc_client:
    # do your work here, e.g.:
    mc_client.set('key', 'value')
    value = mc_client.get('key')
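The reserve() pattern can also be tried without a memcached server. In the sketch below, each "client" is just a dict, and the names are illustrative; the point is only to show that the context manager always returns the client to the pool, even if the body raises:

```python
import queue  # "Queue" on Python 2
from contextlib import contextmanager


class StubPool(object):
    """Illustrative pool showing the reserve() context-manager pattern;
    each "client" is a plain dict, so no memcached server is needed."""

    def __init__(self, pool_size, pool_timeout):
        self.pool_timeout = pool_timeout
        self.queue = queue.Queue()
        for _i in range(pool_size):
            self.queue.put({})

    @contextmanager
    def reserve(self):
        # Check a client out; raises queue.Empty if none frees up in time
        client = self.queue.get(timeout=self.pool_timeout)
        try:
            yield client
        finally:
            # Always returned to the pool, even if the body raised
            self.queue.put(client)
```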
Good Will