memcached slows down website

Question

I have a website that is being served by nginx and django.

My staging.py contains the CACHE and middleware settings, which I believe are correct. You can take a look at nginx.conf and the nginx conf file for the site. I have confirmed that memcached is running and receiving traffic with ngrep -d any port 11211.

I turned on caching for the whole site and measured the performance with ab -n 1000 -c 10 http://site.com

With caching turned off, I get:

Concurrency Level:      10
Time taken for tests:   10.276 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      11695000 bytes
HTML transferred:       11559000 bytes
Requests per second:    97.32 [#/sec] (mean)
Time per request:       102.759 [ms] (mean)
Time per request:       10.276 [ms] (mean, across all concurrent requests)
Transfer rate:          1111.43 [Kbytes/sec] received

With caching turned on, I get:

Concurrency Level:      10
Time taken for tests:   12.277 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      11695000 bytes
HTML transferred:       11559000 bytes
Requests per second:    81.45 [#/sec] (mean)
Time per request:       122.771 [ms] (mean)
Time per request:       12.277 [ms] (mean, across all concurrent requests)
Transfer rate:          930.26 [Kbytes/sec] received

My website is a blog that is pulling posts from a database - nothing exotic.

I'd be grateful if someone could let me know why the site is actually slowing down with memcached. You can see that the "Requests per second" actually drops when I use memcached!

However, running memcache-top showed me no hits when I ran ab (though the read and write counters went up during the test). I have memory available and memcached is not hogging memory.

EDIT
I ran memcached -vv and got some results. You can see that memcached prints "STORED" the first time, but then does not seem to serve the response from the cache (I am not sure about this). Now I am even more confused. Perhaps memcached and the Django interface are working, but the end result is that it is better not to run memcached?

I am not sure what exactly is the problem here. Did you try seeing the cache hit rate? I thought it might be a good thing to share mintcache with you. http://djangosnippets.org/snippets/155/ — Krishna Bharadwaj, Jan 05 '12 at 08:18
"I'd be grateful if someone could let me know why the site is actually slowing down with memcached. You can see that the "Requests per second" actually drops when I use memcached!" — Trewq, Jan 05 '12 at 12:54
You probably already have checked this, but what is your memory usage when benchmarking? And as @KrishnaBharadwaj argues, check your cache hit rate, for example with `memcache-top` http://code.google.com/p/memcache-top/ — Bouke, Jan 05 '12 at 14:57
@bouke I ran memcache-top and the cache hit rate is ~0.1% during the test (I updated my question with this info), which corroborates that memcached is not helping. I have free memory (checked with free -m). — Trewq, Jan 05 '12 at 21:13
"I turned on caching for the whole site" - how did you turn it on? — dbf, Jan 05 '12 at 21:15
@dbf - I turn it on using the middleware as described in https://docs.djangoproject.com/en/dev/topics/cache/ under "The per-site cache" section. — Trewq, Jan 05 '12 at 22:52
I think it could be useful to check with a debugger: is this middleware really called, what keys does it generate, and what replies are received from memcached? — dbf, Jan 06 '12 at 10:02
Post your cache and middleware settings. If those seem correct, open up the python shell and fiddle with the memcache api. See if it is working for you. — Brian Neal, Jan 06 '12 at 20:08
@BrianNeal I updated my question -- I ran memcached -vv and got some results. You can see that memcached prints "STORED" the first time, but then does not seem to serve the response from the cache (I am not sure about this). Now I am even more confused. Perhaps memcached and the Django interface are working, but the end result is that it is better not to run memcached? — Trewq, Jan 06 '12 at 21:28
I don't see a `CACHE_MIDDLEWARE_ANONYMOUS_ONLY = True` in your settings file. — Brian Neal, Jan 07 '12 at 00:24

score 1 · Accepted Answer · edited May 23 '17 at 10:27

Trewq, a whole lot of different things could be going wrong. You said your machine isn't paging, but GET requests don't come back from the cache even though memcached STORED the result.

My theories: timeouts that are too short, a bad driver, and possibly the wrong CPU architecture (x86 vs x86_64).

Timeouts

Usually in the -vv output (it might need -vvv) the SET line shows the command, the key, flags, an expiration time, and the byte count. A very small expiration time might be the problem: memcached stores the value and then expires it almost immediately.

<command name> <key> <flags> <exptime> <bytes> [noreply]\r\n - https://github.com/memcached/memcached/blob/master/doc/protocol.txt
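To make the expiration time easier to spot, here is a minimal sketch that splits a storage command line into the fields named in the protocol syntax above. The function and class names are my own, and the sample key is made up; it only handles the bare protocol line, so any prefix your -vv log adds would need to be stripped first.

```python
from typing import NamedTuple

# Storage commands that share the "<command> <key> <flags> <exptime> <bytes>" layout.
STORAGE_COMMANDS = {"set", "add", "replace", "append", "prepend"}


class StorageCommand(NamedTuple):
    command: str
    key: str
    flags: int
    exptime: int   # 0 means "never expires"
    nbytes: int
    noreply: bool


def parse_storage_line(line: str) -> StorageCommand:
    """Split one storage command line into its protocol fields."""
    parts = line.strip().split()
    if len(parts) not in (5, 6) or parts[0] not in STORAGE_COMMANDS:
        raise ValueError("not a storage command line: %r" % line)
    noreply = len(parts) == 6 and parts[5] == "noreply"
    return StorageCommand(
        parts[0], parts[1], int(parts[2]), int(parts[3]), int(parts[4]), noreply
    )


if __name__ == "__main__":
    # Hypothetical line; the key is invented for illustration.
    cmd = parse_storage_line("set some.page.key 0 600 11559\r\n")
    print(cmd.key, "expires after", cmd.exptime, "seconds")
```

If the exptime field turns out to be only a few seconds, entries will be gone before the next request can hit them.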

Driver

Also, there might be an issue with the memcached driver/API you're using, as memcached should never block that long. You can check the state of your memcached service by doing something like this http://code.google.com/p/memcached/wiki/NewConfiguringServer#Inspecting_Running_Configuration before and after a benchmark run.
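If you would rather script the before/after check, something like the following sketch sends the plain-text stats command over a socket and diffs a few counters. The helper names are mine, and the host and port are assumptions (localhost and 11211, the port from the question); adjust them for your setup.

```python
import socket


def parse_stats(raw: str) -> dict:
    """Turn 'STAT <name> <value>' lines into a dict of strings."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2].strip()
    return stats


def fetch_stats(host: str = "127.0.0.1", port: int = 11211, timeout: float = 2.0) -> dict:
    """Send 'stats' to a memcached server and read until the END marker."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        buf = b""
        while not buf.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            buf += chunk
    return parse_stats(buf.decode("ascii", "replace"))


def diff_counters(before: dict, after: dict,
                  names=("cmd_get", "cmd_set", "get_hits", "get_misses", "evictions")) -> dict:
    """Return how much each named counter grew between two snapshots."""
    return {n: int(after.get(n, 0)) - int(before.get(n, 0)) for n in names}
```

Call fetch_stats() once before starting ab and once after it finishes, then pass both snapshots to diff_counters(). A run where cmd_set grows about as fast as cmd_get while get_hits barely moves would point at the keys or the expiration times rather than at memcached itself.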

Key auditing

A while back I wrote the script in the question "Setting smaller buffer size for sys.stdin?" to audit the output of memcached -vv and see how balanced GETs were against SETs. It's been a while, but I believe it might be useful to you with some fixes.
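This is not that script, just a rough sketch of the same idea: read the verbose log line by line and count GETs and SETs per key. It assumes each logged command may be prefixed with a token such as "<30" (a connection marker) and drops it; check that against what your memcached version actually prints.

```python
import sys
from collections import Counter


def audit(lines):
    """Count reads and writes per key from memcached verbose output."""
    gets, sets = Counter(), Counter()
    for line in lines:
        tokens = line.split()
        # Assumption: incoming commands are logged as "<NN command ...".
        if tokens and tokens[0].startswith("<"):
            tokens = tokens[1:]
        if len(tokens) < 2:
            continue
        command = tokens[0].lower()
        if command in ("get", "gets"):
            gets.update(tokens[1:])      # a get may name several keys
        elif command in ("set", "add", "replace"):
            sets[tokens[1]] += 1
    return gets, sets


def written_but_never_read(gets, sets):
    """Keys that were stored but never requested again."""
    return sorted(key for key in sets if gets[key] == 0)


if __name__ == "__main__":
    # Feed it the log on stdin, e.g. the captured stderr of memcached -vv.
    gets, sets = audit(sys.stdin)
    for key, count in sets.most_common(20):
        print(key, "sets:", count, "gets:", gets[key])
```

If every request produces a SET under a key that is never read again, the keys are probably changing from request to request, which would explain a hit rate near zero.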

It's not mentioned in the wiki for stats, but there are stat values that help you figure out whether your cache is balanced - https://github.com/memcached/memcached/blob/master/doc/protocol.txt#L409

The ideal is 9 hits out of every 10 requests (1 miss); in reality 6 hits out of 10 is more likely, and anything below 60% is wasting memory.
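For reference, the ratio can be computed from the get_hits and get_misses counters in the stats output. A small sketch, with a function name of my own choosing, that also accepts an earlier snapshot so you can measure a single benchmark run:

```python
def hit_ratio(stats, baseline=None):
    """Return hits / (hits + misses), or None if there were no GETs.

    `stats` and `baseline` are dicts of stat name -> value as returned by
    the stats command; `baseline` is an optional earlier snapshot.
    """
    baseline = baseline or {}
    hits = int(stats.get("get_hits", 0)) - int(baseline.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0)) - int(baseline.get("get_misses", 0))
    total = hits + misses
    if total <= 0:
        return None
    return hits / total


if __name__ == "__main__":
    # Illustrative numbers only, not taken from a real server.
    ratio = hit_ratio({"get_hits": "9", "get_misses": "1"})
    print("hit ratio: %.0f%%" % (ratio * 100))
```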

I agree - in a perfect world, your hit ratio would be 90% or greater. But can you substantiate your claim that 60% hit ratio is the threshold of utility? Or is that a subjective cusp based on personal experience? — Chris Tonkinson, Jan 09 '12 at 16:45
@Chris - Personal experience with a client that had a few 12GB arrays and handled 14-15 million pageviews in a 16 hour "internet" day. When your hit/miss ratio goes below 60% for a HLA environ, this is a sign of inconsistent keying, which suggests you are screwing some user somewhere (degraded/slow service) - Besides my python script, a good friend of mine wrote this https://github.com/nerdynick/MemcachedManager though a word of warning: I don't know if it's stable. — David, Jan 09 '12 at 18:20
@Chris - Almost forgot. Another culprit for a sub-60% ratio is expiration times that are too short. My approach is usually to have a central configuration file with a list of expiration times broken down by category (global, module, all users, some users, single user), and to start with everything set to expire a year in the future. — David, Jan 09 '12 at 18:23
That's fair. Thanks for the explanation, I wasn't sure how grounded the 60% claim was, but your argument substantiates it de facto (i.e., if you're 60% or lower, you can surely make adjustments for a higher ratio) — Chris Tonkinson, Jan 10 '12 at 01:21
