I have a web tier in AWS running Nginx + PHP-FPM, using memcached on ElastiCache for sessions. Over the last 6 months or so we've been experiencing a very strange issue: every so often, perhaps every 6 weeks or so, the ElastiCache node runs out of memory and starts evicting keys. This leads to some users losing their session, being logged out and, of course, getting frustrated and losing their place in the app.
I've tried several things. The first was using the php-memcached module as the session handler, configured in the PHP ini settings:
session.save_handler = memcached
session.save_path = "<aws elasticache dns:port>"
And yes I verified that the save_path url I'm actually using is correct and receiving network connections. I've also verified through CloudWatch metrics that the cache node is indeed receiving network connections and data.
This configuration did not work, so I replaced it with a Zend Framework session manager and save handler. I verified through phpinfo() that session.save_handler was set to user, and also verified that the browser is getting the right cookie that I configured in the Zend session.
Still, we're having the same problem as illustrated in the following CloudWatch screenshot:
The vertical spikes in memory are, I believe, due to memcached clearing expired keys, which seems to happen every 24 hours. The very last (far right) spike is where I rebooted the node. The strange thing is that every time it clears keys, it doesn't clear enough. We end up with an overall downward trend in available memory, which at some point causes memory to run out and memcached to start evicting keys.
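For reference, here is a minimal Python sketch of how the trend could be checked on the node itself rather than through CloudWatch alone. It parses the text returned by memcached's `stats` command (for example, captured over telnet or nc) and reports how full the cache is, how many items it holds and how many evictions have occurred. The function names and the sample values are hypothetical and not taken from my node; only the stat names (bytes, limit_maxbytes, curr_items, evictions) are standard memcached fields.

```python
def parse_stats(raw: str) -> dict:
    """Turn the raw text reply of memcached's `stats` command into a dict."""
    stats = {}
    for line in raw.splitlines():
        parts = line.strip().split(" ", 2)
        # Data lines look like "STAT <name> <value>"; the reply ends with "END".
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def summarize(stats: dict) -> dict:
    """Pick out the figures relevant to a slow memory leak in the cache."""
    used = int(stats["bytes"])             # bytes currently used to store items
    limit = int(stats["limit_maxbytes"])   # configured memory limit
    return {
        "used_fraction": used / limit,
        "curr_items": int(stats["curr_items"]),
        "evictions": int(stats["evictions"]),
    }


# Hypothetical sample reply, for illustration only.
SAMPLE = (
    "STAT bytes 750\r\n"
    "STAT limit_maxbytes 1000\r\n"
    "STAT curr_items 12\r\n"
    "STAT evictions 0\r\n"
    "END\r\n"
)

if __name__ == "__main__":
    print(summarize(parse_stats(SAMPLE)))
```

Comparing these figures right before and right after one of the daily spikes would show whether the item count actually drops back to a stable baseline or keeps creeping up between clears.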
I'm at a loss as to what could be the problem and what to try next in an effort to debug. Any thoughts? Thanks!