We're using:

  • Standard Redis on Azure
  • StackExchange.Redis
  • RedLock.net

Our website has grown significantly over the last year or two, now serving ~250,000,000 uncached requests per month according to Cloudflare.

Sporadically, we see bursts of a couple of hundred exceptions from RedLock failing to acquire a lock because it is in the Conflicted state.

Our Redis cache typically:

  • Runs at 10% server load (I believe this refers to CPU)
  • Runs close to 100% memory usage

My questions are:

  • Is it recommended practice to have an entirely separate Redis server for locking?
  • Could using 100% memory in the Redis server cause issues when creating the locks?
Tom Gullen

1 Answer

When you look at your cache performance metrics, do the failures coincide with 100% memory usage? If so, I'll bet that's the culprit.
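One rough way to check this is to line up the exception timestamps from your application logs against the memory-usage samples exported from your cache metrics. The sketch below is only an illustration: the function name, the 95% threshold, and the one-minute window are all assumptions, and it expects you to have already loaded both series into Python yourself.

```python
from datetime import datetime, timedelta


def failures_near_memory_peaks(failure_times, memory_samples,
                               threshold_pct=95.0,
                               window=timedelta(minutes=1)):
    """Return the fraction of failures that fall within `window` of a
    memory sample at or above `threshold_pct`.

    failure_times  -- list of datetime objects (lock exceptions)
    memory_samples -- list of (datetime, percent_used) tuples
    Both the threshold and the window are hypothetical defaults.
    """
    if not failure_times:
        return 0.0

    # Timestamps at which memory usage was at or above the threshold.
    peak_times = [t for t, pct in memory_samples if pct >= threshold_pct]

    # Count failures that have at least one peak sample nearby.
    hits = sum(
        1 for f in failure_times
        if any(abs(f - p) <= window for p in peak_times)
    )
    return hits / len(failure_times)


if __name__ == "__main__":
    # Made-up sample data, purely to show the call shape.
    base = datetime(2020, 1, 1, 10, 0, 0)
    memory = [(base, 80.0), (base + timedelta(minutes=5), 99.0)]
    failures = [base + timedelta(minutes=5, seconds=20)]
    print(failures_near_memory_peaks(failures, memory))
```

A fraction close to 1 would suggest the bursts track memory pressure. A low fraction points elsewhere, for example at genuine contention on the locked resource.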

When Redis hits 100% memory, page faulting can occur, which slows down requests. See here for a description of the process. I could envision a 5 ms RedLock.net time limit for acquiring a lock expiring when memory usage hits 100% and requests are delayed.

I'd either spin up a second Redis server just for locking or scale up your existing cache, then see if you still experience the issue. Scaling up would likely be the easiest experiment, since it requires no changes to your code.

Rob Reagan
  • Thanks for your answer. This is such a long-term issue that I think separating locking into another cache would be the most beneficial step for us, as it would shed more light on what's going wrong. Do you know if typical implementations use two Redis servers or one? – Tom Gullen Mar 16 '20 at 10:53