We're currently looking for the most suitable way to access critical data in a distributed system, and we're weighing in-memory caching on each instance against a centralized cache.
Some information about the data we wish to store/access:
- Very small data size
- Writes are very rare; the data only changes when a human edits something in our back office system
- Must be up to date after a change (a delay of a few hundred milliseconds is acceptable)
- Sits on a very critical path of our application and requires a very high SLA, both in reliability and in response time (no more than 20 ms per access)
- Read very frequently (up to thousands of times per second)
The way we see it is as follows:
In-memory cache
Pros:
- Faster: no network round trip or serialization overhead
- Higher resilience in terms of distribution (if one instance dies, the data still exists on the other instances)
Cons:
- Much more complex to code and maintain
- Requires notifying instances when a change occurs; each instance must be updated separately, and each server must load the data on startup
- Adds a high risk of data inconsistency (one instance holding different or outdated data than the others)
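To make the extra complexity concrete, here is a minimal sketch of the per-instance flow, in plain Python. The `Bus` class is a toy stand-in for whatever pub/sub mechanism we would actually use (e.g. Redis pub/sub); all names here are illustrative, not a real API:

```python
# Minimal sketch of per-instance in-memory caching with change notification.
# "Bus" simulates a pub/sub channel between instances; in reality this would
# be Redis pub/sub, a message queue, etc.

class Bus:
    """Toy in-process message bus standing in for real pub/sub."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, key, value):
        # Every subscribed instance must be notified separately.
        for cb in self.subscribers:
            cb(key, value)


class Instance:
    """One app server holding its own in-memory copy of the data."""
    def __init__(self, bus, source_of_truth):
        # Each instance must load the full data set on startup...
        self.cache = dict(source_of_truth)
        # ...and listen for changes made in the back office.
        bus.subscribe(self._on_change)

    def _on_change(self, key, value):
        self.cache[key] = value

    def get(self, key):
        return self.cache[key]  # pure in-memory read, no network hop


back_office = {"feature_x": "on"}  # source of truth
bus = Bus()
instances = [Instance(bus, back_office) for _ in range(3)]

# A human edits the data in the back office; the change must be
# fanned out to every instance. Any missed message means stale data.
back_office["feature_x"] = "off"
bus.publish("feature_x", "off")
print([i.get("feature_x") for i in instances])  # → ['off', 'off', 'off']
```

The inconsistency risk shows up exactly where the `publish` fan-out can partially fail: an instance that misses the message keeps serving the old value with no way to tell it's stale.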
Centralized cache
For the sake of conversation, we've considered using Redis.
Pros:
- Much simpler to maintain
- Very reliable, we have a lot of experience working with Redis in a distributed system
- Only one place to update
- Assures data consistency
Cons:
- Single point of failure (a big concern for us), though if we go with this solution we will deploy a Redis cluster to mitigate it
- What happens if the cache is flushed for some reason?
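For the flush concern, the mitigation we'd likely reach for is a cache-aside read path: on a miss, fall back to the source of truth (the back office database) and repopulate the cache. A minimal sketch, with plain dicts standing in for Redis and the database (names are illustrative):

```python
# Cache-aside read path: a cache flush degrades to slower reads rather
# than failures. Dicts stand in for Redis and the back office database;
# this is a sketch, not real client code.

database = {"feature_x": "on"}   # source of truth (back office)
cache = {"feature_x": "on"}      # centralized cache (e.g. Redis)

def get(key):
    if key in cache:             # fast path: cache hit
        return cache[key]
    value = database[key]        # slow path: read through to the DB
    cache[key] = value           # repopulate so the next read is fast
    return value

cache.clear()                    # simulate an unexpected cache flush
print(get("feature_x"))          # → on  (served from the DB fallback)
print("feature_x" in cache)      # → True (cache repopulated)
```

The trade-off is that the first read after a flush pays the database latency, so the 20 ms SLA would briefly depend on how fast the fallback read is.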