We're currently looking for the most suitable way to access critical data in a distributed system, and we're weighing in-memory caching on each instance against a centralized cache.
Some information about the data we wish to store/access:
- Very small data size
- Writes are very rare; the data only changes when a human edits something in our back office system
- Must be up to date after a change (a delay of a few hundred milliseconds is acceptable)
- Sits on a very critical path of our application and requires a very high SLA, both in reliability and in response time (no more than 20 ms per access)
- Read very frequently (up to thousands of times per second)
The way we see it is as follows:
In-memory cache
Pros:
- Faster: no network round trip or serialization overhead
- Higher resilience in terms of distribution (if one instance dies, the data still exists on the other instances)
Cons:
- Much more complex to code and maintain
- Requires notifying instances when a change occurs; each instance must be updated separately, and each server must load the data on startup
- Adds a high risk of data inconsistency (one instance holding different or outdated data than the others)
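To make the extra complexity concrete, here is a minimal sketch of the per-instance flow, in plain Python. The `Bus` class is a toy stand-in for whatever pub/sub mechanism we would actually use (e.g. Redis pub/sub); all names here are illustrative, not a real API:

```python
# Minimal sketch of per-instance in-memory caching with change notification.
# "Bus" simulates a pub/sub channel between instances; in reality this would
# be Redis pub/sub, a message queue, etc.

class Bus:
    """Toy in-process message bus standing in for real pub/sub."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, key, value):
        # Every subscribed instance must be notified separately.
        for cb in self.subscribers:
            cb(key, value)


class Instance:
    """One app server holding its own in-memory copy of the data."""
    def __init__(self, bus, source_of_truth):
        # Each instance must load the full data set on startup...
        self.cache = dict(source_of_truth)
        # ...and listen for changes made in the back office.
        bus.subscribe(self._on_change)

    def _on_change(self, key, value):
        self.cache[key] = value

    def get(self, key):
        return self.cache[key]  # pure in-memory read, no network hop


back_office = {"feature_x": "on"}  # source of truth
bus = Bus()
instances = [Instance(bus, back_office) for _ in range(3)]

# A human edits the data in the back office; the change must be
# fanned out to every instance. Any missed message means stale data.
back_office["feature_x"] = "off"
bus.publish("feature_x", "off")
print([i.get("feature_x") for i in instances])  # → ['off', 'off', 'off']
```

The inconsistency risk shows up exactly where the `publish` fan-out can partially fail: an instance that misses the message keeps serving the old value with no way to tell it's stale.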
Centralized cache
For the sake of conversation, we've considered using Redis.
Pros:
- Much simpler to maintain
- Very reliable, we have a lot of experience working with Redis in a distributed system
- Only one place to update
- Assures data consistency
Cons:
- Single point of failure (a big concern for us), though if we go with this solution we will deploy a Redis cluster to mitigate it
- What happens if the cache is flushed for some reason?
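For the flush concern, the mitigation we'd likely reach for is a cache-aside read path: on a miss, fall back to the source of truth (the back office database) and repopulate the cache. A minimal sketch, with plain dicts standing in for Redis and the database (names are illustrative):

```python
# Cache-aside read path: a cache flush degrades to slower reads rather
# than failures. Dicts stand in for Redis and the back office database;
# this is a sketch, not real client code.

database = {"feature_x": "on"}   # source of truth (back office)
cache = {"feature_x": "on"}      # centralized cache (e.g. Redis)

def get(key):
    if key in cache:             # fast path: cache hit
        return cache[key]
    value = database[key]        # slow path: read through to the DB
    cache[key] = value           # repopulate so the next read is fast
    return value

cache.clear()                    # simulate an unexpected cache flush
print(get("feature_x"))          # → on  (served from the DB fallback)
print("feature_x" in cache)      # → True (cache repopulated)
```

The trade-off is that the first read after a flush pays the database latency, so the 20 ms SLA would briefly depend on how fast the fallback read is.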