I'm encountering a problem where data in my database is getting reverted to an old state. I think I have narrowed the problem down to this situation.
Imagine a sequence of events like this, in which the same user record is updated twice:
- All cache nodes are working
- A user logs on (their data is pulled from the DB and stored in memcached)
- A cache node goes down
- The user continues to browse (and since their data cannot be found in the cache it is pulled from the DB and stored in memcached)
- The user performs some action that changes their record, e.g. leveling up (their record is updated in the cache and the database)
- The cache node comes back up
- We pull the user's data from the cache again and it comes from the original cache node that was previously down
- Now we have a problem: the copy of the record on that node is out of date!
- The user performs another action that changes their record
- This is saved in the cache and the database, but since it was based on an out-of-date record it stomps on the previous change and effectively reverts it
We have now lost data because the database record was overwritten with partially out-of-date information.
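To make the sequence concrete, here is a minimal sketch that reproduces it in a toy model. It is not my real PHP code: plain Python dicts stand in for the database and the cache nodes, and every name in it (`Pool`, `load_user`, `save_user`, `run_scenario`, the `level` and `gold` fields) is made up for illustration. It assumes the node only becomes unreachable and keeps its contents, and that the client fails over to the next live node.

```python
# Toy model of the sequence above. Dicts stand in for the database and
# the memcached nodes; all names here are hypothetical.

class Pool:
    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}  # per-node storage
        self.order = list(node_names)                   # fixed node order
        self.down = set()                               # unreachable nodes

    def node_for(self, key):
        # Failover: start at the key's home node, use the first live node.
        start = sum(key.encode()) % len(self.order)
        for i in range(len(self.order)):
            name = self.order[(start + i) % len(self.order)]
            if name not in self.down:
                return name
        raise RuntimeError("no cache nodes available")

    def get(self, key):
        value = self.nodes[self.node_for(key)].get(key)
        return dict(value) if value is not None else None

    def set(self, key, value):
        self.nodes[self.node_for(key)][key] = dict(value)


def load_user(db, pool, key):
    # Read-through: on a cache miss, pull from the DB and cache the result.
    user = pool.get(key)
    if user is None:
        user = dict(db[key])
        pool.set(key, user)
    return user


def save_user(db, pool, key, user):
    # Write the whole record to both the cache and the database.
    pool.set(key, user)
    db[key] = dict(user)


def run_scenario():
    key = "user:1"
    db = {key: {"level": 1, "gold": 100}}
    pool = Pool(["a", "b"])
    home = pool.node_for(key)

    load_user(db, pool, key)           # user logs on, cached on home node
    pool.down.add(home)                # home node goes down
    user = load_user(db, pool, key)    # miss, re-cached on the other node
    user["level"] = 2                  # first change (leveling up)
    save_user(db, pool, key, user)
    pool.down.discard(home)            # home node comes back, still holding old data
    user = load_user(db, pool, key)    # stale read: level is 1 again
    user["gold"] = 50                  # second change
    save_user(db, pool, key, user)     # whole stale record overwrites the DB
    return db[key]
```

In this model `run_scenario()` returns `{"level": 1, "gold": 50}`: the second change is saved, but the level-up from the first change has been reverted in the database.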
How can I prevent this using PHP5 and libmemcached with persistent connections? I think what I want is for a cache node to not fail over at all; reads and writes to that node should simply fail, but the node should stay in the pool so that I don't end up with duplicate records on other nodes.
While a node is down, this will send roughly 1/n of my cache lookups (where n is the total number of cache nodes) to the database instead, but that's better than ending up with inconsistent data.
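The 1/n estimate can be checked with a rough sketch of the behavior I'm describing. Again this is a toy model and not libmemcached: `FixedPool` and `miss_fraction` are hypothetical names, and the CRC32-modulo key mapping is just an assumption chosen for the example. Each key always maps to the same node, and a node that is down simply fails reads and writes without being removed from the pool.

```python
import zlib


class FixedPool:
    """Each key always maps to the same node; a down node is never replaced."""

    def __init__(self, node_count):
        self.nodes = [{} for _ in range(node_count)]
        self.down = set()  # indexes of unreachable nodes

    def index_for(self, key):
        # Assumed mapping for this sketch: CRC32 of the key modulo node count.
        return zlib.crc32(key.encode()) % len(self.nodes)

    def get(self, key):
        i = self.index_for(key)
        if i in self.down:
            return None  # treated as a miss, no failover to another node
        return self.nodes[i].get(key)

    def set(self, key, value):
        i = self.index_for(key)
        if i in self.down:
            return False  # write fails, nothing is stored elsewhere
        self.nodes[i][key] = value
        return True


def miss_fraction(node_count, down_index, key_count=10000):
    # Fill the cache, take one node down, and measure how many lookups miss.
    pool = FixedPool(node_count)
    keys = ["user:%d" % i for i in range(key_count)]
    for key in keys:
        pool.set(key, key)
    pool.down.add(down_index)
    misses = sum(1 for key in keys if pool.get(key) is None)
    return misses / key_count
```

If the hash spreads keys evenly, `miss_fraction(n, i)` should come out close to 1/n for any single node `i`, and those misses are the lookups that would fall through to the database.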
Unfortunately I'm having trouble understanding what settings I should change to get this behavior.