
I'm encountering a problem where data in my database is getting reverted to an old state. I think I have narrowed the problem down to this situation.

Imagine a sequence of two updates occurring like this:

  • All cache nodes are working
  • A user logs on (their data is pulled from the DB and stored in memcached)
  • A cache node goes down
  • The user continues to browse (and since their data cannot be found in the cache it is pulled from the DB and stored in memcached)
  • The user performs some action that transforms their record [eg leveling up] (their record is updated in the cache and the database)
  • The cache node comes back up
  • We pull the user's data from the cache again and it comes from the original cache node that was previously down
  • Now we have a problem: the record on that cache node is out of date!
  • The user performs another action that transforms their record
  • This is saved in the cache and the database, but since it was based on an out-of-date record it stomps on the previous change and effectively reverts it

We have now lost data because the database record was overwritten with partially out-of-date information.

How can I prevent this using PHP5 and libmemcached with persistent connections? I think what I want is for the client not to fail over at all: reads and writes to a down node should simply fail, and the node should not be removed from the pool, so that I don't end up with the same record living on two different nodes.

This will increase load on my database by 1/n (where n is the total number of cache nodes) when a node goes down, but that's better than ending up with inconsistent data.

Unfortunately I'm having trouble understanding what settings I should change to get this behavior.
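For reference, a minimal sketch of the client setup and the options that look relevant (the option names below are from the php-memcached extension; the host names are made up, and whether this combination actually stops nodes from being ejected is exactly what I'm unsure about):

    <?php
    // php-memcached on top of libmemcached, with a persistent connection
    // identified by 'user_cache'.
    $mc = new Memcached('user_cache');

    if (!count($mc->getServerList())) {
        $mc->addServers(array(
            array('cache1.example.internal', 11211), // hypothetical hosts
            array('cache2.example.internal', 11211),
            array('cache3.example.internal', 11211),
        ));
    }

    // Consistent hashing so keys map to stable nodes.
    $mc->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);

    // Options that appear to control ejection/failover behaviour:
    $mc->setOption(Memcached::OPT_AUTO_EJECT_HOSTS, false);      // don't eject failed hosts
    $mc->setOption(Memcached::OPT_REMOVE_FAILED_SERVERS, false); // don't drop them from the server list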

Brad Dwyer
  • A cache node that "comes back up", i.e. starts again, must be considered invalid and must be emptied – bebbo Jan 15 '16 at 19:39
  • @bebbo Is there a setting for this in memcached? I'm not sure the cache node knows it has become inaccessible (possibly to only a subset of servers) – Brad Dwyer Jan 18 '16 at 15:11
  • I am not aware of such an option. My approach is to keep only non-transactional data in the cache, to be on the safe side. – bebbo Jan 19 '16 at 13:50

2 Answers


I like the versioning and optimistic locking approach implemented in Doctrine ORM. You can do the same. It won't increase load on your database, but it will require some refactoring.

Basically, you add a version number to all tables you are caching, change your update queries to increment the version (version = version + 1), and add a WHERE version = $version condition (note that $version comes from your PHP/memcached copy of the record). You then need to check the number of affected rows and throw an exception if it is 0.
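A minimal sketch of what that update might look like with PDO (the users table, level column and function name are illustrative, not from the question):

    <?php
    // Optimistic-lock update: it only succeeds if nobody else has bumped the
    // version since this record was read from the cache.
    // $pdo is an existing PDO connection.
    function updateUserLevel(PDO $pdo, $userId, $newLevel, $cachedVersion)
    {
        $stmt = $pdo->prepare(
            'UPDATE users
                SET level = :level, version = version + 1
              WHERE id = :id AND version = :version'
        );
        $stmt->execute(array(
            ':level'   => $newLevel,
            ':id'      => $userId,
            ':version' => $cachedVersion, // version taken from the cached record
        ));

        if ($stmt->rowCount() === 0) {
            // Someone else updated the row since our cached copy was read.
            throw new RuntimeException('Stale version for user ' . $userId);
        }
    }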

It is up to you how to handle such an exception. You can simply invalidate the cache for this record and ask the user to re-submit the form, or you can try to merge the changes. At this point you have the stale data from the cache, the update from the user input, and the fresh data from the DB, so the only unrecoverable case is when you have three different values for the same column.
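The simplest handling, sketched under the same illustrative assumptions ($memcached is your client and the key scheme is made up):

    <?php
    // Simplest recovery: drop the stale cache entry and ask the user to retry.
    try {
        updateUserLevel($pdo, $userId, $newLevel, $cachedVersion);
    } catch (RuntimeException $e) {
        $memcached->delete('user_' . $userId); // next read repopulates from the DB
        // Re-render the form with a "please try again" message, or attempt a
        // column-by-column merge of the cached, submitted and DB values.
    }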

Alex Blex
  • This seems like it should work in theory. Thanks for the suggestion. Unfortunately it does add a lot of complexity – Brad Dwyer Jan 18 '16 at 15:13

You are making the problem more complex. A simpler approach is to just mark the cache dirty and rebuild it, rather than putting it back in service with inconsistent data on it.

Allen
  • That doesn't fix the problem; you can't mark it dirty if it is inaccessible. When it comes back up it will still look clean – Brad Dwyer Jan 18 '16 at 15:10
  • @BradDwyer Once a client cannot hit the cache node, mark it in a commonly accessible location, like a database table with a timestamp. Then, after the cache node comes back online, some kind of heartbeat script should check against that database table to decide whether the node should flush its cache or not (roughly as sketched below). – Allen Jan 18 '16 at 15:48
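A rough sketch of that idea (the dirty_cache_nodes table, the MySQL-flavoured SQL, and the assumption that $mc is a Memcached client connected to this one node only are all illustrative, not built-in behaviour):

    <?php
    // When a client fails to reach a cache node, record the node as dirty
    // in a shared database table.
    function markNodeDirty(PDO $pdo, $host)
    {
        $stmt = $pdo->prepare(
            'REPLACE INTO dirty_cache_nodes (host, marked_at) VALUES (:host, NOW())'
        );
        $stmt->execute(array(':host' => $host));
    }

    // Heartbeat script for a node coming back online: if the node was marked
    // dirty while it was unreachable, flush it before it serves traffic again.
    function heartbeat(PDO $pdo, Memcached $mc, $host)
    {
        $stmt = $pdo->prepare('SELECT 1 FROM dirty_cache_nodes WHERE host = :host');
        $stmt->execute(array(':host' => $host));
        if ($stmt->fetchColumn()) {
            $mc->flush(); // $mc talks only to this node, so only its data is wiped
            $pdo->prepare('DELETE FROM dirty_cache_nodes WHERE host = :host')
                ->execute(array(':host' => $host));
        }
    }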