4

How do you get high-frequency, consistent read/writes on a distributed system? Generally not sure how to conceptualize consistency on large scale systems.

Use case: Prevent a user from performing the same action within a specified time period. In the abuse case, this may be a high frequency operation.

Extended questions: How would I scale this operation up? How do systems like Firestore provide high-availability while also providing consistency? What do Firestore quotas (such as 1 document write per second) tell us about how they may have built their system?

Thanks

fionbio
  • Why do you want to "prevent a user from performing the same action within a specified time period"? Are you afraid of costs or performance? – Alex Mamo Aug 07 '19 at 09:26
  • Afraid of abuse: Say I'm handing out inventory that is expected to go quickly. The abusing user could grab inventory they should not be permitted to have outside the time duration, which lowers the quality of my service (and value of my business) for remaining users who would not have access. – fionbio Aug 07 '19 at 11:14

2 Answers

3

GCP's Firestore uses the same technology as Cloud Spanner to ensure consistency at scale. To learn more about Cloud Spanner and its CAP implications, take a look at the introduction here:

In terms of CAP, Spanner claims to be both consistent and highly available despite operating over a wide area [...]. Does this mean that Spanner is a CA system as defined by CAP? [...] The purist answer is “no” because partitions can happen and in fact have happened at Google, and during some partitions, Spanner chooses C and forfeits A. It is technically a CP system. However, no system provides 100% availability, so the pragmatic question is whether or not Spanner delivers availability that is so high that most users don't worry about its outages.

Hence, while technically a CP system, Cloud Spanner (and thus also Firestore) is effectively CA, as its "5 or more nines" availability guarantee is high enough for most users to ignore outages.

How do systems like Firestore provide high-availability while also providing consistency?

First, Google runs its own private global network for services like these, which lets it provide much stronger guarantees than systems that rely on the public internet.

Second, these systems use synchronized clocks to ensure consistency. In Google's case that boils down to TrueTime, a globally synchronized, GPS- and atomic-clock-based clock that provides strong time semantics (bounded uncertainty of 7 ms) even for transactions happening on opposite sides of the globe. Clocks make it possible to replace communication with local computation: instead of node N asking another node M whether some property holds, it can deduce the answer based on some information about M from the past together with the current time on N's clock (Liskov91).

Cloud Spanner depends on TrueTime to generate monotonically increasing timestamps. Cloud Spanner uses these timestamps in two ways. First, it uses them as proper timestamps for write transactions without the need for global communication. Second, it uses them as timestamps for strong reads, which enables strong reads to execute in one round of communication, even strong reads that span multiple servers. source
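
To make the commit-wait idea concrete, here is a minimal sketch, not Spanner's actual code: a node with a bounded-uncertainty clock assigns a commit timestamp and then simply waits out the uncertainty before acknowledging, so no other node can later observe a smaller timestamp. `BoundedClock` and `CommitWrite` are hypothetical stand-ins for TrueTime's interval API.

```go
package main

import (
	"fmt"
	"time"
)

// BoundedClock is a hypothetical stand-in for TrueTime: it returns the
// current time together with a bound on how far off it may be.
type BoundedClock struct {
	Uncertainty time.Duration // e.g. ~7ms for TrueTime
}

func (c BoundedClock) Now() (earliest, latest time.Time) {
	now := time.Now()
	return now.Add(-c.Uncertainty), now.Add(c.Uncertainty)
}

// CommitWrite assigns a commit timestamp and waits out the clock
// uncertainty before acknowledging, so that by the time the client sees
// the ack, every correct clock in the system has passed the timestamp.
func CommitWrite(clock BoundedClock) time.Time {
	_, latest := clock.Now()
	commitTS := latest

	// Commit-wait: block until the commit timestamp is definitely in the
	// past on every node's clock.
	for {
		earliest, _ := clock.Now()
		if earliest.After(commitTS) {
			break
		}
		time.Sleep(time.Millisecond)
	}
	return commitTS
}

func main() {
	clock := BoundedClock{Uncertainty: 7 * time.Millisecond}
	ts := CommitWrite(clock)
	fmt.Println("write visible at", ts)
}
```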

For further theory on how clocks help distributed system design, see Liskov's paper here. For more on Cloud Spanner, I highly recommend these summaries on the original Spanner paper as well as the follow-up paper.

Update: The good news is that you don't need atomic clocks, GPS and private global networks to ensure consistency and high availability. The open-source Spanner-inspired CockroachDB achieves much the same as its predecessor, although in lieu of TrueTime's strong time certainty it has to rely on more coarse-grained and less efficient synchronization as outlined in this fantastic comparison:

A simple statement of the contrast between Spanner and CockroachDB would be: Spanner always waits on writes for a short interval, whereas CockroachDB sometimes waits on reads for a longer interval.
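
To illustrate the "waits on reads" side, here is a hedged sketch of the uncertainty-restart pattern, not CockroachDB's actual implementation: a read picks a timestamp, and if it encounters a committed value whose timestamp falls inside the read's uncertainty window, it restarts at a higher timestamp rather than risk missing a causally earlier write. `readWithUncertainty` and `maxOffset` are illustrative names (the latter is a rough analogue of a configured clock-skew bound).

```go
package main

import (
	"fmt"
	"time"
)

// version is a (timestamp, value) pair, as in an MVCC store.
type version struct {
	ts    time.Time
	value string
}

// readWithUncertainty sketches the restart pattern: maxOffset is the
// assumed bound on clock skew between nodes. versions must be sorted by
// timestamp ascending.
func readWithUncertainty(versions []version, readTS time.Time, maxOffset time.Duration) (string, time.Time) {
	uncertaintyLimit := readTS.Add(maxOffset)
	var result string
	for _, v := range versions {
		switch {
		case !v.ts.After(readTS):
			// Definitely in the past: safe to read.
			result = v.value
		case v.ts.Before(uncertaintyLimit):
			// Inside the uncertainty window: this write may have happened
			// before our read in real time, so restart the read at a later
			// timestamp instead of silently ignoring it.
			return readWithUncertainty(versions, v.ts, maxOffset)
		}
	}
	return result, readTS
}

func main() {
	now := time.Now()
	versions := []version{
		{ts: now.Add(-10 * time.Millisecond), value: "old"},
		{ts: now.Add(2 * time.Millisecond), value: "new"}, // inside uncertainty window
	}
	val, ts := readWithUncertainty(versions, now, 5*time.Millisecond)
	fmt.Println("read", val, "at", ts)
}
```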

Aleksi
  • Thanks for connecting the dots on Firestore and Spanner. How would you solve the "Afraid of abuse" case, written in the OP-comments? – fionbio Aug 09 '19 at 18:36
  • So after the user has performed a purchase, we want to prevent the same user from purchasing again for a specific time period? If that's the case, I would set a user-specific expiry timestamp inside the purchase transaction: if a user has managed to make a purchase, there is also an expiry period preventing any subsequent purchases (a rough sketch of this appears after this comment thread). For more, check out this question regarding [inventory release](https://stackoverflow.com/questions/5120976/best-practice-for-releasing-inventory-in-a-database). – Aleksi Aug 10 '19 at 07:47
  • What I don’t understand from your response is solving the consistency problem for a distributed system. I can mark inventory with a consistent count, but what about when distributed and consistency is lost? What options exist other than Spanner? – fionbio Aug 12 '19 at 11:20
  • I'm having trouble grasping the exact problem. Maybe try spelling out a step-by-step explanation of the scenario? – Aleksi Aug 12 '19 at 11:59
  • Say I have a counter (like inventory), and it may decrement rapidly (like flash sale or ad-server). What are the options to handle this problem outside spanner? – fionbio Aug 14 '19 at 07:01
  • Well, outside systems like Spanner I'd imagine you'd have to compromise either on availability or consistency. Compromising availability means that requests to your service may time out while waiting for the guaranteed, up-to-date inventory state, and you end up losing sales. Compromising consistency means you optimistically release your inventory (i.e., sell the same item to more than one customer) and end up having to perform compensating actions, for example cancelling already-made sales. Both alternatives have seemingly negative effects on the end-user experience; the trade-off is up to you. – Aleksi Aug 14 '19 at 16:18
  • Thanks for working through this with me. I understand the CAP tradeoff, but am not clear how high availability at scale happens in practice. For the case of a highly-available, high-frequency read/write counter, I think: writing + receipt to multiple machines at once (say Cassandra), getting a single machine to handle atomic writes for an allocated key-range (incomplete understanding), or something else. – fionbio Aug 14 '19 at 20:52
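
A minimal sketch of the expiry-inside-the-transaction idea from the comments above, assuming a generic transactional store rather than the actual Firestore API: the purchase, the inventory decrement, and the "next allowed at" timestamp are all written atomically, and the operation fails if the user is still inside the cooldown window. `Store` and `Purchase` are hypothetical names; a toy in-memory mutex stands in for a server-side transaction.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var ErrRateLimited = errors.New("user is still inside the cooldown window")

// Store is a toy in-memory stand-in for a strongly consistent,
// transactional store (Firestore, Spanner, etc.).
type Store struct {
	mu            sync.Mutex
	nextAllowedAt map[string]time.Time
	inventory     int
}

// Purchase checks the per-user cooldown and decrements inventory in a
// single critical section, which is what a server-side transaction gives
// you in a real distributed store.
func (s *Store) Purchase(user string, cooldown time.Duration) error {
	s.mu.Lock()
	defer s.mu.Unlock()

	now := time.Now()
	if t, ok := s.nextAllowedAt[user]; ok && now.Before(t) {
		return ErrRateLimited
	}
	if s.inventory <= 0 {
		return errors.New("out of stock")
	}

	// Both writes happen atomically: the purchase and the new expiry.
	s.inventory--
	s.nextAllowedAt[user] = now.Add(cooldown)
	return nil
}

func main() {
	s := &Store{nextAllowedAt: map[string]time.Time{}, inventory: 10}
	fmt.Println(s.Purchase("alice", time.Minute)) // <nil>
	fmt.Println(s.Purchase("alice", time.Minute)) // cooldown error
}
```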
1

How do you get high-frequency, consistent read/writes on a distributed system?

Some constraints stem from unavoidable facts of life:

  • Your workload may exceed the capacity of a single machine, so you split the work up ("sharding")
  • A single machine may die, so writes must reach more than one place before being acknowledged

Most strongly consistent systems are single-leader; remember that Raft and Paxos are single-leader protocols.

What often happens is that your write goes to the leader, the leader synchronously replicates it to one or more replicas, and then responds to you upon success. Where you read from depends on whether you are willing to accept out-of-date data (read from the leader, or read from a replica).
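
Here is a sketch of that write path, assuming a made-up in-process `Replica` interface rather than any particular database (in a real system the replica call would be an RPC to another machine): the leader appends locally, pushes the entry to its replicas, and only acknowledges once enough copies exist.

```go
package main

import (
	"errors"
	"fmt"
)

// Replica is a hypothetical interface; in a real system Append would be an
// RPC to another machine.
type Replica interface {
	Append(entry string) error
}

type memReplica struct{ log []string }

func (r *memReplica) Append(entry string) error {
	r.log = append(r.log, entry)
	return nil
}

// Leader acknowledges a write only after it is durable on itself and on
// enough replicas to survive the loss of a single machine.
type Leader struct {
	log      []string
	replicas []Replica
	quorum   int
}

func (l *Leader) Write(entry string) error {
	l.log = append(l.log, entry)
	acks := 1 // the leader's own copy
	for _, r := range l.replicas {
		if err := r.Append(entry); err == nil {
			acks++
		}
	}
	if acks < l.quorum {
		return errors.New("not enough replicas acknowledged the write")
	}
	return nil
}

func main() {
	l := &Leader{replicas: []Replica{&memReplica{}, &memReplica{}}, quorum: 2}
	fmt.Println(l.Write("user:123 action at t=42")) // <nil>
}
```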

Some examples of single-leader systems (or systems that are single-leader within a key-range "shard"):

  • BigTable/HBase
  • DynamoDB (Paxos)
  • Datomic
  • Kafka
  • MySQL/Postgres
  • Cassandra LWT (Paxos)
  • Yugabyte (Raft)
  • Manhattan/MemoryDB (Ordered distributed logs (single machine))
  • TiDB (Raft)

The answer, then, is to split up the work and send each request to the single leader for its shard. Failover (read: "leader election") is pretty fast in the common case.

How do systems like Firestore provide high-availability while also providing consistency?

Single leader with sharding (automatically sharded on top of BigTable or Colossus).
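
As a rough illustration of "single leader with sharding" (not how Firestore is actually implemented), here is a sketch where the key space is split into ranges, each range has its own leader, and a request is routed to the leader that owns its key. Write throughput scales with the number of shards while each individual key still has exactly one writer. `shard` and `router` are hypothetical names.

```go
package main

import "fmt"

// shard owns a contiguous key range and has exactly one leader at a time.
type shard struct {
	startKey string // inclusive lower bound of the range
	leader   string // address of the current leader, e.g. elected via Raft
}

// router maps a key to the leader of the shard that owns it. Shards must
// be sorted by startKey.
func router(shards []shard, key string) string {
	owner := shards[0]
	for _, s := range shards {
		if key >= s.startKey {
			owner = s
		}
	}
	return owner.leader
}

func main() {
	shards := []shard{
		{startKey: "", leader: "node-a:7001"},
		{startKey: "m", leader: "node-b:7001"},
		{startKey: "t", leader: "node-c:7001"},
	}
	// All writes for a given key go through one leader, but different keys
	// spread across different leaders.
	fmt.Println(router(shards, "alice")) // node-a:7001
	fmt.Println(router(shards, "trudy")) // node-c:7001
}
```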

There is a lot more to say here. Perhaps I'll expand this answer later.

fionbio