25

I am trying to understand how three-phase commit avoids blocking.

Consider the following two failure scenarios:

Scenario 1: In phase 2 the coordinator sends preCommit messages to all cohorts and has gotten an ack from all except cohort A. Network problems prevent cohort A from receiving the coordinator's preCommit message. Cohort A times out waiting for the preCommit message and chooses to abort. Then both the coordinator and cohort A crash.

Scenario 2: The protocol reaches phase 3. The coordinator sends a doCommit message to cohort A. But before it can send more doCommit messages the coordinator crashes. Cohort A commits its part of the transaction then crashes.

As far as I can tell the remaining cohorts have the exact same state at the end of scenario 1 and scenario 2. So when a recovery coordinator steps in how can it find out from the remaining cohorts whether we are in scenario 1 and abort or we are in scenario 2 and commit and thus avoid blocking?

user782220
  • This question appears to be off-topic as it is probably better on programmer's stack exchange. – Abizern Jan 29 '14 at 07:49
  • @Abizern the question is a practical, answerable question that is unique to software development. –  Feb 06 '14 at 16:09

4 Answers

12

Three-phase commit isn't magic; it's just more resilient than two-phase commit. In particular, 3PC is resilient against single-point failure, but not all kinds of multiple-point failure. Both scenarios in the question posit two-point failures. In other words, the premise of the question is misguided; it's asking more of 3PC than it's capable of.
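To make the two-point-failure argument concrete, here is a minimal Python sketch. The cohort names B and C, the state labels, and the `recover` rule are illustrative assumptions (the rule is one commonly described termination rule, not something given in the question). The only point is that a recovery coordinator can decide from the survivors' states alone, and those are identical in both scenarios.

```python
# Hypothetical final states, following the two scenarios in the question.
# Cohorts "B" and "C" are stand-ins for "the remaining cohorts".
SCENARIO_1 = {"A": "ABORTED", "B": "PRECOMMIT", "C": "PRECOMMIT"}
SCENARIO_2 = {"A": "COMMITTED", "B": "PRECOMMIT", "C": "PRECOMMIT"}
CRASHED = {"A"}  # the coordinator is down too, so only B and C can be asked


def surviving_states(states, crashed):
    """Return the states the recovery coordinator can actually observe."""
    return sorted(state for name, state in states.items() if name not in crashed)


def recover(live_states):
    """Assumed termination rule: decide from the live cohorts only."""
    if "COMMITTED" in live_states:
        return "COMMIT"
    if "ABORTED" in live_states:
        return "ABORT"
    if "PRECOMMIT" in live_states:
        return "COMMIT"  # somebody saw preCommit, so every vote was yes
    return "ABORT"       # nobody got past the voting phase


view_1 = surviving_states(SCENARIO_1, CRASHED)
view_2 = surviving_states(SCENARIO_2, CRASHED)
decision_1 = recover(view_1)
decision_2 = recover(view_2)
```

Since `view_1 == view_2`, any deterministic rule returns the same decision for both scenarios, and that decision must disagree with what A did in one of them. With the rule assumed here, the survivors commit in scenario 1 even though A aborted.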

For further reading, here's a sentence from the abstract of a paper on the subject, Analysis and Verification of Two-Phase Commit & Three-Phase Commit Protocols, by Muhammad Atif, to whet your appetite:

We also apply our method to its “amended” variant, the Three-Phase Commit Protocol (3PC) and prove it to be erroneous for simultaneous site failures

I found this paper to provide a foothold into the literature. There's no small amount of it on this subject, if you want to delve in.

Walter Tross
eh9
  • @eh9: I looked at the paper and other resources, and it is still not fully clear to me why 2PC blocks when the coordinator fails. Is it because the cohorts in 2PC don't use timeouts? – KGhatak May 31 '17 at 20:14
  • The block can happen when the coordinator node is also a voter. Let's say that the first phase has completed: all voters have sent their votes to the coordinator, and the non-coordinator nodes have voted to commit. Now imagine what happens when the coordinator node fail-stops (crashes). A non-blocking consensus algorithm has to proceed without this node, but it can't, because it can't distinguish the following situations: the coordinator-voter has committed (and therefore everyone should commit), or the coordinator-voter has rolled back (and therefore everyone should roll back). – gregorias Oct 23 '21 at 08:30
9

In two-phase commit, the coordinator sends a prepare message to all participants (nodes) and waits for their answers. The coordinator then sends the outcome of those answers (commit or abort) to all sites. Every participant waits for this message from the coordinator before committing or aborting the transaction.

The two-phase commit protocol has a limitation: it is a blocking protocol. For example, participants hold resources while waiting for a message from the coordinator. If that message never arrives for any reason, the participant will continue to wait and may never resolve its transaction, so the resources could be blocked indefinitely. Likewise, the coordinator holds resources while waiting for replies from participants, and it can also block indefinitely if no acknowledgement is received from a participant.

The three-phase protocol, however, introduces a third phase called the pre-commit. The aim of this is to 'remove the uncertainty period for participants that have committed and are waiting for the global abort or commit message from the coordinator'.

When receiving a pre-commit message, participants know that all others have voted to commit.
If a pre-commit message has not been received, the participant will abort and release any blocked resources.
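As a rough illustration of the timeout behaviour described above, here is a toy participant state machine in Python. The class, method, and state names are made up for the sketch, and networking and durable logging are left out. The commit-on-timeout branch for the PRECOMMIT state is an assumption taken from common textbook descriptions of 3PC; it is not stated in this answer.

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()
    WAITING = auto()    # voted yes, waiting for preCommit
    PRECOMMIT = auto()  # got preCommit, waiting for doCommit
    COMMITTED = auto()
    ABORTED = auto()


class Participant:
    """Toy 3PC participant; message delivery and logging are omitted."""

    def __init__(self):
        self.state = State.INIT

    def on_can_commit(self, vote_yes):
        # Phase 1: vote, and start holding resources on a yes vote.
        if self.state is State.INIT:
            self.state = State.WAITING if vote_yes else State.ABORTED
        return self.state is State.WAITING

    def on_pre_commit(self):
        # Phase 2: this message means every participant voted yes.
        if self.state is State.WAITING:
            self.state = State.PRECOMMIT

    def on_do_commit(self):
        # Phase 3: the final commit.
        if self.state is State.PRECOMMIT:
            self.state = State.COMMITTED

    def on_timeout(self):
        if self.state is State.WAITING:
            # No preCommit arrived: abort and release resources.
            self.state = State.ABORTED
        elif self.state is State.PRECOMMIT:
            # Assumption: a participant that saw preCommit commits on timeout.
            self.state = State.COMMITTED
```

A participant that voted yes and then hears nothing ends up in `State.ABORTED` after `on_timeout()`, so it never holds its resources forever, which is the non-blocking behaviour this answer is describing.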
apomene
6

The thing that helped me understand the non-blocking property was to realize that after the first round of messages both protocols are in essentially the same state; all participants have agreed that they can commit and are waiting for confirmation to do so.

Now, consider what a participant knows after it replied "ok to commit" in the first round.

  • 2PC: another participant may have already received a message to commit and committed or, alternatively, no participants may have done any commits. So if the participant doesn't hear anything it has no idea what the group decision was.
  • 3PC: the participant can be sure that no other participant has performed any commit actions. It's still safe to roll back the entire operation in the case of coordinator failure.

Moving on, after the second round of messages and confirmations in 3PC we are guaranteed that all participants know that the group decision is to commit.

This means that there is never a time in 3PC when a participant does a commit action that another participant is not anticipating.
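Here is a small Python sketch of that last claim, under simplifying assumptions: a failure-free run, every participant has already voted yes, and the coordinator sends each round of messages one participant at a time. The function names and state labels are invented for the illustration. It records every intermediate global state and checks whether some participant has committed while another is still waiting with no idea of the decision.

```python
def run_2pc(n):
    """Global states of a failure-free 2PC run after all n participants voted yes."""
    states = ["WAITING"] * n
    snapshots = [tuple(states)]
    for i in range(n):  # commit messages go out one participant at a time
        states[i] = "COMMITTED"
        snapshots.append(tuple(states))
    return snapshots


def run_3pc(n):
    """Same run for 3PC: a full preCommit round happens before any doCommit."""
    states = ["WAITING"] * n
    snapshots = [tuple(states)]
    for i in range(n):  # round 2: preCommit to everyone, acks collected
        states[i] = "PRECOMMIT"
        snapshots.append(tuple(states))
    for i in range(n):  # round 3: doCommit only after all acks are in
        states[i] = "COMMITTED"
        snapshots.append(tuple(states))
    return snapshots


def has_surprise_commit(snapshots):
    """True if some participant committed while another was still uncertain."""
    return any("COMMITTED" in s and "WAITING" in s for s in snapshots)
```

With three participants, `has_surprise_commit(run_2pc(3))` is `True` and `has_surprise_commit(run_3pc(3))` is `False`: in the 3PC run, a commit only ever coexists with PRECOMMIT or COMMITTED neighbours.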

mrmcgreg
5

In scenario 1:

During recovery: all cohorts except A will be in the PRECOMMIT state. This tells the recovery node that all cohorts had voted to commit and moved forward, so A and the coordinator should be in the PRECOMMIT state. Since this is a non-final state, the transaction is aborted.

In scenario 2:

During recovery: all cohorts except A will be in the PRECOMMIT state. This tells the recovery node that all cohorts had voted to commit and moved ahead. But since A received the doCommit message, it is in the COMMITTED state. Had A not crashed, the recovery node would ask all cohorts to commit, since at least one cohort has committed. Because A did crash, the recovery node sees no live cohort in the COMMITTED state and deduces that no cohort got the doCommit message. Hence the transaction will be aborted and all cohorts will be asked to release resources.

When A returns from the crash and begins recovery, it will find that all the other cohorts aborted the transaction, and it too will abort the transaction.

state chart
(source: swturner at regal.csep.umflint.edu)

Glorfindel
  • I don't believe the description for scenario 2 is accurate. The coordinator should have logged persistently that it sent the doCommit message to cohort A. Generally, it is not possible to go from PRECOMMIT to ABORT state (see diagrams). If cohort A recovered after committing and noticed that all others have aborted, the system would be in an inconsistent state because cohort A cannot undo the commit. – Daniel Hoop Jul 01 '21 at 08:23
  • Agree with @DanielHoop. "If a participant fails to receive this message, it commits anyway since it knows from phase 2 that there was a unanimous decision to commit." The other cohorts will COMMIT even if they don't receive the message from the coordinator. – Haoyuan Ge Jun 26 '23 at 09:41