
Most discussions of two-phase commit only talk about how to commit, not about how data is updated, deleted, or inserted before the commit. Do those operations also involve two phases? Do they require the coordinator to log anything? What happens if they fail before the two-phase commit even starts? Below are my guesses. Are they correct?

  1. An update operation is issued by the client and sent to the coordinator. The coordinator then dispatches it to the relevant partition nodes for execution and waits for their responses. Once the responses are received, the coordinator is ready to accept the next operation from the client. There is only one phase for this operation.

  2. The partition nodes perform the operation in memory and then log it, because they might later need to undo or redo it in case of failure. The coordinator does not need to log it at all, since it does not manage the data and cannot directly undo or redo anything.

  3. If a partition node fails to execute the operation, it notifies the coordinator. The coordinator then writes an abort record to its log and asks all the partition nodes involved to abort the transaction. Those partition nodes then roll back the transaction according to their logs.

  4. If all operations go well, we enter two-phase commit. The coordinator sends a prepare message to the partition nodes. The partition nodes normally agree (vote yes), log ready, and acknowledge the coordinator, and then the coordinator commits. A partition node might disagree (vote no) because it is running out of resources, was aborted by another transaction, or fails validation (in the case of a validation-based protocol or snapshot isolation).
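
To make the four guesses concrete, here is a minimal single-process Python sketch of the flow as I imagine it. It is not taken from any real system: `Partition`, `run_transaction`, `fail_on_key`, and `vote_no` are hypothetical names I made up, network messages are plain method calls, and the logs are in-memory lists rather than durable storage.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Partition:
    """Hypothetical partition node with its data and a local log."""
    name: str
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)
    fail_on_key: Optional[str] = None  # assumption: simulates an execution failure
    vote_no: bool = False              # assumption: simulates a "no" vote

    def execute(self, txid, key, value):
        if key == self.fail_on_key:
            return False
        # Guess 2: record the old value so the write can be undone later.
        # In this sketch an old value of None means "key was absent".
        self.log.append(("update", txid, key, self.data.get(key)))
        self.data[key] = value
        return True

    def prepare(self, txid):
        if self.vote_no:
            return False
        self.log.append(("ready", txid))
        return True

    def commit(self, txid):
        self.log.append(("commit", txid))

    def abort(self, txid):
        # Undo this transaction's updates in reverse order.
        for entry in reversed(self.log):
            if entry[0] == "update" and entry[1] == txid:
                _, _, key, old = entry
                if old is None:
                    self.data.pop(key, None)
                else:
                    self.data[key] = old
        self.log.append(("abort", txid))


def run_transaction(coordinator_log, txid, ops):
    """ops is a list of (partition, key, value) tuples."""
    participants = []
    # Guess 1: each operation is a single round trip, with no coordinator logging.
    for partition, key, value in ops:
        if partition not in participants:
            participants.append(partition)
        if not partition.execute(txid, key, value):
            # Guess 3: failure before 2PC starts leads to a logged abort.
            coordinator_log.append(("abort", txid))
            for p in participants:
                p.abort(txid)
            return "aborted"
    # Guess 4: phase 1 collects votes, phase 2 broadcasts the decision.
    votes = [p.prepare(txid) for p in participants]
    decision = "commit" if all(votes) else "abort"
    coordinator_log.append((decision, txid))
    for p in participants:
        p.commit(txid) if decision == "commit" else p.abort(txid)
    return "committed" if decision == "commit" else "aborted"
```

For example, calling `run_transaction([], 1, [(Partition("a"), "x", 1), (Partition("b", fail_on_key="y"), "y", 2)])` aborts during the execution phase and undoes the write on partition `a` without ever sending a prepare message. Whether real systems behave this way is exactly what I am asking.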

I tried searching the internet and asking ChatGPT, but could not find a convincing answer. I guess reading implementation code would help, but I don't have time to do that.
