31

Phase 2. (a) If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals.

As mentioned in the paper,

A proposer issues a proposal by sending, to some set of acceptors, a request that the proposal be accepted. (This need not be the same set of acceptors that responded to the initial requests.)"

But as my understanding, if we change Phase 2. (a) to:

If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to an arbitrary set of majority acceptors for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals.

the algorithm will fail, following is an example. Consider that there are totally 3 acceptors ABC. We will use X(n:v,m) to denote the status of acceptor X: proposal n:v is the largest numbered proposal accepted by X where n is the proposal number and v is the value of the proposal, and m is the number of the largest numbered prepare request that X has ever responded.

  1. P1 sends 'prepare 1' to AB
  2. Both AB respond P1 with a promise to not to accept any request numbered smaller than 1. Now the status is: A(-:-,1) B(-:-,1) C(-:-,-)
  3. P1 receives the responses, then gets stuck and runs very slowly
  4. P2 sends 'prepare 100' to AB
  5. Both AB respond P2 with a promise to not to accept any request numbered smaller than 100. Now the status is: A(-:-,100) B(-:-,100) C(-:-,-)
  6. P2 receives the responses, chooses a value b and sends 'accept 100:b' to BC
  7. BC receive and accept the accept request, the status is: A(-:-,100) B(100:b,100) C(100:b,-). Note that proposal 100:b has been chosen.
  8. P1 resumes, chooses value a and sends 'accept 1:a' to BC
  9. B doesn't accept it, but C accepts it because C has never promise anything. Status is: A(-:-,100) B(100:b,100) C(1:a,-). The chosen proposal is abandon, Paxos fails.

Did I miss anything here? Thanks.

lambda
  • 585
  • 6
  • 10
  • 1
    Bravo! I'd give this question multiple up-votes if I could. – Michael Deardeuff May 28 '15 at 05:14
  • yes you are indeed correct and yes it is a bug in the multipaxos implementation I am working on. thanks! – simbo1905 Jun 14 '15 at 08:14
  • Your description is *almost* correct but it's based on an all-too-easy misunderstanding of the uniqueness property of proposal ids/round-numbers. Each proposal id/round number must be unique. Re-use is not allowed or the exact condition you describe is possible. See my answer for further details. – Rakis Apr 24 '16 at 05:50
  • @rakis - I disagree about the claim that proposal numbers have to be unique across proposers. If they are guaranteed to be unique, it ensures liveness of the paxos algorithm. But its not necessary that they are unique. The paxos maintains its safety property when they aren't unique. But it doesn't guarantee liveness when proposal numbers are not unique. In a dynamic environment, where proposers come and go, it's not possible to statically assign unique proposal numbers to proposers. In such an environment, assigning these unique numbers itself becomes a problem of consensus. Also, assigning un – seattlesparty Mar 10 '17 at 02:04
  • 9. C will not accept it, because n=1 is less than n=100. both accepted number and promised number are considered. only accept if – ideawu Apr 20 '20 at 09:53
  • @seattlesparty is it always possible to assign unique numbers to proposers (and hence proposals) if you use Paxos itself to manage the configuration of the cluster. Using Paxos itself to manage cluster reconfigurations is mentioned in the original papers. You only have to statically define the proposer numbers for the genius state of the cluster. For a modern take of non-stop cluster reconfigurations check out “UPaxos” – simbo1905 May 24 '20 at 20:25

8 Answers8

20

You missed something in step 7. When C processes accept 100:b it sets its state to C(100:b,100). By accepting a value the node is also promising to not accept earlier values.


Update. I've been thinking about this all month because I knew the above answer was not absolutely correct.

What's more I looked through several proprietary and open-source paxos implementations and they all had the bug submitted by the OP!

So here's the correct answer, when viewed entirely from Paxos Made Simple:

If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals. (emphasis mine)

In other words, the proposer can only send Accept messages to acceptors that it has received Promises from for that ballot number.

So, is this a contradiction in Lamport's paper? Right now, I'm saying yes.


If you look at Lamport's paxos proofs he treats an accept as a promise, just as my original answer suggests. But this is not pointed out in Paxos Made Simple. In fact, it appears Lamport took great pains to specify that an accept was not a promise.

The problem is when you combine the weaker portions of both variants; as the OP did and several implementations do. Then you run into this catastrophic bug.

simbo1905
  • 6,321
  • 5
  • 58
  • 86
Michael Deardeuff
  • 10,386
  • 5
  • 51
  • 74
  • Thanks, you're right, if **by accepting a value the node is also promising to not accept earlier values**, the algorithm is correct, but in the paper Lamport didn't mention this requirement, right? – lambda Apr 29 '15 at 16:55
  • 1
    This answer appears to be mentioned in a lecture by Leslie Lamport in 2019 from this point in the video https://youtu.be/8-Bc5Lqgx_c?t=4398 In the same video a member of the audience asks why in the TLA+ spec has an acceptor make a promise upon receiving a 2a message here at https://youtu.be/8-Bc5Lqgx_c?t=4200 – simbo1905 May 27 '20 at 19:33
  • 2
    @simbo1905 that is, in fact, a reference to an email conversation I had with Lamport in 2015 after writing this answer and then spending a week or more digging through open-source and proprietary implementations of paxos – Michael Deardeuff May 27 '20 at 20:08
7

There is certainly no problem with broadcasting the accept request to all acceptors. You don't need to restrict it to just the ones that replied to the original prepare request. You've found a rare case of bad wording in Dr Lamport's writing.

There is, however, a bug in your counterexample. Firstly, the notation is defined like this:

X(n:v,m) to denote the status of acceptor X: proposal n:v is the largest numbered proposal accepted by X

But then in step 7 node C has state C(100:b,-), and then in step 9 it's changed to state C(1:a,-). This is not a valid transition: after accepting 1:a it should remain in state C(100:b,-) since 100:b is still the largest numbered proposal accepted by C.

Note that it's perfectly fine that it accepts 1:a after 100:b, essentially because the network is asynchronous so all messages can be delayed or reordered without breaking anything, so the rest of the world can't tell which proposal was accepted first anyway.

Dave Turner
  • 1,846
  • 16
  • 27
  • I agree. A new P after step 9 must only "see" `C(100:b,_)`. The accepted answer choses not to accept `1:a` by making a promise to `100` when accepting `100:b`. Implementations can send negative acknowledgments containing the highest highest promises to help speed up recovery. If as per the accepted answer we promise `100` upon accepting `100:b` we will reply negatively to the later `1:a` letting the proposal know that it isn't using a number higher than the highest accepted. To me that makes understanding what happened easier to comprehend in logs. This answer is more precise though. – simbo1905 Dec 13 '16 at 10:52
  • True - it's _probably_ pointless to accept `1:a` after `100:b`, so it might make more sense to only accept proposals in increasing order and to send NACKs otherwise, even though this has no bearing on correctness. Remember that the network is asynchronous so even if you send `1:a` and then `100:b` the rest of the world could reasonably see you send them in the other order. – Dave Turner Dec 13 '16 at 11:03
  • Errm. Are you saying that state C(100:b,-) is valid, and C can promise to accept a proposal numbered 1 but when it receives the proposal it has to 'accept' it and then ignore it? – Rich N May 14 '18 at 11:50
  • C doesn't sent any promises at all in the above scenario, so I don't fully understand the question. Having accepted proposal number 100 it cannot then make a promise at proposal number 1, but that's not what's happening here. If it helps, "promise to accept a proposal" isn't the right way to think about it, it's more like "promise to reject all lower-numbered proposals". – Dave Turner May 21 '18 at 16:59
  • @Dave "Having accepted proposal number 100 it cannot then make a promise at proposal number 1" <-- this is not required by the wording in "Paxos Made Simple". – MuchToLearn May 22 '19 at 23:20
  • This should be the accepted answer IMO. Accepting a proposal does not mean the acceptor can replace or forget a higher-valued proposal. – MuchToLearn Jun 18 '19 at 19:38
2

NECROBUMP. Even with the weaker portion of both variants, there is no inconsistency. Let's look at step 9 in the example in the question:

"The state is A(-:-,100) B(100:b,100) C(1:a,-). The chosen proposal is abandon, Paxos fails"

However, at this point all we have is an indeterminate value, since there is no majority accepted value (we must eventually choose 'b' since b was accepted by a majority during step 6.)

In order to continue the protocol, we need new ballots and eventually some newer ballot will be accepted. That ballot must have the value 'b',

PROOF: C will respond with (100, 'b') on any prepare requests since the highest-numbered ballot it accepted is (100, 'b') even if it last accepted a ballot (1, 'a'). B will also respond with (100, 'b'). Hence it is no longer possible to get a majority ballot with any value but 'b'.

Lamport's language is that an acceptor will respond with "The proposal with the highest number less than n that it has accepted, if any"

The accepted answer confuses "highest numbered" with "latest accepted," since the example shows that an acceptor may accept values in decreasing numbered order. In order to completely align with Lamport's protocol, it is necessary for C to remember that it responded to (100, 'b') even if the "latest" accept it has made is (1, 'a').

(That being said I would not be surprised if many implementations don't do this correctly, and hence are vulnerable to this issue.)

Ben Braun
  • 418
  • 4
  • 10
  • An additional point that might be worth mentioning is that if a new proposal were made and A responded with (100,~) and C responded with (1,a) and B did not respond, consensus could instead be achieved on value 'a'. Ben's point stands, there is not inconsistency, but the astute reader may wonder about this point. The earlier rounds failed to gain consensus on 'a' or 'b' but the subsequent rounds could go either way, depending on which peers respond. – Rakis May 04 '16 at 02:24
  • @Rakis I might be misunderstanding you, but in the original scenario given in the question "b" is decided as soon as step 7 occurs (a Learner can learn "b" immediately after this step). It is no longer acceptable after that point for the concensus value to switch to "a." My point is that if a new proposal is made, C can not respond with (1,a), as then inconsistency results. But this is not a problem, since the protocol specifies C must respond with the highest numbered proposal it has accepted (100, b) which, surprisingly, might not be the "last" proposal it has accepted (1, a). – Ben Braun Jul 08 '16 at 17:14
  • I'm at a loss on this one. An acceptor that has accepted (100,b) should never accept a ballot with a lower number like (1,a). In order for an Accept message for (100,b) to be sent, the proposer MUST have already gained a majority of promises so it's nonsensical to override a later step with a prior one, regardless of whether the acceptor has promised or not. However, I can't find this explicitly stated in the paper. Makes me wonder if I'm just missing it though. It shouldn't be possible to prove Paxos correct without a protection against this; as the question illustrates. – Rakis Jul 10 '16 at 03:41
  • 3
    This seems to be a common misunderstanding. Having accepted a proposal, an acceptor may safely accept an earlier one _as long as it breaks no previously-made promises_. The correctness proof doesn't require acceptances to be sent in any particular order; indeed it cannot because it's designed to be safe even if messages are reordered arbitrarily. In reality there is little point in accepting proposals out-of-order, but there's no impact on safety for doing so. – Dave Turner Dec 13 '16 at 18:36
  • Late reply here but I don't really agree with your answer. Surely the safety property is violated if the accepted value (100, 'b') is lost on C, regardless of whether future progress can be made. In PMS though it does seem to be a slight ambiguity with a prior statement, although it does clearly say the proposer has to use the original acceptors for phase 2 which would prevent the issue. – jsravn Nov 18 '21 at 15:50
2

There is indeed an ambiguity in the paper, which is why TLA+ specification, and not the paper should be used for implementing the algorithm.

When accepting a value, an acceptor must once again update its state, namely the most recently promised ballot. This is clear from Paxos TLA+ specification, check out Phase 2b in which acceptor updates maxBal, and compare with Phase 1b where it does the same.

Leslie Lamport handles this question in his recent lecture, where he explains that this is done specifically to allow the set of acceptors to be different from the set of nodes promising the ballot.

Kostja
  • 1,607
  • 10
  • 17
  • Thanks for sharing those lectures. This is probably the best answer for anyone attempting to either write or verify an implementation. – simbo1905 May 26 '20 at 06:43
  • Here is the point in that lecture where Lamport says that `maxBal` is updated when receiving a prepare message: https://youtu.be/8-Bc5Lqgx_c?t=3153 Here is `maxBal` being updated when receiving an accept message: https://youtu.be/8-Bc5Lqgx_c?t=3887 – simbo1905 May 26 '20 at 21:24
0

C can't accept the proposal as it hasn't gone through Phase 1. IOWs for a vale to be accepted by an acceptor, the acceptor has to move through both phases of the protocol.

sparty
  • 1
  • I don't see how you could conclude this from the treatment in "Paxos Made Simple". Phase 1 refers only to an individual proposer getting enough answers from "a set S corresponding to some majority of acceptors". – MuchToLearn May 22 '19 at 23:11
0

if by accepting a value the node is also promising to not accept earlier values, the algorithm is correct, but in the paper Lamport didn't mention this requirement, right?

The above condition is not required. Let's say the highest ballot an acceptor has promised is X. Let's say the incoming accept message has ballot number Y. If Y < X, we know that Y has to be rejected. If Y > X, this means that the acceptor hasn't received a prepare request for Y. This means, we have received an invalid paxos message. In this case, the accept message for Y should be dropped.

The only exception to this is when Y == 0. In this case, issuing a prepare with ballot number 0 doesn't make sense as ballot numbers below 0 are invalid. So, phase 1 can be skipped for ballot 0 and a proposer can directly go to Phase 2. In this case, i.e. when Y == 0, the acceptor can accept the value only if it hasn't accepted a value. This is the same as what you are proposing above, but it is required only in the optimized version of Paxos where Phase 1 can be skipped for Y == 0.

IOWs, the only time an acceptor accepts a value is when Y == X. The only exception is when Y == 0. In that case, the acceptor can accept the value only if it hasn't accepted a value.

0

I agree with most part of Ben Braun's answer.

It's fine for C to accept (1,a), it doesn't change the chosen value. Let's say C accepted (1, a), and we take a look at the accept history from the perspective of a learner.

(100, b) accepted by B and C
(1, a) is accepted by C

(100, b) is chosen since it is accepted by majority of acceptors.

At this point the protocol doesn't need to continue if the learner got complete accept history, unless learners failed or messages to learners are lost. This is where I disagree with Ben Braun's answer.

But the acceptors should keep the accepted proposal with the highest number, in case a new proposal is issued.

update: I also agree with Dave Turner that in reality there is no reason to accept lower numbered proposal. Proposal number is like logical time clock, it's safe to ignore older messages.

0

The ambiguous sentence in Paxos Made Simple is "This need not be the same set of acceptors that responded to the initial requests".

Its actual meaning is "Hi, let me give you a hint here. The algorithm described in this paper can be optimized to eliminate the requirement that the prepare phase and the accept phase must have the same set of acceptors". Note that the algorithm described in Paxos Made Simple is a little different from the one described in The Part-Time Parliament.

However, some people misunderstood that sentence like this: "The algorithm described in this paper does not require that the prepare phase and the accept phase must have the same set of acceptors".