Crisp Reading Notes on Latest Technology Trends and Basics

MS-Word notes on the topic

References A good high-level picture Informal reasoning on protocol steps A slightly formal paper with the proofs Potential Algorithm Implementation Some details of use in a Transaction system Paper on Chubby system A File System implementation with Paxos




Paxos is a distributed Consensus algorithm, that

  • Works with a quorum of 2F+1 nodes, and can tolerate F failures


SAN Nodes Establishing Consensus among the various replica nodesUsed for high-throughput Storage servers
CoordinatorsZookeeper, Chubby Establishes Locks and Configuration Servers
OLTP The basic mechanism to ensure consistency on transaction ordering
Network Multicast High throughput Atomic Broadcast

Paxos Structure


There are five original roles in PAXOS, that can be played by different nodes

Client End-user who makes the coordination (or write) request
Proposer One of the nodes, that coordinates a request on behalf of a client
Acceptors A set of nodes that must produce a consensus before the request is accepted
Learners The set of nodes that learn of the value of a Distributed Consensus,so that they can execute some operation based on it.


In reality, these roles are “Logical Roles”. A node may play “Multiple Role les”.

  • In many implementations, each node play three roles of “Proposer”, “Acceptor” and “Learner”.


A Quorum has its usual definition

  • A subset of Acceptors, such that any two subsets share at least one member
  • For 2F+1 nodes, any quorum consists of a majority of nodes – (F+1) nodes

Basic Paxos Algorithm


The following is a high-level working of the PAXOS algorithm

  1. Multiple clients may send values to one proposer each.
  2. Multiple proposers will each propose a value
    • The value could typically be a version number, or an operation
  3. After a round of message exchange, all Acceptors must accept a Single value
  4. The single Accepted value is sent to all the Learners
  5. The computation is executed at the learners
  6. The proposers who lost their proposals can retry the consensus algorithm.

Single Proposer Multiple Acceptors

Under this scenario, a single Proposer, proposes to multiple Acceptors

  1. The Proposer sends a message PROPOSE(N,v) to the acceptors
  2. The Acceptor, on receiving the message
    1. If the Acceptors have not Promised their votes to a proposal numbered higher than N, they send back a PROMISE(N,v) message
    2. If not, they send back a Reject message
  3. The Proposer on receiving the PROMISE message from the Acceptors,
    1. If the majority of Acceptors send the message back, the Proposer goes onto the next step of sending PROPOSE(N,v) to the quorum
    2. Otherwise the proposal is abandoned, and the Proposer starts the process again.
  4. The Acceptors on receiving the ACCEPT message from the Proposers,
    1. If the Acceptor has not Promised a higher value of N, it sends an ACCEPT message to the listener / coordinator
    2. Else it sends a Reject message
  5. The Proposer, on receiving the ACCEPT message,
    1. Sends an INFORM message to the Client
    2. Sends an INFORM message to the Learners

Accounting for Stability

The above protocol has a flaw, in that,

If another Proposer comes in after Stable state is achieved, and it so happens to have a higher Round number,

then, all the Acceptors, as part of Step 2a above, would change their value to the new one, since it has a new Round number.

To avoid this, we need to make sure that, if a system chooses a value, then the value will not change subsequently.

Choosing a value means the majority of (quorum of) the Acceptors Accept a value, then the value is considered chosen.

The value that was propagated by the highest round accepted so far, is a “Chosen Value”.

Modified Stability Algorithm

The following algorithm steps

  1. The proposer sends to a quorum of Acceptors, a proposal – PREPARE(N,V)
    • Where N is a random sequence number, and V is a value
  2. Each Acceptor does the following
    • If the Acceptor has received a higher numbered PREPARE request,
      • then the Acceptor rejects the proposal.
    • Else, it responds with PROMISE(M,v),
      • which is a Promise not to accept any proposal numbered less than N
      • and returns M – the highest round #, that has been accepted by the node, and its value v
  3. When the Proposer receives enough messages (required for a quorum),
    • it issues a PROPOSE(N,v), where
      • N is the round number of this proposer
      • v is the value corresponding to the highest value of M, received in Step-2a
  4. When each Acceptor receives the value – PROPOSE(N,v)
    • If the Acceptor has already accepted a larger Proposal N,
      • It sends a Reject to the coordinator
    • Else, it sends a Accept message

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Tag Cloud

%d bloggers like this: