Distributed Systems - Reaching Consensus, Part 2: Raft
@SudhirNakka07 | December 29, 2024
Introduction
This is Part 2 of a two-part series on consensus algorithms. For Part 1 about Paxos, see /2024/paxos.
Raft was introduced with the explicit goal of being more understandable than Paxos while providing equivalent guarantees for building a replicated state machine.
Raft: Simplicity by Design
Motivation and Goals
Raft (Ongaro, Ousterhout, 2013) aims for clarity and practicality. It achieves functionality comparable to Multi-Paxos via a leader-based approach and separation of concerns.
Core Components of Raft
Raft decomposes the problem into three relatively independent subproblems:
- Leader Election: Selecting a single node to coordinate the cluster
- Log Replication: Distributing log entries from the leader to followers
- Safety: Ensuring the system maintains consistency properties
How Raft Works
Roles and Terms
- Leader: Handles client requests and replicates log entries
- Followers: Passive; respond to requests from leaders and candidates
- Candidates: Try to become leader during elections
Time is divided into terms, numbered with consecutive integers; each term begins with an election. At most one leader can be elected per term, and a term can end with no leader at all if the vote is split.
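As a rough illustration of how roles and terms interact, the sketch below models a server that steps down whenever it sees a higher term and rejects messages from older terms. The names `Role`, `Server`, and `observe_term` are hypothetical, chosen for this example only, and the code leaves out persistence and networking.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


@dataclass
class Server:
    """Hypothetical in-memory model of one server's role and term state."""
    current_term: int = 0
    role: Role = Role.FOLLOWER
    voted_for: Optional[str] = None  # candidate this server voted for in current_term

    def observe_term(self, message_term: int) -> bool:
        """Process the term carried by an incoming RPC.

        Returns False if the message comes from an older term and should be rejected.
        """
        if message_term > self.current_term:
            # A newer term exists: adopt it, forget the old vote, and step down.
            self.current_term = message_term
            self.role = Role.FOLLOWER
            self.voted_for = None
            return True
        return message_term == self.current_term


server = Server(current_term=3, role=Role.LEADER)
server.observe_term(2)  # stale term: rejected, the server stays leader
server.observe_term(5)  # newer term: the server becomes a follower in term 5
```

Because every RPC carries the sender's term, a stale leader finds out it has been replaced the first time it hears from a server in a newer term.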
Leader Election
- Nodes become candidates after randomized election timeouts
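- A candidate increments its term, votes for itself, and sends RequestVote RPCs to the other servers; each server grants at most one vote per term, and a candidate that collects votes from a majority of the cluster becomes leader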
- Heartbeats (AppendEntries) from a valid leader reset timeouts, preventing unnecessary elections
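The election rules above can be sketched in a few lines. This is a simplified, single-process model and not a full implementation: `Voter`, `handle_request_vote`, `run_election`, and `election_timeout` are hypothetical names, the 150 to 300 ms range is only an illustrative default, and the check that a candidate's log is sufficiently up to date is deliberately left out.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


def election_timeout(low_ms: int = 150, high_ms: int = 300) -> float:
    """Pick a randomized timeout so that servers rarely time out together."""
    return random.uniform(low_ms, high_ms)


@dataclass
class Voter:
    current_term: int = 0
    voted_for: Optional[str] = None

    def handle_request_vote(self, term: int, candidate_id: str) -> bool:
        if term < self.current_term:
            return False  # candidate is from an older term
        if term > self.current_term:
            # New term: any vote cast in an earlier term no longer applies.
            self.current_term = term
            self.voted_for = None
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id  # at most one vote per term
            return True
        return False


def run_election(candidate_id: str, term: int, peers: List[Voter]) -> bool:
    """Return True if the candidate wins a majority of the whole cluster."""
    votes = 1  # the candidate votes for itself
    for peer in peers:
        if peer.handle_request_vote(term, candidate_id):
            votes += 1
    cluster_size = len(peers) + 1
    return votes > cluster_size // 2


peers = [Voter() for _ in range(4)]
run_election("A", 1, peers)  # A wins term 1
run_election("B", 1, peers)  # B loses: every peer already voted in term 1
```

Randomizing the timeout is what makes split votes rare: usually one server times out first and wins before the others start competing.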
Log Replication
- Clients send commands to the leader, which appends them to its log
- The leader replicates entries via AppendEntries RPC to followers
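- Once an entry from the leader's current term is stored on a majority of servers, it is committed, along with all entries before it; the leader applies it to its state machine, and followers apply it once they learn the new commit index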
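One way a leader might decide how far it can commit is sketched below. The function name `advance_commit_index` and its parameters are assumptions made for this example; indices are 1-based, and `match_index` is assumed to include the leader's own last log index.

```python
from typing import List


def advance_commit_index(log_terms: List[int], match_index: List[int],
                         commit_index: int, current_term: int) -> int:
    """Return the new commit index for a leader.

    log_terms[i - 1] is the term of the entry at 1-based index i.
    match_index holds, per server (leader included), the highest index
    known to be stored on that server.
    """
    majority = len(match_index) // 2 + 1
    # The majority-th largest value is stored on at least a majority of servers.
    candidate = sorted(match_index, reverse=True)[majority - 1]
    # Only entries from the leader's current term are committed by counting
    # replicas. Terms never decrease along a log, so if this entry is from an
    # older term, every earlier entry is too and nothing new can be committed.
    if candidate > commit_index and log_terms[candidate - 1] == current_term:
        return candidate
    return commit_index


# Five servers; the entry at index 4 is from term 3 and is stored on three of them.
new_commit = advance_commit_index(
    log_terms=[1, 1, 2, 3, 3],
    match_index=[5, 5, 4, 2, 1],
    commit_index=2,
    current_term=3,
)
```

In this example `new_commit` is 4: index 5 is stored on only two servers, while index 4 is stored on three of five and belongs to the current term.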
Safety
- Log Matching Property: If two logs contain an entry with the same index and term, the logs are identical up to that entry
- Leader Completeness: A leader has all committed entries from previous terms
- State Machine Safety: Once a server has applied a log entry at a given index to its state machine, no other server will apply a different command for the same index
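The Log Matching Property is maintained by a consistency check that followers run on each AppendEntries request. The sketch below shows one way that check could look; `Entry` and `append_entries` are hypothetical names, indices are 1-based, and term handling and commit-index updates are omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    term: int
    command: str


def append_entries(log: List[Entry], prev_index: int, prev_term: int,
                   entries: List[Entry]) -> bool:
    """Follower-side handling of new entries; modifies log in place.

    prev_index and prev_term identify the entry just before the new ones
    (prev_index 0 means the new entries start at the beginning of the log).
    """
    if prev_index > len(log):
        return False  # follower is missing the preceding entry
    if prev_index > 0 and log[prev_index - 1].term != prev_term:
        return False  # preceding entry differs from the leader's
    for offset, entry in enumerate(entries):
        index = prev_index + 1 + offset  # 1-based position of this entry
        if index <= len(log):
            if log[index - 1].term != entry.term:
                # Conflict: drop the existing entry and everything after it.
                del log[index - 1:]
                log.append(entry)
        else:
            log.append(entry)
    return True


log = [Entry(1, "a"), Entry(1, "b"), Entry(2, "x")]
append_entries(log, 2, 1, [Entry(3, "c"), Entry(3, "d")])  # replaces "x" with "c", "d"
```

A rejected request tells the leader to retry from an earlier index, so the two logs converge from the last point at which they agree.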
Membership Changes (Joint Consensus)
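- The cluster transitions through an intermediate joint configuration in which elections and commits require separate majorities of both the old and the new configuration, so the two configurations can never make conflicting decisions on their own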
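The quorum rule during joint consensus can be illustrated with a small check. The function names `has_majority` and `joint_quorum` and the server identifiers are made up for this example.

```python
from typing import Set


def has_majority(acks: Set[str], config: Set[str]) -> bool:
    """True if the acknowledging servers include a majority of config."""
    return len(acks & config) > len(config) // 2


def joint_quorum(acks: Set[str], old_config: Set[str], new_config: Set[str]) -> bool:
    """During joint consensus, agreement needs a majority of BOTH configurations."""
    return has_majority(acks, old_config) and has_majority(acks, new_config)


old_config = {"A", "B", "C"}
new_config = {"C", "D", "E"}

joint_quorum({"A", "B", "C"}, old_config, new_config)  # False: only C from the new set
joint_quorum({"A", "C", "D"}, old_config, new_config)  # True: 2 of 3 in each set
```

Requiring both majorities means neither the old nor the new set of servers can elect a leader or commit an entry without the other during the transition.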
Advantages and Trade-offs of Raft
Advantages:
- Designed for understandability and practical implementation
- Clear leader-based structure simplifies reasoning
- Widely adopted in production systems
Trade-offs:
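- Leader-based design means the cluster cannot accept new commands between a leader failure and the election of a replacement, a gap of roughly one election timeout
- Like other consensus protocols, Raft cannot guarantee progress in a fully asynchronous network (the FLP impossibility result), and it needs a majority of servers to be running and able to communicate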
Return to Part 1: Paxos → /2024/paxos