Design Kafka (Distributed Message Queue)
Partitioning, ISR, leader election, consumer groups. Meta E6+ infra signature.
Theory
Explanation
Intuition first, formal definition second. Skim the bullets if you already know this; read the prose if you don't.
Topic = partitioned append-only log. Each partition has one leader plus follower replicas. Producers append to the leader; consumers pull records starting from an offset. Ordering is guaranteed within a partition, not across partitions. Followers replicate by fetching from the leader, and those that stay caught up form the ISR (in-sync replica) set; on leader failure, a new leader is elected from the ISR.
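The model above can be sketched as a toy in-memory topic. This is only an illustration under stated assumptions: the `Topic` class and its method names are hypothetical, and `zlib.crc32` stands in for the murmur2 hash that Kafka's Java client applies to keys.

```python
import zlib
from itertools import count


class Topic:
    """Toy in-memory topic: a list of append-only partition logs (hypothetical)."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]
        self._round_robin = count()  # used only for records without a key

    def choose_partition(self, key):
        if key is None:
            # Keyless records are spread across partitions.
            return next(self._round_robin) % len(self.partitions)
        # Same key -> same partition, so per-key order is preserved.
        # crc32 is a stand-in here; it is not the hash Kafka uses.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        """Append a record and return (partition, offset)."""
        partition = self.choose_partition(key)
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1

    def fetch(self, partition, offset, max_records=100):
        """Return records from `offset` onward; the caller tracks its own position."""
        return self.partitions[partition][offset:offset + max_records]


topic = Topic("orders", num_partitions=3)
p1, o1 = topic.append("user-42", "created")
p2, o2 = topic.append("user-42", "paid")
assert p1 == p2 and o2 == o1 + 1  # same key: same partition, increasing offsets
print(topic.fetch(p1, o1))
```

Because the partition is derived from the key modulo the partition count, two records with the same key always share a log, which is where the per-partition ordering guarantee comes from. Records with different keys may land in different partitions and have no relative order.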
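Producer chooses the partition: by key hash when the record has a key, otherwise spread across partitions (round-robin, or sticky batching in newer clients). It sends to the partition leader; the leader appends to its local log and, depending on the ack level, waits for the ISR followers to replicate the record. By default Kafka relies on replication rather than a per-write fsync for durability. ACK levels: acks=0 (no wait), acks=1 (leader append only), acks=all (all current ISR members). Consumer group: each partition is assigned to exactly one consumer in the group; consumers commit offsets to record their progress. KRaft (Kafka Raft) replaces ZooKeeper for the control plane in newer versions. Exactly-once semantics via idempotent producer + transactional commits.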
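The ack levels can be summarized as a small decision rule. The following is a simplified sketch, not broker code: `leader_can_ack` and its parameters are hypothetical names, and broker ids are plain integers.

```python
def leader_can_ack(acks, isr, replicated_to, min_insync_replicas=1):
    """Decide whether a produce request counts as acknowledged (toy model).

    acks           -- "0", "1", or "all"
    isr            -- set of broker ids currently in sync (includes the leader)
    replicated_to  -- set of broker ids that hold the record (includes the leader)
    """
    if acks == "0":
        return True  # producer does not wait for any broker response
    if acks == "1":
        return True  # leader has appended locally; followers may still lag
    if acks == "all":
        if len(isr) < min_insync_replicas:
            # A real broker rejects the write here (NotEnoughReplicas)
            # instead of accepting data it cannot replicate widely enough.
            raise RuntimeError("not enough in-sync replicas")
        return isr <= replicated_to  # every ISR member has the record
    raise ValueError(f"unknown acks setting: {acks!r}")


isr = {1, 2, 3}
assert leader_can_ack("all", isr, replicated_to={1, 2}) is False  # broker 3 still catching up
assert leader_can_ack("all", isr, replicated_to={1, 2, 3}) is True
# Broker 3 fails and leaves the ISR; writes continue with two in-sync replicas.
assert leader_can_ack("all", {1, 2}, replicated_to={1, 2}, min_insync_replicas=2)
```

Note that acks=all waits for the current ISR, not for every assigned replica. That is why min.insync.replicas matters: without it, the ISR could shrink to the leader alone and acks=all would silently degrade to acks=1.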
When to use
Event log, async work queues, change data capture, stream processing input.
When not to
Synchronous request-response. Low-throughput workloads, where Kafka is overkill.
flowchart LR
P[Producer] -->|partition by key| Leader[Partition Leader]
Leader --> Log[(Local Log Segment)]
Leader -.replicate.-> R1[Replica 1]
Leader -.replicate.-> R2[Replica 2]
R1 --> ISR[(ISR Set)]
R2 --> ISR
C1[Consumer 1] -->|pull| Leader
C2[Consumer 2] -->|pull from partition 2| Leader2[Partition 2 Leader]
Controller{{KRaft Controller}} -.metadata + elections.-> Leader
Key insights
- The partition is the unit of parallelism. Pick the partition count carefully: adding partitions later changes the key-to-partition mapping, so re-partitioning is painful.
- The ISR underpins durability. With replication factor 3, acks=all + min.insync.replicas=2 survives 1 broker loss without losing acknowledged writes or blocking producers.
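- Consumer group rebalance: a membership change pauses consumption briefly while partitions are reassigned. Cooperative rebalancing limits the pause to the partitions that actually move.
- Log compaction retains the latest record per key, enabling key-value semantics on top of the log (CDC + materialized state).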
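- KRaft mode (production-ready since Kafka 3.3 in 2022) replaces ZooKeeper with a built-in Raft controller quorum and simplifies ops; Kafka 4.0 removed ZooKeeper support entirely.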
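Two of the points above, rebalancing and log compaction, are easy to see in a small sketch. Both functions below are hypothetical illustrations: `assign_round_robin` is a simplified round-robin strategy (real assignors also handle subscriptions and stickiness), and `compact` drops tombstones immediately, whereas a real broker keeps them for a retention period first.

```python
def assign_round_robin(partitions, members):
    """Give each partition to exactly one member of the consumer group."""
    members = sorted(members)
    assignment = {m: [] for m in members}
    for i, partition in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(partition)
    return assignment


def compact(log):
    """Keep only the latest record per key, preserving offset order.

    `log` is a list of (offset, key, value); a value of None is a
    delete marker (tombstone) and removes the key from the result.
    """
    latest = {}
    for offset, key, value in log:
        latest[key] = (offset, key, value)  # later offsets overwrite earlier ones
    survivors = sorted(latest.values())     # offsets are unique, so this sorts by offset
    return [record for record in survivors if record[2] is not None]


# Rebalance: a third consumer joins and partitions are redistributed.
before = assign_round_robin(range(6), ["c1", "c2"])
after = assign_round_robin(range(6), ["c1", "c2", "c3"])
print(before)
print(after)

# Compaction: only the newest value for "a" survives; "b" was deleted.
log = [(0, "a", 1), (1, "b", 2), (2, "a", 3), (3, "b", None)]
print(compact(log))
```

Since a partition has only one owner within a group, adding consumers beyond the partition count leaves the extras idle, which is another reason the partition count caps parallelism.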