Design Microsoft Teams (Chat + Presence + Voice/Video)

Covers channels, persistent chat, presence, voice/video calls, and meetings. Microsoft's most-asked system design interview (SDI) question.

Theory

Explanation

Intuition first, formal definition second. Skim the bullets if you already know this; read the prose if you don't.

Think of it as a Slack/Discord-style product: persistent channels, threads, mentions, file sharing, and voice and video calls. It is multi-tenant SaaS, so every enterprise gets its own isolation boundary. Messages are durable, presence is ephemeral, and meeting media streams through an SFU.

  • Chat: messages are persisted to Cosmos DB with per-tenant isolation and channel_id as the partition key, then fanned out over SignalR / WebSocket to the members who are online.
  • Presence: each client sends a periodic heartbeat to an in-memory store; state is tracked under multiple keys per user (chat-available, calling, in-meeting).
  • Meetings: a signaling server connects participants, an SFU (Selective Forwarding Unit) routes the media streams, and recordings are uploaded to blob storage.
  • Files: handled through the OneDrive integration.
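
The presence path can be made concrete with a small sketch. This is a minimal in-memory stand-in, not how Teams implements it: the class name `PresenceStore`, the 30-second TTL, and the priority order between the three keys are all assumptions made for illustration. Each heartbeat refreshes an expiry time, and a user with no live key is reported as offline.

```python
import time
from typing import Callable


class PresenceStore:
    """In-memory presence with TTL expiry (hypothetical stand-in for a cache)."""

    # Key names taken from the text; the priority order below is an assumption.
    KEYS = ("chat-available", "calling", "in-meeting")

    def __init__(self, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # (user_id, key) -> time at which that key expires
        self._expires: dict[tuple[str, str], float] = {}

    def heartbeat(self, user_id: str, key: str = "chat-available") -> None:
        """Refresh one presence key; nothing is written to durable storage."""
        if key not in self.KEYS:
            raise ValueError(f"unknown presence key: {key}")
        self._expires[(user_id, key)] = self._clock() + self._ttl

    def _alive(self, user_id: str, key: str) -> bool:
        expiry = self._expires.get((user_id, key))
        return expiry is not None and expiry > self._clock()

    def status(self, user_id: str) -> str:
        """Most specific live key wins; no live key means offline."""
        for key in ("in-meeting", "calling", "chat-available"):
            if self._alive(user_id, key):
                return key
        return "offline"
```

Because state simply expires when heartbeats stop, a crashed client needs no explicit cleanup message. A shared cache with per-key expiry would play the same role across many service instances.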

When to use

Enterprise chat and meeting products. The same pattern applies to Slack and Zoom Chat.

When not to

Tiny teams, where this architecture is overkill. Pure VoIP products, which have a different focus.

flowchart LR
  Client([Client]) --> WS[WebSocket / SignalR]
  WS --> Chat[Chat Service]
  Chat --> DB[(Cosmos DB · per tenant)]
  Client --> Pres[Presence Service]
  Pres --> Redis[(In-memory presence)]
  Client --> Sig[Meeting Signaling]
  Sig --> SFU[SFU Media Server]
  SFU --> Record[(Recording → Blob)]
  Files[File Share] --> OD[OneDrive]
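
The Client → WebSocket → Chat Service → DB path in the diagram can be sketched as follows. This is an in-memory illustration under stated assumptions, not the real service: `ChatService`, `Message`, and the callback-based `connect` are hypothetical names, a Python list stands in for one database partition, and a plain callable stands in for a WebSocket connection. The point it shows is that a message is appended to its (tenant, channel) log and given a per-channel sequence number before it is fanned out to online members.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass
class Message:
    tenant_id: str
    channel_id: str
    seq: int        # per-channel sequence number, assigned on write
    sender: str
    body: str


class ChatService:
    """In-memory stand-in for the chat write path (all names hypothetical)."""

    def __init__(self) -> None:
        # (tenant_id, channel_id) -> ordered log; stands in for one partition
        self._partitions: dict[tuple[str, str], list[Message]] = defaultdict(list)
        # (tenant_id, channel_id) -> {user_id: delivery callback}
        self._online: dict[tuple[str, str], dict[str, Callable[[Message], None]]] = defaultdict(dict)

    def connect(self, tenant_id: str, channel_id: str, user_id: str,
                deliver: Callable[[Message], None]) -> None:
        self._online[(tenant_id, channel_id)][user_id] = deliver

    def disconnect(self, tenant_id: str, channel_id: str, user_id: str) -> None:
        self._online[(tenant_id, channel_id)].pop(user_id, None)

    def post(self, tenant_id: str, channel_id: str, sender: str, body: str) -> Message:
        key = (tenant_id, channel_id)
        log = self._partitions[key]
        msg = Message(tenant_id=tenant_id, channel_id=channel_id,
                      seq=len(log) + 1, sender=sender, body=body)
        log.append(msg)                      # persist first
        for deliver in list(self._online[key].values()):
            deliver(msg)                     # then push to online members
        return msg

    def history(self, tenant_id: str, channel_id: str, after_seq: int = 0) -> list[Message]:
        """Offline members catch up by reading the log past their last seq."""
        log = self._partitions.get((tenant_id, channel_id), [])
        return [m for m in log if m.seq > after_seq]
```

Keying every lookup by tenant as well as channel means a request scoped to one tenant cannot read another tenant's log, even when channel names collide.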

Key insights

  • The channel is the partition key: a single channel must stay on a single shard so that its messages keep a consistent order.
  • Presence is best-effort and eventually consistent; it is never persisted long-term.
  • An SFU forwards media without decoding or re-encoding it, which makes it cheaper to run than an MCU (Multipoint Control Unit).
  • Recording is asynchronous: the file is uploaded after the meeting ends and is then post-processed.
  • Per-tenant data isolation is non-negotiable in enterprise settings; tenants get separate DB partitions.
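
The SFU-versus-MCU trade-off can be illustrated with simple stream counting. The sketch below is a simplified model, not a capacity formula: it assumes every participant sends exactly one stream and receives everyone else's, with no simulcast or selective subscription, and the names `Load`, `mesh`, `sfu`, and `mcu` are hypothetical. A peer-to-peer mesh is included only as a baseline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    client_up: int         # streams each client uploads
    client_down: int       # streams each client downloads
    server_forwarded: int  # streams the server sends out in total
    server_decodes: int    # streams the server must decode


def mesh(n: int) -> Load:
    """Peer-to-peer: every client sends to and receives from every other client."""
    return Load(client_up=n - 1, client_down=n - 1, server_forwarded=0, server_decodes=0)


def sfu(n: int) -> Load:
    """SFU: one upload per client; the server relays packets without decoding."""
    return Load(client_up=1, client_down=n - 1, server_forwarded=n * (n - 1), server_decodes=0)


def mcu(n: int) -> Load:
    """MCU: the server decodes every stream and sends each client one mixed stream."""
    return Load(client_up=1, client_down=1, server_forwarded=n, server_decodes=n)


if __name__ == "__main__":
    for name, model in (("mesh", mesh), ("sfu", sfu), ("mcu", mcu)):
        print(name, model(10))
```

For 10 participants under these assumptions, the SFU forwards 90 streams but decodes none, while the MCU decodes all 10 and must also re-encode the mix. The SFU trades server CPU for client download bandwidth.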