Design OneDrive (File Sync Across Devices)
File sync, version history, large-file upload, and deltas. A signature Microsoft system design interview (SDI) question.
Theory
Explanation
Intuition first, formal definition second. Skim the bullets if you already know this; read the prose if you don't.
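OneDrive follows the same delta-sync model as Google Drive and Dropbox: files are split into chunked, content-addressed blocks, a manifest lists the blocks that make up each file version, and a sync protocol moves only what changed. What distinguishes OneDrive in this design is per-block compression, per-tenant ACLs, and Office integration.
Files are stored as content-addressed blocks in Azure Blob Storage; metadata and manifests live in Cosmos DB. Sync runs through a delta query API: the client passes an opaque cursor, and the server returns the changes made since that cursor along with a fresh cursor for the next call. Office documents take a richer coauthoring path through a separate session.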
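The block-and-manifest idea can be made concrete with a minimal sketch. Everything here is an assumption for illustration, not OneDrive's actual implementation: the `BlockStore` class, the `upload` and `download` helpers, fixed-size blocks, SHA-256 as the content address, and zlib as the per-block compressor are all hypothetical choices.

```python
import hashlib
import zlib


def split_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a file into fixed-size blocks (assumed; real systems may chunk differently)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


class BlockStore:
    """Hypothetical in-memory stand-in for blob storage, keyed by content hash."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def put(self, raw: bytes) -> tuple[str, bool]:
        """Store a block compressed; return (digest, True if it was newly stored)."""
        digest = hashlib.sha256(raw).hexdigest()
        if digest in self.blocks:
            return digest, False  # already present, nothing to upload
        self.blocks[digest] = zlib.compress(raw)  # per-block compression
        return digest, True

    def get(self, digest: str) -> bytes:
        return zlib.decompress(self.blocks[digest])


def upload(store: BlockStore, data: bytes, block_size: int) -> tuple[list[str], int]:
    """Return the manifest (ordered block digests) and the count of new blocks sent."""
    manifest: list[str] = []
    new_blocks = 0
    for block in split_blocks(data, block_size):
        digest, is_new = store.put(block)
        manifest.append(digest)
        new_blocks += int(is_new)
    return manifest, new_blocks


def download(store: BlockStore, manifest: list[str]) -> bytes:
    """Rebuild a file version by concatenating its blocks in manifest order."""
    return b"".join(store.get(d) for d in manifest)


store = BlockStore()
v1 = b"aaaabbbbccccdddd"
v2 = b"aaaabbbbXXXXdddd"  # only the third block differs
manifest_v1, sent_v1 = upload(store, v1, block_size=4)
manifest_v2, sent_v2 = upload(store, v2, block_size=4)
```

In this toy run the second upload stores only the one block that changed, and both versions remain recoverable from their manifests, which is also what makes version history cheap. Fixed-size blocks are the simplest choice but shift every boundary when bytes are inserted mid-file; content-defined chunking is a common alternative.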
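The cursor-based delta query can be sketched as an append-only change log whose position is hidden inside an opaque token. This is a simplified illustration under stated assumptions: `ChangeFeed`, `record`, and `delta` are hypothetical names, and encoding a sequence number as base64 JSON is just one way to keep the cursor opaque to clients.

```python
import base64
import json
from dataclasses import dataclass, field
from typing import Optional


def _encode_cursor(seq: int) -> str:
    """Wrap the log position in a token that clients treat as opaque."""
    return base64.urlsafe_b64encode(json.dumps({"seq": seq}).encode()).decode()


def _decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["seq"]


@dataclass
class ChangeFeed:
    """Hypothetical per-drive change log kept by the metadata service."""

    log: list = field(default_factory=list)

    def record(self, path: str, manifest_id: Optional[str]) -> None:
        """Append a change; a manifest_id of None marks a deletion."""
        self.log.append(
            {"path": path, "manifest": manifest_id, "deleted": manifest_id is None}
        )

    def delta(self, cursor: Optional[str] = None, page_size: int = 100):
        """Return (changes since cursor, new cursor). No cursor means a full sync."""
        start = _decode_cursor(cursor) if cursor else 0
        page = self.log[start:start + page_size]
        return page, _encode_cursor(start + len(page))


feed = ChangeFeed()
feed.record("/docs/a.txt", "m1")
feed.record("/docs/b.txt", "m2")
first_page, cursor = feed.delta()            # initial sync: everything so far
feed.record("/docs/a.txt", "m3")             # a.txt edited on another device
second_page, cursor2 = feed.delta(cursor)    # only the new change comes back
third_page, _ = feed.delta(cursor2)          # nothing new since cursor2
```

A client persists the latest cursor and resumes from it after going offline, so it never has to enumerate the folder tree again. A production service would likely collapse repeated changes to the same item and expire very old cursors; neither is modeled here.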
When to use
Enterprise file sync, document workflows, backup.
When not to
Personal photo sync, which is better served by specialized backends.
flowchart LR
  Client([Client]) --> API[OneDrive API]
  API --> Meta[(Metadata · Cosmos DB)]
  API --> Blob[(Block Storage · Azure)]
  Client -.delta cursor.-> Meta
  Meta --> Notif[Push · long poll]
  Notif -.changed.-> OtherDevice[Other Devices]
  Office[Office App] --> Coauth[Coauthoring Session]
  Coauth --> Meta
Key insights
- Delta query is the API surface for sync: clients never re-walk the full tree.
- Per-tenant isolation: each org gets its own Cosmos DB partition and Blob container.
- Office coauthoring uses operational transform / CRDT for live editing and runs separately from the sync path.
- A background lifecycle policy migrates cold blocks to a cheaper Azure storage tier.
- Per-block compression cuts storage by 30–40% for text documents.
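The lifecycle bullet can be illustrated with a small sketch of a background pass that demotes blocks nobody has read recently. The `BlockInfo` record, the `demote_cold_blocks` function, and the 90-day threshold are assumptions made for the example; a real deployment would more likely configure a storage lifecycle rule than run its own loop.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BlockInfo:
    """Hypothetical per-block metadata tracked alongside the manifest."""

    digest: str
    last_access: datetime
    tier: str = "hot"


def demote_cold_blocks(
    blocks: list[BlockInfo],
    now: datetime,
    cold_after: timedelta = timedelta(days=90),  # assumed threshold
) -> list[str]:
    """Move hot blocks untouched for `cold_after` to the cool tier; return their digests."""
    moved: list[str] = []
    for block in blocks:
        if block.tier == "hot" and now - block.last_access >= cold_after:
            block.tier = "cool"
            moved.append(block.digest)
    return moved


now = datetime(2024, 6, 1)
blocks = [
    BlockInfo("block-a", last_access=datetime(2024, 1, 1)),   # idle for months
    BlockInfo("block-b", last_access=datetime(2024, 5, 20)),  # read recently
]
moved = demote_cold_blocks(blocks, now)
```

Because blocks are content-addressed and may be shared across file versions, the sketch tracks access per block, not per file: a block stays hot as long as any version that references it is still being read.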