Open SystemDecoder on a larger screen to build systems, run simulations, and inject chaos.
What's waiting for you on desktop
Live Simulations
Watch latency spike, queues fill, and nodes fail in real time. Every slider change is instant.
Visual Architecture Canvas
Drag nodes, draw edges, and build any distributed system topology from scratch.
Chaos Engineering
Kill servers, introduce packet loss, throttle CPUs, and watch your system react.
Real-time Insights
Throughput, p99 latency, error rates: all charted live as your simulation runs.
40+
Concepts
<1s
Feedback
∞
Replays
"The best way to understand a distributed system
is to break it."
💬
Hey! Are you there?
Delivered ✓✓ · 500ms
Picture this: you send "Hey!" on WhatsApp. Your friend's screen lights up in under a second, with no refresh and no lag, even on a shaky 3G signal. Behind that instant delivery is an architecture that must solve a deceptively hard problem.
The web was built on HTTP: you ask, the server answers. Chat is the opposite: the server needs to reach out to you the moment something arrives, even when you haven't asked for anything. HTTP alone cannot do this.
Request Flow
The first instinct: just poll
The simplest fix is polling: your app pings the server every few seconds, asking "Any new messages?" It works in a demo. It collapses at scale. 99% of those pings come back empty, and at a billion users polling every three seconds adds up to 20 billion wasted requests per minute.
// Phone polls every 3 s:
GET /messages/new → { messages: [] }       // empty
GET /messages/new → { messages: [] }       // empty
GET /messages/new → { messages: ["Hey!"] } // finally
// 1B users × 20 polls/min = 20,000,000,000 req/min
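To make the trade-off concrete, here is that loop as client code: a minimal TypeScript sketch of the trace above, reusing the /messages/new endpoint and 3-second interval from the snippet (the response shape is assumed to be { messages: string[] }).

// Minimal polling sketch: every client runs this loop,
// whether or not anything has arrived.
async function pollForMessages(): Promise<void> {
  const res = await fetch("/messages/new");
  const { messages }: { messages: string[] } = await res.json();
  for (const msg of messages) {
    console.log("received:", msg); // ~99% of polls never get here
  }
}

// One request every 3 seconds, per user, forever.
setInterval(pollForMessages, 3000);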
What WhatsApp actually needs
WhatsApp's real-time contract: every message delivered in under 500ms, to any of 1 billion users, on a slow connection, even if the recipient's phone momentarily goes offline. That demands server push, not client polling.
// The goal: send() → delivered ✓✓ in < 500ms
// At 1B users with WebSocket: only active messages create traffic
// vs polling: O(users × poll_rate) = constant massive overhead
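As a contrast with the polling loop, here is a minimal sketch of the push model using the standard browser WebSocket API; the wss://chat.example endpoint and the message fields are illustrative assumptions, not WhatsApp's actual protocol.

// Push model sketch: one persistent connection, zero empty polls.
// wss://chat.example and the message fields are hypothetical.
const socket = new WebSocket("wss://chat.example/ws");

socket.addEventListener("open", () => {
  // Sending is a single frame over the already-open connection.
  socket.send(JSON.stringify({ to: "friend", text: "Hey!" }));
});

// The server pushes each message the moment it arrives;
// traffic scales with messages sent, not users × poll rate.
socket.addEventListener("message", (event) => {
  const msg = JSON.parse(event.data);
  console.log("delivered:", msg.text);
});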
Why We Need This
This is a systems problem, not a code problem. Over the next seven steps you'll build the full architecture from scratch, starting with a naive HTTP approach, then adding persistent connections, a message service, queues, caching, offline handling, and finally durable storage.
Key Insight
The real challenge isn't sending a message; it's guaranteeing delivery to a moving target (a phone that goes online and offline) at billion-user scale with sub-500ms latency.