The Hidden Complexity of Scaling WebSockets

Atul Jalan

Jan 17, 2025

With the rising demand for sync engines and real-time features, WebSockets have become a critical component of modern applications. At Compose, WebSockets form the backbone of our service, powering our backend SDKs that enable developers to deliver low-latency interactive applications with just backend code.

But scaling WebSockets has proven to be far more complex than we expected. Below are some of the most important lessons we've learned along the way.

Handle deployments gracefully

Users should never notice when deployments happen, so WebSocket sessions need to survive deployments. This is a delicate process that requires robust reconnection logic to deal with unexpected issues. At Compose, we achieve near-zero downtime by following these steps:
  1. Spin up new servers.

  2. Once the new servers are healthy, old servers begin returning 503 Service Unavailable responses to health checks.

  3. After 4 consecutive 503 responses, the load balancer declares the old servers unhealthy and removes them from the pool. The load balancer runs a health check every 5 seconds, so this process takes up to 25 seconds.

  4. Old servers send a custom WebSocket close message instructing clients to delay reconnection by a random interval to avoid a reconnection surge.
    • The custom close message lets clients show users a more accurate message during the ~10 second period where the client is disconnected.

    • The random delay helps prevent thundering herd issues where all clients reconnect at once. Clients also double the exponential backoff for deployment-related reconnections to account for unforeseen issues.

    • The close message is delayed by 20 seconds to account for the time it takes for the load balancer to shift traffic.

  5. Once all clients disconnect, the old servers shut down completely.
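
The client side of step 4 can be sketched as a small delay calculator. This is a minimal sketch, not Compose's actual client: the close code `4001`, the base and cap values, and the function name are all assumptions made for illustration. The only behaviors taken from the steps above are the random delay and the doubled backoff for deployment-related closes.

```typescript
// Hypothetical application-defined close code. The 4000-4999 range is
// reserved for private use by the WebSocket protocol.
const DEPLOYMENT_CLOSE_CODE = 4001;

const BASE_DELAY_MS = 1_000; // assumed base delay
const MAX_DELAY_MS = 60_000; // assumed upper bound on any single delay

/**
 * Returns how long to wait before reconnect attempt `attempt` (0-based).
 * `random` is injectable so the function can be tested deterministically.
 */
function reconnectDelayMs(
  attempt: number,
  closeCode: number,
  random: () => number = Math.random
): number {
  // Standard exponential backoff: base * 2^attempt.
  let ceiling = BASE_DELAY_MS * 2 ** attempt;

  // Deployment-related closes double the backoff to leave headroom
  // for unforeseen issues during the rollout.
  if (closeCode === DEPLOYMENT_CLOSE_CODE) {
    ceiling *= 2;
  }

  ceiling = Math.min(ceiling, MAX_DELAY_MS);

  // Random jitter: pick a uniformly random delay in [0, ceiling) so that
  // clients disconnected at the same moment do not reconnect together.
  return Math.floor(random() * ceiling);
}
```

A client would typically call this from its `close` handler, passing the event's `code` and a counter of failed attempts, and schedule the reconnect with `setTimeout`.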

If you're using a managed service like Render or Railway, pay particular attention to whether client connections are actually transferred gracefully during deployments.

Many managed services that tout zero-downtime deployments will wait until all outstanding requests are processed before shutting down a server. Since WebSocket connections are persistent, this can lead to situations in which old servers are active for minutes or even hours after a deploy until the managed service forcibly terminates the process.

Establish a consistent message schema

While HTTP comes with built-in routing conventions (GET /user, POST /company, PUT /settings), WebSockets require developers to define their own schema for organizing messages.
At Compose, every WebSocket message starts with a fixed 2-byte type prefix for categorizing messages.
  • It's space-efficient (only 2 bytes), while still scaling to 65,536 different types.

  • It enables clients to reliably slice the type prefix from the message without affecting the rest of the data, since the prefix is always 2 bytes.

  • It gives us a simple method for upgrading our APIs by versioning message types.

// Each message type maps to a fixed 2-character header.
const MESSAGE_TYPE_TO_HEADER = {
  RENDER_UI: "aa",
  UPDATE_UI: "ab",
  SHOW_LOADING: "ac",
  RENDER_UI_V2: "ad",
  /* ... */
} as const;

Additionally, we use delimiters to separate different fields inside the message, which is both faster to encode/decode and more memory-efficient than JSON.

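const DELIMITER = "|";

// Restricting `type` to known keys keeps the header lookup type-safe.
function createDelimitedMessage(
  type: keyof typeof MESSAGE_TYPE_TO_HEADER,
  args: unknown[]
): string {
  return [MESSAGE_TYPE_TO_HEADER[type], ...args].join(DELIMITER);
}

// Note: `type` here is the 2-character header (e.g. "aa"), not the key name.
function parseDelimitedMessage(message: string) {
  const [type, ...args] = message.split(DELIMITER);
  return { type, args };
}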
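
Because the header is always two characters, a receiver can also slice it off directly and map it back to a message type. The following sketch illustrates one way to do this. `HEADER_TO_MESSAGE_TYPE`, `encodeMessage`, and `decodeMessage` are hypothetical names, and the sketch assumes that field values never contain the delimiter (otherwise they would need escaping).

```typescript
const MESSAGE_TYPE_TO_HEADER = {
  RENDER_UI: "aa",
  UPDATE_UI: "ab",
  SHOW_LOADING: "ac",
  RENDER_UI_V2: "ad",
} as const;

type MessageType = keyof typeof MESSAGE_TYPE_TO_HEADER;

const DELIMITER = "|";
const HEADER_LENGTH = 2;

// Reverse lookup table, built once (hypothetical helper).
const HEADER_TO_MESSAGE_TYPE = new Map<string, MessageType>(
  (Object.keys(MESSAGE_TYPE_TO_HEADER) as MessageType[]).map(
    (type): [string, MessageType] => [MESSAGE_TYPE_TO_HEADER[type], type]
  )
);

function encodeMessage(type: MessageType, args: string[]): string {
  return [MESSAGE_TYPE_TO_HEADER[type], ...args].join(DELIMITER);
}

function decodeMessage(message: string): { type: MessageType; args: string[] } {
  // The header is fixed-width, so it can be sliced without scanning.
  const header = message.slice(0, HEADER_LENGTH);
  const type = HEADER_TO_MESSAGE_TYPE.get(header);
  if (type === undefined) {
    throw new Error(`Unknown message header: ${header}`);
  }
  // Skip the header and the delimiter that follows it, if any.
  const body = message.slice(HEADER_LENGTH + DELIMITER.length);
  const args = message.length > HEADER_LENGTH ? body.split(DELIMITER) : [];
  return { type, args };
}
```

Rejecting unknown headers at the decoding boundary means a client running an older build fails loudly on a message type it does not understand, rather than misinterpreting the fields.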

We're lucky that our backend and frontend are written in TypeScript, allowing us to share message schemas between the two and ensure that neither falls out of sync.

Detect silent disconnects with heartbeats

Connections can drop unexpectedly without triggering a close event, leaving the client believing it is connected when it isn't. To prevent stale connections, implementing a robust heartbeat mechanism is essential.
We send periodic ping/pong messages between client and server and reconnect when a heartbeat isn't received within a set interval.
Our server sends a ping message every 30 seconds and expects a pong in response. If the client goes 45 seconds without receiving a ping, it immediately drops the connection and tries to reconnect. Similarly, the server closes any connection that fails to return a pong within 45 seconds.

By monitoring heartbeats on both ends, we detect and handle rare cases where the client-side network appears functional but the server never receives responses.
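
One way to structure the timeout check on either side is a small watchdog that records when the last heartbeat arrived. This is a sketch under assumptions: `HeartbeatWatchdog` is a hypothetical class, not part of Compose's SDK, and the clock is passed in so the logic can be exercised without real timers. The 45 second timeout comes from the description above.

```typescript
const HEARTBEAT_TIMEOUT_MS = 45_000; // either side gives up after this long

class HeartbeatWatchdog {
  private readonly timeoutMs: number;
  private readonly now: () => number;
  private lastSeenMs: number;

  constructor(
    timeoutMs: number = HEARTBEAT_TIMEOUT_MS,
    now: () => number = () => Date.now()
  ) {
    this.timeoutMs = timeoutMs;
    this.now = now;
    // Treat connection establishment as the first heartbeat.
    this.lastSeenMs = now();
  }

  // Call whenever a ping (client side) or pong (server side) arrives.
  recordHeartbeat(): void {
    this.lastSeenMs = this.now();
  }

  // True once more than `timeoutMs` has passed since the last heartbeat.
  isExpired(): boolean {
    return this.now() - this.lastSeenMs > this.timeoutMs;
  }
}
```

A client would call `recordHeartbeat()` on each ping and poll `isExpired()` on a timer, closing the socket and reconnecting when it returns true. The server would do the same with pongs, keeping one watchdog per connection.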

Have an HTTP fallback

WebSocket connections can be unexpectedly blocked, especially on restrictive public networks. To mitigate such issues, Compose uses server-sent events (SSE) as a fallback for receiving updates, while HTTP requests handle client-to-server communication.

Since SSE is HTTP-based, it's much less likely to be blocked, providing a reliable alternative in restricted environments. Plus it still achieves decently low latency, especially compared to short-polling solutions.
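
The split described above, SSE for server-to-client updates and plain HTTP requests for client-to-server messages, can be sketched as follows. Everything specific here is an assumption for illustration: the `/events` and `/messages` paths, the `session` query parameter, and the `createSseFallback` name are hypothetical. The browser APIs are injected through a small interface so the sketch does not depend on a particular runtime.

```typescript
interface FallbackDeps {
  // Opens an event stream and returns a function that closes it.
  subscribe: (url: string, onData: (data: string) => void) => () => void;
  // Sends one message to the server.
  post: (url: string, body: string) => Promise<unknown>;
}

interface FallbackConnection {
  send(message: string): Promise<unknown>;
  close(): void;
}

function createSseFallback(
  baseUrl: string,
  sessionId: string,
  onMessage: (message: string) => void,
  deps: FallbackDeps
): FallbackConnection {
  const query = `session=${encodeURIComponent(sessionId)}`;

  // Server-to-client: a long-lived HTTP response streamed as events.
  const unsubscribe = deps.subscribe(`${baseUrl}/events?${query}`, onMessage);

  return {
    // Client-to-server: one ordinary HTTP request per message.
    send: (message) => deps.post(`${baseUrl}/messages?${query}`, message),
    close: () => unsubscribe(),
  };
}

// In a browser, the dependencies could be wired to the built-in APIs:
//
//   subscribe: (url, onData) => {
//     const source = new EventSource(url);
//     source.onmessage = (event) => onData(event.data);
//     return () => source.close();
//   },
//   post: (url, body) => fetch(url, { method: "POST", body }),
```

Keeping the same `send`/`close` shape as the WebSocket path lets the rest of the client stay unaware of which transport is in use.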

Concluding thoughts

There's a whole lot more to scaling WebSockets that we didn't cover here. For example:
  • Lack of standard tooling: While most frameworks include built-in tools for rate limiting, data validation, and error handling, you'll generally have to implement these features on your own for WebSockets.

  • Inability to cache responses: Edge networks make it easy to cache HTTP responses close to users, but there's no standard way to accomplish this with WebSockets.

  • Per-message authentication: Guarding against abuse by ensuring that each message is valid for that user before processing it.

But regardless of the complexity, users expect modern applications to be fast, real-time, and collaborative. And, as of now, there's no better way to achieve that than WebSockets.

At Compose, WebSockets power the entire platform - from the database all the way to the main UI thread. Via our SDKs, developers can generate full web apps from their backend logic. Making sure those apps are fast and performant at scale requires WebSockets. If you're interested in learning more, check out our docs. It takes less than 5 minutes to install the SDK and build your first app.

2025 Compose. All rights reserved.