Scaling Real-Time Collaboration: Why CRDTs are Non-Negotiable in 2026
Stop fighting race conditions and database corruption in your collaborative apps. Learn how to implement Yjs and WebSockets to build seamless, local-first experiences that actually scale in production.

The Fallacy of Last-Writer-Wins
I once watched a production database melt because three users tried to edit the same JSON blob simultaneously using a 'Last Writer Wins' (LWW) strategy. We were building a project management tool, and a simple task description update turned into a data loss nightmare. One user's 500-word update was wiped out by another user's single-character typo fix because their request arrived 200ms later. If you are still relying on REST PUT requests or naive WebSocket overrides for shared state in 2026, you are building on a foundation of sand.
Real-time collaboration is no longer a 'nice-to-have' feature; it is the baseline expectation. But the 'how' has shifted. We moved from the complexity of Operational Transformation (OT)—which powered Google Docs but required a PhD to implement correctly—to Conflict-free Replicated Data Types (CRDTs). CRDTs allow us to treat distributed state as a mathematical certainty rather than a networking gamble.
Why CRDTs? The Mathematics of Convergence
In a distributed system, latency is a law of physics. You cannot guarantee that User A's edit will reach the server before User B's edit. CRDTs solve this by ensuring that as long as all nodes receive the same set of updates, they will eventually converge to the exact same state, regardless of the order in which those updates arrived.
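To make the convergence claim concrete, here is a minimal sketch of a toy state-based CRDT, a grow-only counter. It is not how Yjs represents text, and the names `GCounter`, `increment`, `merge`, and `value` are my own illustrative choices. The only point is that the merge function is commutative, associative, and idempotent, so replicas agree no matter how updates are ordered or duplicated in transit.

```typescript
// Toy state-based CRDT: a grow-only counter (G-Counter).
// Illustrative only; Yjs uses a far more sophisticated sequence CRDT.
type GCounter = Record<string, number>; // replicaId -> increments made by that replica

function increment(state: GCounter, replicaId: string, by = 1): GCounter {
  return { ...state, [replicaId]: (state[replicaId] ?? 0) + by };
}

// Merge takes the per-replica maximum. Because max is commutative,
// associative, and idempotent, delivery order and duplicates do not matter.
function merge(a: GCounter, b: GCounter): GCounter {
  const out: GCounter = { ...a };
  for (const [id, n] of Object.entries(b)) {
    out[id] = Math.max(out[id] ?? 0, n);
  }
  return out;
}

function value(state: GCounter): number {
  return Object.values(state).reduce((sum, n) => sum + n, 0);
}

// Three replicas make concurrent edits while disconnected.
const fromA = increment(increment({}, 'A'), 'A'); // A counted 2
const fromB = increment({}, 'B');                 // B counted 1
const fromC = increment({}, 'C', 3);              // C counted 3

// Two peers receive the same states in different orders (peer2 even gets a duplicate).
const peer1 = [fromA, fromB, fromC].reduce<GCounter>(merge, {});
const peer2 = [fromC, fromA, fromB, fromA].reduce<GCounter>(merge, {});

console.log(value(peer1), value(peer2)); // both peers report 6
```

Compare this with Last Writer Wins, where the result depends entirely on which request happened to arrive last.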
In 2026, the industry has standardized around Yjs for its performance and Automerge for its developer ergonomics. I prefer Yjs for production systems because its binary encoding is significantly more efficient for large documents. When you're syncing a 5MB document over a 5G connection with 20% packet loss, those bytes matter.
The 2026 Stack: Bun, Hocuspocus, and Yjs
For this implementation, we are using Bun for the runtime (the native WebSocket implementation is roughly 3x faster than Node.js 22), Hocuspocus as the WebSocket backend framework, and Yjs for the CRDT logic. This combination allows us to handle thousands of concurrent connections on a single $20/month VPS.
Implementation: The Frontend Hook
First, let's look at how we initialize a shared document on the client. We want a local-first experience: the user sees their changes instantly, and the synchronization happens in the background.
import * as Y from 'yjs';
import { HocuspocusProvider } from '@hocuspocus/provider';

export const initializeCollaboration = (roomId: string, token: string) => {
  const doc = new Y.Doc();

  const provider = new HocuspocusProvider({
    url: 'wss://api.yourdomain.io',
    name: roomId,
    document: doc,
    token: token,
    onConnect: () => console.log('Connected to sync server'),
    onStatus: ({ status }) => console.log(`Sync status: ${status}`),
  });

  // Shared types: Map for metadata, Text for the editor
  const sharedContent = doc.getText('content');
  const sharedMeta = doc.getMap('metadata');

  return { doc, sharedContent, sharedMeta, provider };
};
Implementation: The Backend Gateway
On the server, we need to handle the WebSocket lifecycle, authentication, and persistence. Hocuspocus abstracts the Yjs sync protocol so you don't have to manually handle binary Uint8Array diffs.
import * as Y from 'yjs';
import { Server } from '@hocuspocus/server';
import { Redis } from '@hocuspocus/extension-redis';
import { Database } from './db';
// Assumption: validateToken is your own JWT helper; the './auth' path is hypothetical.
import { validateToken } from './auth';

const server = Server.configure({
  port: 1234,

  async onAuthenticate(data) {
    const { token } = data;
    // Always validate JWTs at the WS handshake level
    const user = await validateToken(token);
    if (!user) throw new Error('Unauthorized');
    // The returned object becomes the connection context
    return { user };
  },

  async onLoadDocument(data) {
    // Fetch existing state from Postgres/ClickHouse
    const row = await Database.getDoc(data.documentName);
    if (row) {
      // Apply the stored binary update (a Uint8Array) to the Yjs Doc
      Y.applyUpdate(data.document, row.content);
    }
    return data.document;
  },

  async onStoreDocument(data) {
    // Persist the full document state, encoded as a single binary update
    const update = Y.encodeStateAsUpdate(data.document);
    await Database.saveDoc(data.documentName, update);
  },

  extensions: [
    new Redis({
      host: '127.0.0.1',
      port: 6379,
    }),
  ],
});

server.listen();
The Infrastructure Reality: Scaling to 100k Users
WebSockets are stateful. Unlike REST, you cannot just throw a standard round-robin load balancer in front and call it a day. You need sticky sessions or, better yet, a pub/sub backplane. In the code above, the Redis extension ensures that if User A is connected to Server 1 and User B is connected to Server 2, their changes are broadcast across the cluster.
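The backplane idea can be sketched with an in-memory stand-in for Redis. Everything here (`Backplane`, `SyncNode`, the `Update` shape) is hypothetical and only illustrates the fan-out pattern; it is not how the Hocuspocus Redis extension is implemented internally.

```typescript
// In-memory stand-in for a pub/sub backplane such as Redis.
// Hypothetical names throughout; this only illustrates the fan-out pattern.
type Update = { room: string; origin: string; payload: Uint8Array };
type Handler = (update: Update) => void;

class Backplane {
  private handlers: Handler[] = [];

  subscribe(handler: Handler): void {
    this.handlers.push(handler);
  }

  publish(update: Update): void {
    this.handlers.forEach((handler) => handler(update));
  }
}

class SyncNode {
  // room -> payloads delivered to clients connected to this node
  readonly delivered = new Map<string, Uint8Array[]>();
  readonly id: string;
  private backplane: Backplane;

  constructor(id: string, backplane: Backplane) {
    this.id = id;
    this.backplane = backplane;
    // Relay updates that originated on other nodes to local clients.
    backplane.subscribe((update) => {
      if (update.origin !== this.id) this.deliverLocally(update);
    });
  }

  // Called when a client connected to this node sends an update.
  receiveFromClient(room: string, payload: Uint8Array): void {
    const update: Update = { room, origin: this.id, payload };
    this.deliverLocally(update);    // fan out to this node's sockets
    this.backplane.publish(update); // fan out to the rest of the cluster
  }

  private deliverLocally(update: Update): void {
    const list = this.delivered.get(update.room) ?? [];
    list.push(update.payload);
    this.delivered.set(update.room, list);
  }
}

const bus = new Backplane();
const server1 = new SyncNode('server-1', bus);
const server2 = new SyncNode('server-2', bus);

// User A is connected to server 1; clients on server 2 still receive the change.
server1.receiveFromClient('doc-42', new Uint8Array([1, 2, 3]));
console.log(server2.delivered.get('doc-42')?.length); // 1
```

The origin check is what keeps a node from re-delivering its own update when the backplane echoes it back.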
In a project I led in 2025, we hit a bottleneck at 15,000 concurrent users. The issue wasn't CPU; it was memory fragmentation. Each Yjs document in memory takes up space. We solved this by implementing an 'LRU Cache' for active documents and aggressively offloading inactive ones to S3 as binary blobs. Don't keep every document in RAM just because it's 'real-time'.
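A minimal sketch of that eviction pattern, assuming a simple capacity-based policy. `LruCache` and its `onEvict` hook are hypothetical names; in a real system the hook would persist the document's binary snapshot (to S3 or similar) before the entry is dropped, and would likely be asynchronous.

```typescript
// Minimal LRU cache for in-memory documents, relying on Map's insertion order.
// `onEvict` is a hypothetical hook where you would persist the document
// (for example, upload a binary snapshot to object storage) before dropping it.
class LruCache<V> {
  private entries = new Map<string, V>();
  private capacity: number;
  private onEvict: (key: string, value: V) => void;

  constructor(capacity: number, onEvict: (key: string, value: V) => void) {
    this.capacity = capacity;
    this.onEvict = onEvict;
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldestKey = this.entries.keys().next().value as string;
      const oldest = this.entries.get(oldestKey) as V;
      this.entries.delete(oldestKey);
      this.onEvict(oldestKey, oldest);
    }
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }
}

const evicted: string[] = [];
const cache = new LruCache<Uint8Array>(2, (key) => evicted.push(key));

cache.set('doc-a', new Uint8Array([1]));
cache.set('doc-b', new Uint8Array([2]));
cache.get('doc-a');                      // doc-a is now the most recently used
cache.set('doc-c', new Uint8Array([3])); // over capacity: doc-b is evicted

console.log(evicted); // [ 'doc-b' ]
```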
Gotchas: What the Docs Don't Tell You
1. Tombstone Accumulation
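CRDTs work by never truly deleting data; they mark it as deleted (a 'tombstone'). If you have a document where users are constantly deleting and re-typing (like a high-traffic chat room), the document and any stored update log keep growing. Yjs garbage-collects the content of deleted items by default (doc.gc is true), but lightweight tombstone markers remain, and a persistence layer that appends every incremental update will still bloat. Periodically squash the update history into a single snapshot: in Yjs, Y.encodeStateAsUpdate(doc) encodes the current state as one update that can replace the stored log.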
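Here is a rough sketch of that compaction step, assuming a persistence layer that appends every incremental update to a log. The `log` array stands in for your storage and is an assumption of this example; `Y.encodeStateAsUpdate` and `Y.applyUpdate` are the standard Yjs calls.

```typescript
import * as Y from 'yjs';

// Simulate a persistence layer that appended every incremental update.
const live = new Y.Doc();
const log: Uint8Array[] = []; // stand-in for rows in your database
live.on('update', (update: Uint8Array) => {
  log.push(update);
});

const text = live.getText('content');
for (let i = 0; i < 200; i++) {
  text.insert(0, 'typing and deleting '); // 20 characters
  text.delete(0, 20);                     // removed again straight away
}
text.insert(0, 'final text');

// Compact: replace the whole log with a single snapshot of the current state.
const snapshot = Y.encodeStateAsUpdate(live);

// A fresh document restored from the snapshot has the same content.
const restored = new Y.Doc();
Y.applyUpdate(restored, snapshot);

const logBytes = log.reduce((sum, u) => sum + u.byteLength, 0);
console.log({ updates: log.length, logBytes, snapshotBytes: snapshot.byteLength });
console.log(restored.getText('content').toString()); // final text
```

Run the compaction on a schedule or when the log passes a size threshold, and keep the snapshot write and the log deletion in one transaction so a crash cannot lose updates.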
2. The Awareness State Overhead
Presence features ('Who is typing', mouse cursors) should not be persisted. Use the Yjs Awareness protocol for this. It's ephemeral and stays in RAM. I've seen developers try to save mouse coordinates to Postgres. Don't be that person. Your database will hate you, and your latency will spike.
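As a sketch of what that looks like in practice: the interface below is a structural subset of the Awareness class from y-protocols (`setLocalStateField` and `getStates` are real methods there), while `publishCursor`, `readCursors`, and `FakeAwareness` are hypothetical helpers so the example runs without a server. With Hocuspocus you would pass the provider's awareness instance instead of the fake.

```typescript
// Structural subset of the Awareness class from y-protocols, so this sketch
// runs without a network connection. FakeAwareness below is a test double only.
interface AwarenessLike {
  setLocalStateField(field: string, value: unknown): void;
  getStates(): Map<number, Record<string, unknown>>;
}

type Cursor = { x: number; y: number };

// Publish presence through awareness only; nothing here touches the database.
function publishCursor(awareness: AwarenessLike, name: string, cursor: Cursor): void {
  awareness.setLocalStateField('user', { name });
  awareness.setLocalStateField('cursor', cursor);
}

// Collect the cursors of every connected client for rendering.
function readCursors(awareness: AwarenessLike): Array<{ clientId: number; cursor: Cursor }> {
  const result: Array<{ clientId: number; cursor: Cursor }> = [];
  awareness.getStates().forEach((state, clientId) => {
    const cursor = state['cursor'] as Cursor | undefined;
    if (cursor) result.push({ clientId, cursor });
  });
  return result;
}

// In-memory fake used only to exercise the functions above (hypothetical).
class FakeAwareness implements AwarenessLike {
  private states = new Map<number, Record<string, unknown>>();
  private clientId: number;

  constructor(clientId: number) {
    this.clientId = clientId;
  }

  setLocalStateField(field: string, value: unknown): void {
    const current = this.states.get(this.clientId) ?? {};
    this.states.set(this.clientId, { ...current, [field]: value });
  }

  getStates(): Map<number, Record<string, unknown>> {
    return this.states;
  }
}

const awareness = new FakeAwareness(7);
publishCursor(awareness, 'Ada', { x: 120, y: 48 });
console.log(readCursors(awareness));
```

Because awareness state never enters the Yjs document, it is not included in what onStoreDocument persists.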
3. Binary Protocol Debugging
WebSockets using CRDTs send binary data. Your browser's 'Network' tab will show you gibberish. Use the Yjs Inspector or a custom logging middleware in Hocuspocus to decode these updates during development, or you will spend hours wondering why a document isn't syncing when it's actually just a version mismatch between client and server packages.
Takeaway
Stop building 'real-time' features with 2015-era polling or simple JSON overrides. Transition your stack to a local-first architecture using Yjs and a dedicated WebSocket orchestrator like Hocuspocus. Today, your action item is to audit your most 'collaborative' data model and replace one 'Last Writer Wins' endpoint with a CRDT-backed sync provider. Your users' data integrity depends on it.