Yjs

Anyone who has tried to build real-time collaboration into an app knows the hard part is not the WebSocket. The hard part is what happens when two users edit the same paragraph at the same time while one of them is on flaky train Wi-Fi. Yjs solves that problem. It is a CRDT (Conflict-free Replicated Data Type) library that gives every inserted character a unique ID, so each insertion becomes a mergeable item. Two users can edit the same word simultaneously, go offline, come back 20 minutes later, and their changes merge without losing anything. No server is needed to decide who wins.

Notion, TipTap, and several multiplayer coding tools use Yjs in production. It handles the sync math, freeing teams to focus on the editor UI.

How YATA Works (Concrete Example)

YATA (Yet Another Transformation Approach) is the algorithm behind Yjs. It represents a document as a doubly-linked list of items. Every item has a globally unique ID.

Start with an empty document. Two users type simultaneously without seeing each other's edits.

User A (clientID=1) types "Hi":

Item { id: (1, 0), content: "H", originLeft: null, originRight: null }
Item { id: (1, 1), content: "i", originLeft: (1,0), originRight: null }

User B (clientID=2) types "Ok" from the same empty document:

Item { id: (2, 0), content: "O", originLeft: null, originRight: null }
Item { id: (2, 1), content: "k", originLeft: (2,0), originRight: null }

Both users now sync. The question: does the final document read "HiOk" or "OkHi"?

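The tie-break rule: when two items share the same originLeft (both "H" and "O" have originLeft: null, meaning each was inserted at the start of the document), compare their clientIDs. The item with the lower clientID takes the leftward position. Items that do not conflict, such as "i" (originLeft: (1,0)) and "k" (originLeft: (2,0)), stay directly after their own originLeft.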

Since 1 < 2, User A's "H" goes before User B's "O". The merged document reads "HiOk" on both clients. Always. If the clientIDs were reversed, it would be "OkHi" on both clients. The specific order is arbitrary, but it is deterministic. Both clients arrive at the same result without any server telling them what to do.
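
The tie-break can be sketched in a few lines of TypeScript. This is a simplified model, not the Yjs source: it ignores originRight, assumes updates arrive in causal order, and the names YataItem, integrate, and mergeDocs are made up for illustration.

```typescript
// Simplified model of YATA ordering. Not the real Yjs implementation:
// it ignores originRight and assumes updates arrive in causal order.
type ItemId = { client: number; clock: number };

interface YataItem {
  id: ItemId;
  content: string;
  originLeft: ItemId | null; // item directly to the left at creation time
}

// Position of the item with the given ID. Returns -1 for "start of document"
// (a null ID); an unknown ID is also treated as the start in this sketch.
function indexOfId(list: YataItem[], id: ItemId | null): number {
  if (id === null) return -1;
  for (let i = 0; i < list.length; i++) {
    if (list[i].id.client === id.client && list[i].id.clock === id.clock) return i;
  }
  return -1;
}

// Insert `item` to the right of its originLeft, skipping past concurrent
// items that win the tie-break (lower clientID) and their descendants.
function integrate(list: YataItem[], item: YataItem): void {
  const leftIndex = indexOfId(list, item.originLeft);
  let pos = leftIndex + 1;
  while (pos < list.length) {
    const other = list[pos];
    const otherLeft = indexOfId(list, other.originLeft);
    if (otherLeft < leftIndex) break; // other belongs to an earlier insert point
    if (otherLeft === leftIndex && other.id.client > item.id.client) break; // we win the tie
    pos++;
  }
  list.splice(pos, 0, item);
}

// Merge remote items into a copy of the local list and render the text.
function mergeDocs(local: YataItem[], remote: YataItem[]): string {
  const list = local.slice();
  remote.forEach((item) => integrate(list, item));
  return list.map((item) => item.content).join("");
}

const itemsA: YataItem[] = [
  { id: { client: 1, clock: 0 }, content: "H", originLeft: null },
  { id: { client: 1, clock: 1 }, content: "i", originLeft: { client: 1, clock: 0 } },
];
const itemsB: YataItem[] = [
  { id: { client: 2, clock: 0 }, content: "O", originLeft: null },
  { id: { client: 2, clock: 1 }, content: "k", originLeft: { client: 2, clock: 0 } },
];

const docOnA = mergeDocs(itemsA, itemsB); // User A receives B's items
const docOnB = mergeDocs(itemsB, itemsA); // User B receives A's items
console.log(docOnA, docOnB); // both print "HiOk"
```

Both replicas run the same function on different starting lists and still agree, which is the property the example above relies on.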

Each item in the linked list carries: {id: (clientID, clock), content, originLeft, originRight}. The originLeft and originRight pointers record which items existed to the left and right when this item was created. These pointers are what make concurrent inserts resolve correctly even when the document has been heavily edited between the time of insertion and the time of merge.

Deletions do not remove items from the list. They mark items as tombstones. The item stays in the linked list, but it is skipped during rendering. This is why Yjs documents grow over time. A document with 10,000 characters typed and 9,000 deleted still has 10,000 items internally. The 9,000 deleted items are tombstones.
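
A minimal sketch of the tombstone idea, assuming a plain array stands in for the linked list (CharItem, renderText, and deleteVisible are illustrative names, not Yjs APIs):

```typescript
// Each character is an item; deleting only flips a flag.
interface CharItem {
  content: string;
  deleted: boolean;
}

// Rendering skips tombstones.
function renderText(items: CharItem[]): string {
  return items
    .filter((item) => !item.deleted)
    .map((item) => item.content)
    .join("");
}

// Delete the character at a visible index by marking it as a tombstone.
function deleteVisible(items: CharItem[], visibleIndex: number): void {
  let seen = 0;
  for (let i = 0; i < items.length; i++) {
    if (items[i].deleted) continue;
    if (seen === visibleIndex) {
      items[i].deleted = true;
      return;
    }
    seen++;
  }
}

const tombstoneDemo: CharItem[] = "Hello".split("").map((ch) => ({ content: ch, deleted: false }));
deleteVisible(tombstoneDemo, 4); // removes "o"
deleteVisible(tombstoneDemo, 0); // removes "H"

const visibleText = renderText(tombstoneDemo); // "ell"
const internalCount = tombstoneDemo.length;    // still 5
console.log(visibleText, internalCount);
```

The visible text shrinks while the internal item count stays the same, which is the growth pattern described above.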

The Sync Protocol

When two Yjs instances connect, they need to figure out what each side is missing. The protocol works in two steps.

Step 1: Exchange state vectors. Each client sends a state vector: a map of {clientID: highestClock} pairs. For example: {1: 5, 2: 12} means "I have seen all operations from client 1 up to clock 5, and all from client 2 up to clock 12."

Step 2: Send missing updates. The other side compares the incoming state vector against its own. It finds the gap: "Client 1's operations 6 through 8 are missing, and client 3 has never been seen." It encodes those missing updates into a binary blob and sends them.
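
The two steps can be modeled as follows. This is a sketch over plain objects, not the binary Yjs protocol; SyncOp, stateVectorOf, and missingFor are hypothetical helpers, and the clock values mirror the example in the text.

```typescript
// clientID -> highest clock seen from that client
type StateVector = { [clientId: number]: number };

interface SyncOp {
  client: number;
  clock: number;
  content: string;
}

// Step 1: summarize what a replica has seen.
function stateVectorOf(ops: SyncOp[]): StateVector {
  const sv: StateVector = {};
  ops.forEach((op) => {
    const current = sv[op.client];
    if (current === undefined || op.clock > current) sv[op.client] = op.clock;
  });
  return sv;
}

// Step 2: pick the local operations the remote side has not seen yet.
function missingFor(remote: StateVector, localOps: SyncOp[]): SyncOp[] {
  return localOps.filter((op) => {
    const seen = remote[op.client];
    return seen === undefined || op.clock > seen;
  });
}

// Helper for the demo: operations 0..lastClock from one client.
function opsFrom(client: number, lastClock: number): SyncOp[] {
  const ops: SyncOp[] = [];
  for (let clock = 0; clock <= lastClock; clock++) {
    ops.push({ client, clock, content: "x" });
  }
  return ops;
}

const localOps = opsFrom(1, 8).concat(opsFrom(2, 12), opsFrom(3, 1));
const remoteVector: StateVector = { 1: 5, 2: 12 };

const toSend = missingFor(remoteVector, localOps);
console.log(toSend.map((op) => op.client + ":" + op.clock).join(" "));
// prints "1:6 1:7 1:8 3:0 3:1"
```

The remote side never lists what it lacks; it only states what it has, and the sender derives the gap.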

The binary encoding is roughly 10x smaller than a JSON equivalent. A typical character insertion is 5-10 bytes in binary versus 50-100 bytes in JSON. For a document with 100 active users typing continuously, this difference is the line between smooth and laggy.
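
To get a feel for where the size difference comes from, the sketch below packs one item into variable-length integers and compares it with JSON.stringify. The layout is an illustrative assumption, not the actual Yjs wire format, and it handles ASCII content only.

```typescript
interface WireId {
  client: number;
  clock: number;
}

interface WireItem {
  id: WireId;
  content: string;
  originLeft: WireId | null;
  originRight: WireId | null;
}

// Variable-length unsigned integer: 7 bits per byte, high bit = "more follows".
function writeVarUint(out: number[], value: number): void {
  while (value > 127) {
    out.push((value % 128) | 128);
    value = Math.floor(value / 128);
  }
  out.push(value);
}

// Hypothetical compact layout: flags, id, optional origins, content.
function encodeItem(item: WireItem): Uint8Array {
  const out: number[] = [];
  out.push((item.originLeft ? 1 : 0) | (item.originRight ? 2 : 0));
  writeVarUint(out, item.id.client);
  writeVarUint(out, item.id.clock);
  if (item.originLeft) {
    writeVarUint(out, item.originLeft.client);
    writeVarUint(out, item.originLeft.clock);
  }
  if (item.originRight) {
    writeVarUint(out, item.originRight.client);
    writeVarUint(out, item.originRight.clock);
  }
  writeVarUint(out, item.content.length);
  for (let i = 0; i < item.content.length; i++) {
    out.push(item.content.charCodeAt(i) & 0xff); // ASCII only in this sketch
  }
  return new Uint8Array(out);
}

const sampleItem: WireItem = {
  id: { client: 1, clock: 5 },
  content: "H",
  originLeft: { client: 1, clock: 4 },
  originRight: null,
};

const binarySize = encodeItem(sampleItem).length;   // 7 bytes
const jsonSize = JSON.stringify(sampleItem).length; // far larger, mostly key names
console.log(binarySize, jsonSize);
```

Most of the JSON cost is repeated key names and punctuation. Real client IDs are larger than the single-digit ones used here, so the exact ratio will vary.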

Applying the same update twice is safe. Each item has a globally unique (clientID, clock) pair. If an update arrives that the client has already seen, it is silently skipped. This idempotency means exactly-once delivery is not required. At-least-once is fine. Duplicates are harmless.
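
Idempotent application can be sketched as a set of already-seen (clientID, clock) pairs. Replica and UpdateOp are illustrative names; real Yjs tracks this through its internal struct store rather than a lookup table of strings.

```typescript
interface UpdateOp {
  client: number;
  clock: number;
  content: string;
}

class Replica {
  private seen: { [key: string]: boolean } = {};
  public applied: UpdateOp[] = [];

  // Apply an update and return how many operations were new.
  apply(ops: UpdateOp[]): number {
    let fresh = 0;
    ops.forEach((op) => {
      const key = op.client + ":" + op.clock;
      if (this.seen[key]) return; // duplicate: silently skipped
      this.seen[key] = true;
      this.applied.push(op);
      fresh++;
    });
    return fresh;
  }
}

const replica = new Replica();
const incoming: UpdateOp[] = [
  { client: 1, clock: 0, content: "H" },
  { client: 1, clock: 1, content: "i" },
];

const firstPass = replica.apply(incoming);  // 2 new operations
const secondPass = replica.apply(incoming); // 0, the redelivery changes nothing
console.log(firstPass, secondPass, replica.applied.length);
```

Because a redelivered update is a no-op, a transport that retries on timeout needs no deduplication layer of its own.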

Awareness Protocol

Cursor positions and user presence are handled separately from document sync. This is a deliberate design choice. Cursor movements happen constantly (every keystroke), are ephemeral (nobody cares where a cursor was 5 minutes ago), and would pollute the document's CRDT history if included.

The awareness protocol broadcasts small JSON payloads: {user: "Alice", cursor: {index: 42, length: 0}, color: "#ff0000"}. Each client's awareness state expires if it has not been refreshed for 30 seconds. If Alice closes her laptop, her client stops sending refreshes and her cursor disappears from everyone else's screen within 30 seconds.
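
The expiry behavior can be modeled with a timestamp per client. PresenceTable and its methods are hypothetical; timestamps are passed in explicitly so the example runs without real timers.

```typescript
interface PresenceState {
  user: string;
  cursor: { index: number; length: number };
  color: string;
}

interface PresenceEntry {
  state: PresenceState;
  lastUpdated: number; // milliseconds
}

const AWARENESS_TIMEOUT_MS = 30000; // the 30-second window described in the text

class PresenceTable {
  private entries: { [clientId: number]: PresenceEntry } = {};

  // Record or refresh a client's state.
  set(clientId: number, state: PresenceState, now: number): void {
    this.entries[clientId] = { state, lastUpdated: now };
  }

  // Drop clients that have been silent for longer than the timeout.
  prune(now: number): number[] {
    const removed: number[] = [];
    Object.keys(this.entries).forEach((key) => {
      const clientId = Number(key);
      if (now - this.entries[clientId].lastUpdated > AWARENESS_TIMEOUT_MS) {
        delete this.entries[clientId];
        removed.push(clientId);
      }
    });
    return removed;
  }

  activeUsers(): string[] {
    return Object.keys(this.entries).map((key) => this.entries[Number(key)].state.user);
  }
}

const presence = new PresenceTable();
presence.set(1, { user: "Alice", cursor: { index: 42, length: 0 }, color: "#ff0000" }, 0);
presence.set(2, { user: "Bob", cursor: { index: 7, length: 3 }, color: "#00ff00" }, 20000);

const expired = presence.prune(31000);    // Alice has been silent for 31 s
const stillHere = presence.activeUsers(); // only Bob remains
console.log(expired, stillHere);
```

None of this state touches the document itself, so dropping an entry has no effect on the CRDT history.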

This separation keeps the CRDT document clean. The document only contains actual content changes. Cursor data flows through the same WebSocket connection but is not persisted or included in document snapshots.

Production Architecture

A typical production setup looks like this. TipTap runs in the browser with the @tiptap/extension-collaboration extension, which binds a ProseMirror editor to a Yjs document. The Yjs document connects to a Hocuspocus server over WebSocket for relay and persistence. Hocuspocus holds active documents in memory, debounces writes to PostgreSQL (storing binary Yjs snapshots), and uses Redis pub/sub for multi-server fan-out.

Rough capacity planning: a single Hocuspocus server handles about 5,000 concurrent WebSocket connections. With 100 editors per document and 10,000 active documents, the worst case is 100 × 10,000 = 1 million connections, or about 200 servers at that capacity. That calls for a cluster of Hocuspocus instances behind a load balancer, with Redis pub/sub coordinating updates across servers. See the Hocuspocus entry for scaling details.
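
The arithmetic behind that estimate, as a small helper (planCapacity is an illustrative name, and the per-server figure is the rough number quoted above, not a benchmark):

```typescript
interface CapacityPlan {
  connections: number;
  servers: number;
}

// Worst case: every active document has its full set of editors connected.
function planCapacity(
  editorsPerDoc: number,
  activeDocs: number,
  connectionsPerServer: number
): CapacityPlan {
  const connections = editorsPerDoc * activeDocs;
  const servers = Math.ceil(connections / connectionsPerServer);
  return { connections, servers };
}

const plan = planCapacity(100, 10000, 5000);
console.log(plan.connections, plan.servers); // prints 1000000 200
```

Real deployments rarely hit the worst case on every document at once, so this is an upper bound rather than a provisioning target.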

Yjs vs Automerge vs OT

Feature	Yjs	Automerge	OT (Google Docs)
Data model	Linked list (YATA)	OpSet (Lamport timestamps)	Position-based operations
Server required	No	No	Yes (central ordering)
Offline support	Full	Full	Limited
Text editing speed	~1M chars/sec insert	~50K chars/sec	N/A (server-side)
Rich type system	Y.Map, Y.Array, Y.Text	Full JSON-like with schema	Proprietary
Undo support	Per-user UndoManager	Per-user	Global undo stack
Maturity	Battle-tested (TipTap, Notion)	Maturing (v2 rewrite in Rust)	18+ years (Google Docs)

Yjs is significantly faster than Automerge for text editing because YATA's linked-list structure is optimized for sequential inserts (the most common editing pattern). Automerge's OpSet model is more general but pays for it in performance. OT (Operational Transformation) is what Google Docs uses and has the longest production track record, but it requires a central server to order operations. For offline-first or peer-to-peer use cases, OT is not an option.

Document Lifecycle and Garbage Collection

Over time, Yjs documents accumulate metadata. A document edited by 50 users for 6 months can have 500K+ internal items even if the visible text is only 10K characters. Most of those items are tombstones from deleted text plus the version metadata that makes merging work.

Compaction helps. Calling Y.encodeStateAsUpdate(doc) produces a single binary snapshot that represents the current state. Applying that snapshot to a fresh Y.Doc with Y.applyUpdate yields a document with the same content. What shrinks is mainly the stored form: a long log of small incremental updates collapses into one update.

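The gc option on new Y.Doc({gc: true}) controls tombstone garbage collection. In current Yjs releases it is enabled by default. With gc on, the content of deleted items is discarded and runs of adjacent tombstones are merged into compact placeholders, saving memory. The tradeoff: the deleted content is gone for good, so features that depend on it, such as restoring an older version of the document, stop working. Set gc: false only for documents that need that kind of history, and expect them to grow faster.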
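
One way to picture the memory saving is a pass that drops deleted content and collapses each run of tombstones into a single placeholder with a length. This is a simplified model under that assumption, not Yjs's internal code; GcItem, GcStruct, and collectGarbage are illustrative names.

```typescript
interface GcItem {
  content: string;
  deleted: boolean;
}

// After collection, a struct is either live content or a length-only placeholder.
type GcStruct =
  | { kind: "item"; content: string }
  | { kind: "gc"; length: number };

function collectGarbage(items: GcItem[]): GcStruct[] {
  const out: GcStruct[] = [];
  items.forEach((item) => {
    const last = out.length > 0 ? out[out.length - 1] : undefined;
    if (!item.deleted) {
      out.push({ kind: "item", content: item.content });
    } else if (last !== undefined && last.kind === "gc") {
      last.length += item.content.length; // extend the current tombstone run
    } else {
      out.push({ kind: "gc", length: item.content.length });
    }
  });
  return out;
}

// "abcdef" with "c", "d", and "e" deleted.
const beforeGc: GcItem[] = "abcdef".split("").map((ch) => ({
  content: ch,
  deleted: ch === "c" || ch === "d" || ch === "e",
}));

const afterGc = collectGarbage(beforeGc);
console.log(beforeGc.length, afterGc.length); // prints 6 4
```

The placeholder keeps the position and length so later merges still line up, but the deleted characters themselves cannot be recovered.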
