Event Sourcing
Store What Happened, Not Where You Are
Most systems store current state. A user's balance is $500. A document contains 3,000 words. A shopping cart has 4 items. When something changes, the old value is overwritten with the new one. The previous state is gone.
Event sourcing flips that around. Instead of storing current state, you store every change that led to it. The user deposited $200, then withdrew $50, then received a $350 transfer. The balance is derived by replaying those events. The document had 47 insert operations, 12 delete operations, and 3 formatting changes. The content is derived by applying those operations in order.
The event log is the source of truth. Current state is just a disposable cache that you can rebuild at any time. That matters the moment someone asks "what did this document look like last Tuesday?" or "which edit introduced this bug?" or "can we undo the last 5 operations without affecting other users' work?"
The Append-Only Event Log
Every state change produces an event: an immutable fact about what happened, with a type, a timestamp, a payload, and a monotonically increasing sequence number.
Event #1: { type: "TextInserted", position: 0, text: "Hello", userId: "alice", clock: 1 }
Event #2: { type: "TextInserted", position: 5, text: " world", userId: "bob", clock: 1 }
Event #3: { type: "TextDeleted", position: 5, length: 6, userId: "alice", clock: 2 }
Event #4: { type: "TextInserted", position: 5, text: " everyone", userId: "alice", clock: 3 }
Current state after replaying all four events: "Hello everyone". But the log preserves the full history: Bob's " world" was added and then Alice replaced it. No information is lost.
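The replay itself is just a left fold over the log. Here is a minimal TypeScript sketch using the four events above; `applyEvent` is an illustrative reducer over plain strings, not a real CRDT engine like Yjs, which handles concurrency far more carefully.

```typescript
// Event shapes mirror the examples above (sketch only, not Yjs internals).
type Event =
  | { type: "TextInserted"; position: number; text: string; userId: string; clock: number }
  | { type: "TextDeleted"; position: number; length: number; userId: string; clock: number };

// Apply one event to the current state.
function applyEvent(state: string, event: Event): string {
  switch (event.type) {
    case "TextInserted":
      return state.slice(0, event.position) + event.text + state.slice(event.position);
    case "TextDeleted":
      return state.slice(0, event.position) + state.slice(event.position + event.length);
  }
}

// Current state is never stored; it is derived by folding the log from the start.
function replay(log: Event[]): string {
  return log.reduce(applyEvent, "");
}

const log: Event[] = [
  { type: "TextInserted", position: 0, text: "Hello", userId: "alice", clock: 1 },
  { type: "TextInserted", position: 5, text: " world", userId: "bob", clock: 1 },
  { type: "TextDeleted", position: 5, length: 6, userId: "alice", clock: 2 },
  { type: "TextInserted", position: 5, text: " everyone", userId: "alice", clock: 3 },
];
// replay(log) → "Hello everyone"
```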
Events are only appended, never modified or deleted. The log stays honest because nobody can rewrite history. If an event says Alice deleted text at position 5, that fact is permanent. Even if the deletion is later undone, the undo itself is a new event, not a removal of the original.
In a collaborative editor, every keystroke, formatting change, cursor move, and comment creates an event. The Yjs operation log in PostgreSQL is exactly this: an append-only sequence of binary CRDT updates, partitioned by document ID.
Projections: Deriving State from Events
Raw event logs are not directly queryable. Scanning 50,000 events to answer "show me the current document content" every time a user opens the editor is not realistic.
Projections handle this. Think of a projection as a materialized view built by processing events in order. Each time a new event arrives, the projection handler applies it and updates the stored result.
In a collaborative editor, the Yjs document in memory IS a projection. It is the result of applying every CRDT operation in sequence. The document snapshot stored in PostgreSQL is a serialized projection. Elasticsearch search indexes are projections built from document content events.
You can run multiple projections from the same event log, each shaped for a different access pattern. The current document is one projection. Edit counts per user for analytics is another. A revision history timeline is a third. Same source data, three completely different views.
If a projection becomes corrupted or its schema changes, just rebuild it by replaying the event log. Projections are disposable. You can always recreate them.
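To make "same source, different views" concrete, here is a sketch of two projections folding the same hypothetical event log: one produces document content, the other produces per-user edit counts for analytics. Both function names and event shapes are illustrative.

```typescript
// One event log, two independent projections (sketch only).
type Event =
  | { type: "TextInserted"; position: number; text: string; userId: string }
  | { type: "TextDeleted"; position: number; length: number; userId: string };

// Projection 1: current document content.
function contentProjection(log: Event[]): string {
  return log.reduce(
    (state, e) =>
      e.type === "TextInserted"
        ? state.slice(0, e.position) + e.text + state.slice(e.position)
        : state.slice(0, e.position) + state.slice(e.position + e.length),
    ""
  );
}

// Projection 2: edit counts per user, shaped for an analytics dashboard.
function editCountProjection(log: Event[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of log) counts.set(e.userId, (counts.get(e.userId) ?? 0) + 1);
  return counts;
}
```

Both are disposable: if either store is corrupted or its schema changes, rerun the function over the log and the view comes back.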
Snapshots: Bounding Rebuild Time
Replaying 500,000 events to reconstruct a document takes time. For a busy collaborative document, that could mean seconds of delay on every cold load.
The fix is snapshots. Periodically, serialize the current state and store it alongside the event log. To rebuild, load the most recent snapshot and replay only the events that came after it.
A collaborative editor snapshots every 500 operations (with a cooldown). If the document has 50,000 total operations and the last snapshot was at operation 49,800, loading the document means: load the snapshot (one read) and replay 200 operations (fast). Without the snapshot, all 50,000 operations would need to be replayed.
Same idea as PostgreSQL WAL checkpoints. The snapshot is the checkpoint. The event log is the WAL. Recovery replays from the last checkpoint forward.
The tradeoff is storage. Each snapshot is a full copy of the state. For a collaborative editor, a Yjs document snapshot is typically 1.5-3x the plain text size. With snapshots every 500 operations, a document with 50,000 operations has roughly 100 snapshots. Old snapshots can be archived to cold storage (S3) or pruned, keeping only the most recent plus any user-named versions.
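The load path described above can be sketched as follows, assuming a snapshot store that hands back the latest `(seq, state)` pair and an event store queryable by sequence number. The interfaces and the `eventsAfter` query are assumptions for illustration, not a real storage API.

```typescript
// Snapshot-accelerated rebuild (sketch, hypothetical store interfaces).
interface Snapshot { seq: number; state: string }
interface Event { seq: number; apply(state: string): string }

function loadDocument(
  latestSnapshot: Snapshot | null,
  eventsAfter: (seq: number) => Event[] // hypothetical event-store query
): string {
  // Start from the checkpoint, not from sequence 0.
  let state = latestSnapshot?.state ?? "";
  const startSeq = latestSnapshot?.seq ?? 0;
  // Replay only the tail of the log.
  for (const e of eventsAfter(startSeq)) state = e.apply(state);
  return state;
}
```

With a snapshot at operation 49,800, `eventsAfter(49800)` returns 200 events instead of 50,000.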
CQRS: Separating Reads from Writes
Once you have an event log on the write side and projections on the read side, you are already doing CQRS whether you planned to or not. Writes append events. Reads query projections.
In a collaborative editor:
- Write path: User types a character → Yjs produces a CRDT operation → operation is appended to the event log → broadcast to other clients via WebSocket
- Read path: Document content is served from the in-memory Yjs projection. Search queries hit Elasticsearch projections. Version history reads from the snapshot store.
The two paths scale independently. Write throughput depends on the event log's append rate (PostgreSQL batch inserts). Read throughput depends on how projections are served (Redis for hot state, Elasticsearch for search, S3 for version history).
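The split can be shown with a toy in-memory service: writes append to the log and update the projection; reads never touch the log at all. This is a sketch under simplified assumptions (single process, string content, no WebSocket layer), not the editor's actual implementation.

```typescript
// Toy CQRS split (sketch only; names are illustrative).
type InsertEvent = { type: "TextInserted"; position: number; text: string };

class DocumentService {
  private log: InsertEvent[] = []; // write side: append-only source of truth
  private projection = "";         // read side: disposable, rebuildable cache

  // Write path: append the event, then keep projections up to date.
  write(event: InsertEvent): void {
    this.log.push(event);
    this.projection =
      this.projection.slice(0, event.position) +
      event.text +
      this.projection.slice(event.position);
    // In a real system: broadcast the event to other clients over WebSocket here.
  }

  // Read path: serve the projection directly; never replay the log per read.
  read(): string {
    return this.projection;
  }
}
```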
Schema Evolution
Events are immutable, but your system is not. New features add new event types, and old event types eventually need new fields. You cannot change what is already stored, so you need a migration strategy.
Three approaches work in practice:
Upcasting: Transform old events into the current schema at read time. When replaying event #1 (version 1 schema), an upcaster adds any missing fields with default values before the projection handler processes it. The stored event stays unchanged.
Versioned handlers: The projection handler checks the event version and applies different logic for each. TextInserted_v1 has position and text. TextInserted_v2 adds formatting marks. The handler processes both.
New event types: Instead of modifying TextInserted, introduce FormattedTextInserted as a new type. Old events remain as-is. New events use the new type. Projection handlers process both.
The worst approach is migrating old events in place. This breaks the immutability guarantee and can cause subtle bugs when replaying across schema boundaries.
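An upcaster is the lightest of the three. Here is a sketch where a hypothetical version-2 schema adds a `marks` field that version-1 events lack; the upcaster fills the default at read time, and the stored bytes never change.

```typescript
// Upcasting sketch: v1 events gain the v2 `marks` field at read time.
type TextInsertedV1 = { v: 1; type: "TextInserted"; position: number; text: string };
type TextInsertedV2 = { v: 2; type: "TextInserted"; position: number; text: string; marks: string[] };
type StoredEvent = TextInsertedV1 | TextInsertedV2;

function upcast(e: StoredEvent): TextInsertedV2 {
  if (e.v === 1) {
    // Fill the missing field with a default; the stored event is untouched.
    return { ...e, v: 2, marks: [] };
  }
  return e;
}

// Projection handlers only ever see the current (v2) schema.
```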
When Event Sourcing Is Overkill
None of this comes free. You are signing up for an append-only log, projections, snapshot management, schema evolution, and eventual consistency between the log and the projections. That is real operational overhead.
For simple CRUD applications where you never need to know "what happened before," storing current state directly is simpler, faster, and cheaper. A user profile that changes email addresses does not need to preserve every previous email in an event log.
Event sourcing pays off when:
- Audit trails are required (financial systems, healthcare, compliance)
- Temporal queries matter ("show me the document as it was at 3pm yesterday")
- Undo/redo is a core feature (collaborative editors, design tools)
- Debugging requires replay ("what sequence of events caused this state?")
- Multiple read models serve different access patterns from the same data
Collaborative editors are where event sourcing feels most at home. The operation log is already an event log. Snapshots are already projections. Version history is already temporal querying. Undo is already event reversal. You do not bolt the pattern on. It is how the system works from day one.
Key Points
- Store every state change as an immutable event in an append-only log. Current state is derived by replaying events, not stored directly. You get a complete, auditable history of how the system reached any given state
- Snapshots bound rebuild time. Without them, reconstructing state means replaying every event since the beginning. Periodic snapshots let you start from a recent checkpoint and replay only the events after it. The tradeoff: snapshot storage cost vs rebuild speed
- CQRS (Command Query Responsibility Segregation) falls out of this design almost automatically. Writes append events to the log. Reads query pre-built projections optimized for specific access patterns. Separating the two lets each side scale independently
- Event replay is the killer debugging tool. When something goes wrong, replay the event log up to the point of failure and inspect the exact state. No guessing, no log correlation, no 'works on my machine.' The events ARE the truth
- Schema evolution is the hardest operational problem. Once events are persisted, their schema is frozen. Adding fields is easy (default values). Removing or renaming fields requires versioned event handlers that can process both old and new formats indefinitely
Common Mistakes
- Not implementing snapshots from the start. A collaborative document with 50,000 edits takes seconds to rebuild from raw events. Without snapshots, every document open becomes a full replay. Add snapshot logic early, even if the initial event count is small
- Using event sourcing for everything. CRUD-heavy domains with simple read patterns gain nothing from event sourcing and pay the full complexity cost. Reserve it for domains where audit history, temporal queries, or replay capability are actual requirements
- Storing derived data in events instead of raw facts. Events should capture what happened (UserClickedButton, DocumentEdited), not what the system decided to do about it. Derived state belongs in projections, not in the event log
- Ignoring event log growth. An active collaborative document generates thousands of events per hour. Without compaction, archival, or retention policies, the event log grows without bound. PostgreSQL performance degrades as tables exceed hundreds of millions of rows without partitioning