Event Sourcing
Store What Happened, Not Where You Are
Most systems store current state. A user's balance is $500. A document contains 3,000 words. A shopping cart has 4 items. When something changes, the old value is overwritten with the new one. The previous state is gone.
Event sourcing flips that around. Instead of storing current state, you store every change that led to it. The user deposited $200, then withdrew $50, then received a $350 transfer. The balance is derived by replaying those events. The document had 47 insert operations, 12 delete operations, and 3 formatting changes. The content is derived by applying those operations in order.
The event log is the source of truth. Current state is just a disposable cache that you can rebuild at any time. That matters the moment someone asks "what did this document look like last Tuesday?" or "which edit introduced this bug?" or "can we undo the last 5 operations without affecting other users' work?"
The Append-Only Event Log
Every state change produces an event: an immutable fact about what happened, with a type, a timestamp, a payload, and a monotonically increasing sequence number.
Event #1: { type: "TextInserted", position: 0, text: "Hello", userId: "alice", clock: 1 }
Event #2: { type: "TextInserted", position: 5, text: " world", userId: "bob", clock: 1 }
Event #3: { type: "TextDeleted", position: 5, length: 6, userId: "alice", clock: 2 }
Event #4: { type: "TextInserted", position: 5, text: " everyone", userId: "alice", clock: 3 }
Current state after replaying all four events: "Hello everyone". But the log preserves the full history: Bob's " world" was added and then Alice replaced it. No information is lost.
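The replay itself is just a left fold over the log. Here is a minimal TypeScript sketch using the four events above; `applyEvent` is an illustrative reducer over plain strings, not a real CRDT engine like Yjs, which handles concurrency far more carefully.

```typescript
// Event shapes mirror the examples above (sketch only, not Yjs internals).
type Event =
  | { type: "TextInserted"; position: number; text: string; userId: string; clock: number }
  | { type: "TextDeleted"; position: number; length: number; userId: string; clock: number };

// Apply one event to the current state.
function applyEvent(state: string, event: Event): string {
  switch (event.type) {
    case "TextInserted":
      return state.slice(0, event.position) + event.text + state.slice(event.position);
    case "TextDeleted":
      return state.slice(0, event.position) + state.slice(event.position + event.length);
  }
}

// Current state is never stored; it is derived by folding the log from the start.
function replay(log: Event[]): string {
  return log.reduce(applyEvent, "");
}

const log: Event[] = [
  { type: "TextInserted", position: 0, text: "Hello", userId: "alice", clock: 1 },
  { type: "TextInserted", position: 5, text: " world", userId: "bob", clock: 1 },
  { type: "TextDeleted", position: 5, length: 6, userId: "alice", clock: 2 },
  { type: "TextInserted", position: 5, text: " everyone", userId: "alice", clock: 3 },
];
// replay(log) → "Hello everyone"
```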
Events are only appended, never modified or deleted. The log stays honest because nobody can rewrite history. If an event says Alice deleted text at position 5, that fact is permanent. Even if the deletion is later undone, the undo itself is a new event, not a removal of the original.
In a collaborative editor, every keystroke, formatting change, cursor move, and comment creates an event. The Yjs operation log in PostgreSQL is exactly this: an append-only sequence of binary CRDT updates, partitioned by document ID.
Projections: Deriving State from Events
Raw event logs are not directly queryable. Scanning 50,000 events to answer "show me the current document content" every time a user opens the editor is not realistic.
Projections handle this. Think of a projection as a materialized view built by processing events in order. Each time a new event arrives, the projection handler applies it and updates the stored result.
In a collaborative editor, the Yjs document in memory IS a projection. It is the result of applying every CRDT operation in sequence. The document snapshot stored in PostgreSQL is a serialized projection. Elasticsearch search indexes are projections built from document content events.
You can run multiple projections from the same event log, each shaped for a different access pattern. The current document is one projection. Edit counts per user for analytics is another. A revision history timeline is a third. Same source data, three completely different views.
If a projection becomes corrupted or its schema changes, just rebuild it by replaying the event log. Projections are disposable. You can always recreate them.
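To make "same source, different views" concrete, here is a sketch of two projections folding the same hypothetical event log: one produces document content, the other produces per-user edit counts for analytics. Both function names and event shapes are illustrative.

```typescript
// One event log, two independent projections (sketch only).
type Event =
  | { type: "TextInserted"; position: number; text: string; userId: string }
  | { type: "TextDeleted"; position: number; length: number; userId: string };

// Projection 1: current document content.
function contentProjection(log: Event[]): string {
  return log.reduce(
    (state, e) =>
      e.type === "TextInserted"
        ? state.slice(0, e.position) + e.text + state.slice(e.position)
        : state.slice(0, e.position) + state.slice(e.position + e.length),
    ""
  );
}

// Projection 2: edit counts per user, shaped for an analytics dashboard.
function editCountProjection(log: Event[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of log) counts.set(e.userId, (counts.get(e.userId) ?? 0) + 1);
  return counts;
}
```

Both are disposable: if either store is corrupted or its schema changes, rerun the function over the log and the view comes back.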
Snapshots: Bounding Rebuild Time
Replaying 500,000 events to reconstruct a document takes time. For a busy collaborative document, that could mean seconds of delay on every cold load.
The fix is snapshots. Periodically, serialize the current state and store it alongside the event log. To rebuild, load the most recent snapshot and replay only the events that came after it.
A collaborative editor snapshots every 500 operations (with a cooldown). If the document has 50,000 total operations and the last snapshot was at operation 49,800, loading the document means: load the snapshot (one read) and replay 200 operations (fast). Without the snapshot, all 50,000 operations would need to be replayed.
Same idea as PostgreSQL WAL checkpoints. The snapshot is the checkpoint. The event log is the WAL. Recovery replays from the last checkpoint forward.
The tradeoff is storage. Each snapshot is a full copy of the state. For a collaborative editor, a Yjs document snapshot is typically 1.5-3x the plain text size. With snapshots every 500 operations, a document with 50,000 operations has roughly 100 snapshots. Old snapshots can be archived to cold storage (S3) or pruned, keeping only the most recent plus any user-named versions.
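The load path described above can be sketched as follows, assuming a snapshot store that hands back the latest `(seq, state)` pair and an event store queryable by sequence number. The interfaces and the `eventsAfter` query are assumptions for illustration, not a real storage API.

```typescript
// Snapshot-accelerated rebuild (sketch, hypothetical store interfaces).
interface Snapshot { seq: number; state: string }
interface Event { seq: number; apply(state: string): string }

function loadDocument(
  latestSnapshot: Snapshot | null,
  eventsAfter: (seq: number) => Event[] // hypothetical event-store query
): string {
  // Start from the checkpoint, not from sequence 0.
  let state = latestSnapshot?.state ?? "";
  const startSeq = latestSnapshot?.seq ?? 0;
  // Replay only the tail of the log.
  for (const e of eventsAfter(startSeq)) state = e.apply(state);
  return state;
}
```

With a snapshot at operation 49,800, `eventsAfter(49800)` returns 200 events instead of 50,000.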
CQRS: Separating Reads from Writes
Once you have an event log on the write side and projections on the read side, you are already doing CQRS whether you planned to or not. Writes append events. Reads query projections.
In a collaborative editor:
- Write path: User types a character → Yjs produces a CRDT operation → operation is appended to the event log → broadcast to other clients via WebSocket
- Read path: Document content is served from the in-memory Yjs projection. Search queries hit Elasticsearch projections. Version history reads from the snapshot store.
The two paths scale independently. Write throughput depends on the event log's append rate (PostgreSQL batch inserts). Read throughput depends on how projections are served (Redis for hot state, Elasticsearch for search, S3 for version history).
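The split can be shown with a toy in-memory service: writes append to the log and update the projection; reads never touch the log at all. This is a sketch under simplified assumptions (single process, string content, no WebSocket layer), not the editor's actual implementation.

```typescript
// Toy CQRS split (sketch only; names are illustrative).
type InsertEvent = { type: "TextInserted"; position: number; text: string };

class DocumentService {
  private log: InsertEvent[] = []; // write side: append-only source of truth
  private projection = "";         // read side: disposable, rebuildable cache

  // Write path: append the event, then keep projections up to date.
  write(event: InsertEvent): void {
    this.log.push(event);
    this.projection =
      this.projection.slice(0, event.position) +
      event.text +
      this.projection.slice(event.position);
    // In a real system: broadcast the event to other clients over WebSocket here.
  }

  // Read path: serve the projection directly; never replay the log per read.
  read(): string {
    return this.projection;
  }
}
```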
Schema Evolution
Events are immutable, but your system is not. New features add new event types, and old event types eventually need new fields. You cannot change what is already stored, so you need a migration strategy.
Three approaches work in practice:
Upcasting: Transform old events into the current schema at read time. When replaying event #1 (version 1 schema), an upcaster adds any missing fields with default values before the projection handler processes it. The stored event stays unchanged.
Versioned handlers: The projection handler checks the event version and applies different logic for each. TextInserted_v1 has position and text. TextInserted_v2 adds formatting marks. The handler processes both.
New event types: Instead of modifying TextInserted, introduce FormattedTextInserted as a new type. Old events remain as-is. New events use the new type. Projection handlers process both.
The worst approach is migrating old events in place. This breaks the immutability guarantee and can cause subtle bugs when replaying across schema boundaries.
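An upcaster is the lightest of the three. Here is a sketch where a hypothetical version-2 schema adds a `marks` field that version-1 events lack; the upcaster fills the default at read time, and the stored bytes never change.

```typescript
// Upcasting sketch: v1 events gain the v2 `marks` field at read time.
type TextInsertedV1 = { v: 1; type: "TextInserted"; position: number; text: string };
type TextInsertedV2 = { v: 2; type: "TextInserted"; position: number; text: string; marks: string[] };
type StoredEvent = TextInsertedV1 | TextInsertedV2;

function upcast(e: StoredEvent): TextInsertedV2 {
  if (e.v === 1) {
    // Fill the missing field with a default; the stored event is untouched.
    return { ...e, v: 2, marks: [] };
  }
  return e;
}

// Projection handlers only ever see the current (v2) schema.
```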
When Event Sourcing Is Overkill
None of this comes free. You are signing up for an append-only log, projections, snapshot management, schema evolution, and eventual consistency between the log and the projections. That is real operational overhead.
For simple CRUD applications where you never need to know "what happened before," storing current state directly is simpler, faster, and cheaper. A user profile that changes email addresses does not need to preserve every previous email in an event log.
Event sourcing pays off when:
- Audit trails are required (financial systems, healthcare, compliance)
- Temporal queries matter ("show me the document as it was at 3pm yesterday")
- Undo/redo is a core feature (collaborative editors, design tools)
- Debugging requires replay ("what sequence of events caused this state?")
- Multiple read models serve different access patterns from the same data
Collaborative editors are where event sourcing feels most at home. The operation log is already an event log. Snapshots are already projections. Version history is already temporal querying. Undo is already event reversal. You do not bolt the pattern on. It is how the system works from day one.
Key Points
- Store every state change as an immutable event in an append-only log. Current state is derived by replaying events, not stored directly. You get a complete, auditable history of how the system reached any given state
- Snapshots bound rebuild time. Without them, reconstructing state means replaying every event since the beginning. Periodic snapshots let you start from a recent checkpoint and replay only the events after it. The tradeoff: snapshot storage cost vs rebuild speed
- CQRS (Command Query Responsibility Segregation) falls out of this design almost automatically. Writes append events to the log. Reads query pre-built projections optimized for specific access patterns. Separating the two lets each side scale independently
- Event replay is the killer debugging tool. When something goes wrong, replay the event log up to the point of failure and inspect the exact state. No guessing, no log correlation, no 'works on my machine.' The events ARE the truth
- Schema evolution is the hardest operational problem. Once events are persisted, their schema is frozen. Adding fields is easy (default values). Removing or renaming fields requires versioned event handlers that can process both old and new formats indefinitely
Common Mistakes
- Not implementing snapshots from the start. A collaborative document with 50,000 edits takes seconds to rebuild from raw events. Without snapshots, every document open becomes a full replay. Add snapshot logic early, even if the initial event count is small
- Using event sourcing for everything. CRUD-heavy domains with simple read patterns gain nothing from event sourcing and pay the full complexity cost. Reserve it for domains where audit history, temporal queries, or replay capability are actual requirements
- Storing derived data in events instead of raw facts. Events should capture what happened (UserClickedButton, DocumentEdited), not what the system decided to do about it. Derived state belongs in projections, not in the event log
- Ignoring event log growth. An active collaborative document generates thousands of events per hour. Without compaction, archival, or retention policies, the event log grows without bound. PostgreSQL performance degrades as tables exceed hundreds of millions of rows without partitioning