MongoDB
The document database you'll probably use at least once in your career
MongoDB is everywhere. Anyone who has worked on a modern web stack has either used it or had to justify the alternative choice. The flexible document model and the low friction of getting started are genuinely useful, especially early in a project when the schema is still shifting. Multi-document ACID transactions, added in 4.0 and extended to sharded clusters in 4.2, closed the biggest gap in the document model story, making Mongo a realistic option for transactional workloads that used to require a relational database. That said, do not confuse "viable" with "ideal." Postgres still wins for heavily relational data, and that is fine.
How It Works Internally
MongoDB stores data as BSON (Binary JSON) documents. BSON is a binary-encoded format that extends JSON with extra types: dates, binary data, ObjectId, Decimal128, regular expressions, and others. Each document can be up to 16MB and carries a unique _id field (typically an ObjectId encoding a 4-byte timestamp, a 5-byte random value, and a 3-byte counter; older versions used a machine identifier and process ID in place of the random value).
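Because the timestamp sits in the first four bytes, an ObjectId's creation time can be recovered without any driver library. A minimal pure-Python sketch (the sample ObjectId below is made up for illustration):

```python
from datetime import datetime, timezone

def objectid_timestamp(oid_hex: str) -> datetime:
    """Decode the creation time embedded in a 24-char ObjectId hex string.

    The first 4 bytes (8 hex chars) are a big-endian Unix timestamp in
    seconds; the remaining 16 hex chars hold the random value and counter.
    """
    if len(oid_hex) != 24:
        raise ValueError("ObjectId must be 24 hex characters")
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# A hypothetical ObjectId whose leading bytes decode to September 2023.
print(objectid_timestamp("650000000000000000000000"))
```

This is why sorting on _id roughly sorts by insertion time, and also why ObjectId makes a poor shard key: it is monotonically increasing.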
Under the hood, the WiredTiger storage engine (default since 3.2) manages data through in-memory pages and on-disk checkpoints. It provides document-level concurrency control via optimistic locking, so multiple threads can write to different documents in the same collection at the same time. WiredTiger compresses with snappy by default (zstd and zlib are also available) and typically hits 50-70% compression ratios on real workloads. A write-ahead journal records every write before it touches the in-memory data files, with journal commits every 100ms by default (storage.journal.commitIntervalMs). This is what keeps data intact if the process crashes.
Replication works through the oplog (operations log), a capped collection on the primary that records every write as an idempotent entry. Secondaries continuously tail this oplog and replay operations locally. When a primary goes down, the remaining members hold an election using a Raft-like protocol (replication protocol version 1, introduced in 3.2) and promote an electable secondary with the most current oplog position. Elections typically finish within 12 seconds. During that window, writes to the replica set are unavailable. Plan for it.
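The idempotency point matters: a secondary that restarts mid-replay may apply the same entry twice, so entries record resulting values rather than relative changes. A toy model (not the real oplog schema) showing why replaying twice is harmless:

```python
# Toy model of idempotent oplog replay (hypothetical entry format, not
# the real oplog schema). Each entry records the *resulting* field
# values ($set semantics), so applying it once or many times yields the
# same document state.

def apply_entry(doc: dict, entry: dict) -> dict:
    if entry["op"] == "set":
        doc = {**doc, **entry["fields"]}
    elif entry["op"] == "unset":
        doc = {k: v for k, v in doc.items() if k not in entry["fields"]}
    return doc

oplog = [
    {"op": "set", "fields": {"qty": 5}},          # recorded as qty=5, not "inc by 1"
    {"op": "set", "fields": {"status": "paid"}},
]

once = {}
for e in oplog:
    once = apply_entry(once, e)

twice = once
for e in oplog:  # simulate a secondary replaying the same batch after a restart
    twice = apply_entry(twice, e)

assert once == twice == {"qty": 5, "status": "paid"}
```

This is also why an $inc on the primary is rewritten as a $set in the oplog before secondaries see it.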
Production Architecture
Start with a 3-member replica set per shard: one primary, two secondaries, spread across three availability zones. For sharded clusters, put at least two mongos routers behind a load balancer, plus a dedicated 3-member config server replica set that holds shard metadata and chunk distribution maps.
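A sketch of what the topology above looks like as the config document you would pass to rs.initiate() in mongosh, expressed here as a Python dict; hostnames and availability-zone tags are placeholders:

```python
# Hypothetical 3-member replica set config: one member per availability
# zone, so losing a whole zone still leaves a voting majority of 2.
rs_config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "mongo-a.internal:27017", "tags": {"az": "us-east-1a"}},
        {"_id": 1, "host": "mongo-b.internal:27017", "tags": {"az": "us-east-1b"}},
        {"_id": 2, "host": "mongo-c.internal:27017", "tags": {"az": "us-east-1c"}},
    ],
}

zones = {m["tags"]["az"] for m in rs_config["members"]}
assert len(rs_config["members"]) == 3 and len(zones) == 3
```

Three members in two zones is a common mistake: lose the zone holding two members and the survivor cannot form a majority, so the set becomes read-only.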
Now, shard key selection. This is the single decision that causes the most pain if it goes wrong. A good shard key has high cardinality, low frequency, and is non-monotonic. Hashed shard keys distribute writes evenly but kill range queries. Compound shard keys (e.g., {tenant_id: 1, created_at: 1}) offer targeted queries for a single tenant while still spreading data across shards. Before MongoDB 5.0, a shard key was effectively immutable (4.4's refineCollectionShardKey can only append suffix fields). 5.0 introduced reshardCollection, but resharding a large cluster is still expensive and disruptive. Pick carefully the first time.
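"High cardinality, low frequency" can be checked against a sample of real documents before committing to a key. A minimal sketch (the order/tenant data below is invented for illustration):

```python
from collections import Counter

def shard_key_stats(docs, key):
    """Rough screen for a candidate shard key: count distinct values
    (cardinality) and the share of documents held by the most common
    value (frequency). Want: high cardinality, low top share."""
    values = [doc[key] for doc in docs]
    counts = Counter(values)
    cardinality = len(counts)
    top_share = counts.most_common(1)[0][1] / len(values)
    return cardinality, top_share

# Hypothetical sample: 1000 orders, one tenant generating 70% of them.
docs = [{"tenant_id": "t1", "order_id": i} for i in range(700)] + \
       [{"tenant_id": f"t{2 + i % 3}", "order_id": 700 + i} for i in range(300)]

print(shard_key_stats(docs, "tenant_id"))  # (4, 0.7): low cardinality, one hot value
print(shard_key_stats(docs, "order_id"))   # (1000, 0.001): unique per document
```

Here tenant_id alone fails both tests, which is exactly why the compound key {tenant_id: 1, created_at: 1} is the usual fix: the prefix keeps queries targeted while the suffix restores cardinality.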
Enable authentication, TLS on all connections, and audit logging in production. Use readPreference: secondaryPreferred for analytics queries to take load off the primary, but know that reads from secondaries can return stale data. For strong consistency, read from the primary.
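Read preference is usually set per connection via the URI, so analytics services can be pointed at secondaries without touching application code. A sketch with placeholder hostnames and database name:

```python
# Hypothetical connection string for an analytics service. readPreference,
# replicaSet, and tls are standard MongoDB URI options understood by all
# official drivers; the hosts and database here are placeholders.
analytics_uri = (
    "mongodb://mongo-a.internal:27017,mongo-b.internal:27017,"
    "mongo-c.internal:27017/reports"
    "?replicaSet=rs0&readPreference=secondaryPreferred&tls=true"
)
print(analytics_uri)
```

secondaryPreferred (rather than secondary) means the queries fall back to the primary if no secondary is available, trading isolation for availability.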
Capacity Planning
A replica set on m5.2xlarge instances (8 vCPUs, 32GB RAM) with gp3 storage handles roughly 20,000-50,000 operations per second, depending on document size and index complexity. Set WiredTiger's internal cache to 50% of (RAM - 1GB), which is the default formula, with a floor of 256MB. Watch serverStatus().wiredTiger.cache["bytes currently in the cache"] and keep the cache hit ratio above 95%.
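The default cache formula is worth making concrete, since it decides how much of the working set fits in memory. A minimal sketch:

```python
def wiredtiger_cache_gb(ram_gb: float) -> float:
    """Default WiredTiger internal cache size: the larger of
    50% of (RAM - 1 GB) or 256 MB."""
    return max(0.5 * (ram_gb - 1), 0.25)

print(wiredtiger_cache_gb(32))  # 15.5 GB on an m5.2xlarge
print(wiredtiger_cache_gb(1))   # tiny hosts floor at 0.25 GB
```

Note the remaining ~16GB on that instance is not wasted: filesystem cache holds the compressed on-disk pages, which is why overriding the formula upward usually backfires.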
The most important number to estimate is the working set size: the data plus indexes that get actively queried. If the working set fits in WiredTiger's cache, performance stays predictable. The moment it spills to disk, latency spikes and tail latencies get ugly. Monitor oplog.rs collection size and make sure it covers at least 48-72 hours of operations. This gives secondaries and disaster recovery processes enough buffer to catch up after maintenance or outages. Track globalLock.activeClients.readers and globalLock.activeClients.writers to spot concurrency bottlenecks before they bite in production.
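Sizing the oplog for that 48-72 hour target is simple division over the observed write rate. A sketch with invented numbers:

```python
def oplog_window_hours(oplog_size_gb: float, oplog_gb_per_hour: float) -> float:
    """Estimate how many hours of writes a capped oplog can retain,
    given the observed rate at which the cluster generates oplog."""
    return oplog_size_gb / oplog_gb_per_hour

# Hypothetical: a 50 GB oplog on a cluster writing 1 GB of oplog per hour.
hours = oplog_window_hours(50, 1.0)
print(hours)  # 50.0 -> just above the 48-hour floor
```

The catch is that oplog_gb_per_hour is spiky: a bulk import can burn through days of normal headroom in an hour, so size against peak rate, not the average.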
Failure Scenarios
Scenario 1: Chunk migration storm from a bad shard key. Pick a monotonically increasing shard key (ObjectId, timestamp) and all writes land on the topmost chunk of one shard's range. The balancer keeps splitting and migrating chunks to other shards, burning I/O and network bandwidth. At scale this creates a feedback loop: migrations slow down the overloaded shard, which causes more chunks to queue for migration. Spot this by checking sh.status() for jumbo chunks and looking at moveChunk operations in the mongos logs. The fix: use a hashed shard key or a compound shard key with a high-cardinality prefix. For existing deployments stuck with a bad key, MongoDB 5.0+ supports online resharding via reshardCollection, but expect it to take a while on large datasets.
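The hotspot is easy to demonstrate with a toy partitioner (a simplification of real chunk routing; chunk count and key space are invented):

```python
import hashlib

NUM_CHUNKS = 8

def range_chunk(key: int, max_key: int) -> int:
    """Range partitioning over [0, max_key): which slice the key falls in."""
    return min(key * NUM_CHUNKS // max_key, NUM_CHUNKS - 1)

def hashed_chunk(key: int) -> int:
    """Hashed partitioning: hash the key, then bucket it."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % NUM_CHUNKS

# Monotonically increasing keys (think ObjectId/timestamp) arriving now:
recent = range(990, 1000)
print({range_chunk(k, 1000) for k in recent})  # {7}: every write hits the top chunk
print(len({hashed_chunk(k) for k in recent}))  # spread across several chunks
```

Same ten writes, two distributions: under range partitioning they all queue behind one shard, under hashing they fan out, which is the whole tradeoff in one picture (and exactly what the hashed key gives up: those ten keys are no longer adjacent, so range scans touch every shard).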
Scenario 2: Oplog window exhaustion forcing a full resync. A secondary falls behind because of network issues or a heavy batch job on the primary. If the secondary's replication position falls off the end of the primary's oplog (a capped collection typically sized for 24-72 hours), it cannot resume incremental replication. It has to do a full initial sync, copying the entire dataset. For a 2TB shard, that means 12-24 hours over a 10Gbps network. During resync you lose that member's read capacity and a layer of failover redundancy. Monitor rs.status() for optimeDate lag and the oplog window reported by rs.printReplicationInfo(). Size the oplog generously with --oplogSize, alert when any secondary's lag exceeds 1 hour, and run batch operations on hidden members so production secondaries stay healthy.
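The 1-hour lag alert reduces to comparing optimes from rs.status(). A sketch of that check, with invented hostnames and timestamps:

```python
from datetime import datetime, timedelta, timezone

LAG_ALERT = timedelta(hours=1)

def lag_alerts(primary_optime: datetime, secondary_optimes: dict) -> list:
    """Given the primary's latest optime and each secondary's applied
    optime (the optimeDate values rs.status() reports per member),
    return hosts whose replication lag exceeds the alert threshold."""
    return [
        host for host, optime in secondary_optimes.items()
        if primary_optime - optime > LAG_ALERT
    ]

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
secondaries = {
    "mongo-b.internal:27017": now - timedelta(minutes=5),  # healthy
    "mongo-c.internal:27017": now - timedelta(hours=3),    # falling behind
}
print(lag_alerts(now, secondaries))  # ['mongo-c.internal:27017']
```

Alerting at 1 hour against a 48-hour oplog window leaves plenty of runway to fix the cause before the member falls off the oplog entirely.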
Pros
- • Flexible schema, no migrations needed
- • Rich query language with aggregation pipeline
- • Horizontal scaling via built-in sharding
- • Native JSON document model
- • Multi-document ACID transactions
Cons
- • Joins are limited ($lookup, added in 3.2, is the only server-side join and is costly at scale)
- • WiredTiger's cache consumes significant RAM (roughly half of system memory by default)
- • Denormalization leads to data duplication
- • Write amplification with large documents
- • Sharding requires careful key selection
When to use
- • Schema evolves frequently (startups, MVPs)
- • Data is naturally document-shaped (JSON)
- • Need flexible querying over semi-structured data
- • Rapid prototyping with changing requirements
When NOT to use
- • Highly relational data with many joins
- • Transactions spanning many documents dominate the workload (possible since 4.0, but a relational database does it better)
- • Write-heavy append-only workloads (prefer Cassandra)
- • Strict schema enforcement is required
Key Points
- • The document model stores data as BSON (Binary JSON), supporting nested objects, arrays, and types beyond JSON such as Decimal128, dates, and binary data. Each document can have a unique structure within a collection.
- • WiredTiger uses document-level locking, snappy compression by default, and B-tree storage with an on-disk write-ahead journal for crash recovery.
- • Change Streams provide a real-time CDC (Change Data Capture) API built on the oplog (replica sets and sharded clusters only). Event-driven architectures work out of the box without polling or bolting on Debezium.
- • The aggregation pipeline processes data through sequential stages ($match, $group, $lookup, $unwind) and pushes computation to the server, cutting data transfer to the application.
- • Sharding distributes data via mongos routers that consult config servers for chunk-to-shard mappings. Shard key selection is the single most important architectural decision.
- • Read concern and write concern settings provide fine-grained control over the consistency vs. latency tradeoff, from fire-and-forget writes to majority-acknowledged linearizable reads.
Common Mistakes
- ✗ Not creating indexes for query patterns. MongoDB performs collection scans (full table scans) without indexes. Use explain() to verify every production query hits an index.
- ✗ Using $where or JavaScript in queries. Server-side JavaScript execution bypasses the query optimizer, cannot use indexes, and opens security holes. Use standard query operators instead.
- ✗ Unbounded arrays in documents. Arrays that grow forever (e.g., storing all user events inside a user document) cause document growth past the 16MB BSON limit, write amplification, and index bloat.
- ✗ Not setting appropriate read/write concerns. A w:1 write concern acknowledges after the primary writes but before replication, so acknowledged writes can be rolled back on failover. Use w:majority for critical data (the server default since 5.0, but verify older deployments).
- ✗ Ignoring the oplog window for replication. If a secondary falls behind the oplog window (typically 24-72 hours of operations), it must do a full resync, which takes hours and tanks cluster performance.