Algorithm patterns & code templates
End-to-end architecture problems
Design patterns & practical code
Deep dives into core tech
Production systems & operations
TCP, HTTP, TLS, DNS & more
Consensus, replication & more
Kernel, networking & system calls
Quick reference cards for interviews
“Staff engineer interviews must evaluate scope of influence and ambiguity tolerance, not just coding ability”
“Sustainable on-call rotations need a minimum of 6-8 people. Fewer than that and individuals end up on call too frequently, which leads to burnout and attrition”
The Prometheus long-term storage that does more with less hardware
Copies data across multiple nodes for fault tolerance and read scalability
Shard rebalancing moves data between nodes when you add capacity, remove nodes, or when load becomes uneven. The core challenge is doing this without downtime and without degrading the performance of the live system, which is still serving reads and writes while the data moves.
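One common way to keep rebalancing cheap is to limit how much data has to move at all. The sketch below assumes consistent hashing with virtual nodes, which is only one of several possible placement schemes. The names `HashRing` and `plan_moves` are hypothetical helpers written for this illustration, not part of any library. It shows that adding a node reassigns only the keys the new node takes over, rather than reshuffling everything.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a stable 64-bit integer (md5 used for spreading, not security)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative sketch only)."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted list of (hash, node) pairs
        self._hashes = []  # hashes only, kept parallel to _points for bisect
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node is placed at `vnodes` positions to even out load.
        self._points.extend((_hash(f"{node}#{i}"), node) for i in range(self.vnodes))
        self._points.sort()
        self._hashes = [h for h, _ in self._points]

    def owner(self, key):
        # The first point clockwise from the key's hash owns the key.
        # Assumes the ring contains at least one node.
        idx = bisect.bisect_right(self._hashes, _hash(key)) % len(self._hashes)
        return self._points[idx][1]


def plan_moves(keys, before, after):
    """Return {key: (old_owner, new_owner)} for keys whose owner changes."""
    moves = {}
    for key in keys:
        old, new = before.owner(key), after.owner(key)
        if old != new:
            moves[key] = (old, new)
    return moves


if __name__ == "__main__":
    keys = [f"user:{i}" for i in range(10_000)]
    before = HashRing(["a", "b", "c"])
    after = HashRing(["a", "b", "c", "d"])
    moves = plan_moves(keys, before, after)
    print(f"{len(moves)} of {len(keys)} keys move, all to the new node")
```

A real migration would also copy the planned keys in small, rate-limited batches and switch ownership only after each batch is verified, so live reads and writes are not starved. That part is omitted here.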