Infrastructure | Hard | Deep Dive available
Design a Job Scheduler
Design a distributed job scheduler that handles 10M jobs/day with 100K jobs running concurrently. It must support DAG dependencies, effectively-once execution, priority queues with aging, and cron scheduling. Targets: sub-second dispatch latency and 99.99% scheduler availability. The hard problems are fencing-token correctness, denormalized DAG readiness, retry-storm protection, and multi-tenant fairness.
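To put the stated targets in perspective, here is a rough back-of-envelope calculation. It is only a sketch: it assumes load is spread evenly over the day (real traffic will peak higher), and the implied mean runtime only holds if all 100K concurrent slots are kept busy at the average arrival rate. The variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

jobs_per_day = 10_000_000
concurrent_jobs = 100_000
availability = 0.9999

# Average dispatch rate, assuming a perfectly even spread over the day.
avg_dispatch_rate = jobs_per_day / SECONDS_PER_DAY  # about 115.7 jobs/s

# Little's law (L = lambda * W): mean runtime that would sustain
# 100K concurrent jobs at the average arrival rate.
implied_mean_runtime_s = concurrent_jobs / avg_dispatch_rate  # 864 s, about 14.4 min

# Downtime budget implied by 99.99% availability over a 365-day year.
downtime_min_per_year = (1 - availability) * 365 * 24 * 60  # about 52.6 min

print(f"{avg_dispatch_rate:.1f} jobs/s average")
print(f"{implied_mean_runtime_s:.0f} s implied mean runtime")
print(f"{downtime_min_per_year:.1f} min/year downtime budget")
```

The average rate is modest; the difficulty comes from bursts (cron jobs firing on the minute, retries) and from the roughly 53 minutes per year of allowed scheduler downtime.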
Key Topics
Cron Scheduling, DAG Dependencies, Fencing Tokens, Effectively-Once Execution, Priority Queues with Aging, Multi-Tenant Fairness, Leader Election, Retry Storms & Backpressure
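As an illustration of the fencing-token idea named above, the following minimal sketch shows how a store can reject a write from a worker whose lease has been superseded. It is a single-process, in-memory toy: `JobStore`, `lease`, and `commit` are hypothetical names, and a real system would need the token check and the write to happen atomically in a durable store (for example, via a compare-and-set).

```python
from typing import Optional


class JobStore:
    """In-memory stand-in for a durable store that issues and checks fencing tokens."""

    def __init__(self) -> None:
        self._current_token: dict[str, int] = {}
        self._results: dict[str, str] = {}

    def lease(self, job_id: str) -> int:
        # Each new lease gets a strictly larger token, superseding older leases.
        token = self._current_token.get(job_id, 0) + 1
        self._current_token[job_id] = token
        return token

    def commit(self, job_id: str, token: int, result: str) -> bool:
        # Accept the write only from the holder of the most recent lease.
        if token != self._current_token.get(job_id):
            return False
        self._results[job_id] = result
        return True

    def result(self, job_id: str) -> Optional[str]:
        return self._results.get(job_id)


store = JobStore()
token_a = store.lease("job-42")  # worker A takes the job, then stalls
token_b = store.lease("job-42")  # lease expires; job is reassigned to worker B

stale_write = store.commit("job-42", token_a, "from A")  # rejected: stale token
fresh_write = store.commit("job-42", token_b, "from B")  # accepted
```

Fencing alone only protects the scheduler's own state. Side effects a stalled worker performs outside the store still need idempotency to get effectively-once behavior.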