Data Processing | Medium
Design a News Aggregator
Design a news aggregator like Google News. The system crawls 100K RSS sources, ingests 5M articles per day, deduplicates near-identical stories using MinHash + LSH, ranks articles using exponential decay scoring, detects breaking news via stream processing, and serves personalized feeds to 50M daily users at 10K requests/sec.
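To make the exponential decay scoring mentioned above concrete, here is a minimal sketch. The half-life form of the decay, the 6-hour default, and the names `decay_score` and `base_score` are illustrative assumptions, not values given in the problem statement.

```python
import math


def decay_score(base_score: float, age_hours: float, half_life_hours: float = 6.0) -> float:
    """Return base_score halved once per half_life_hours of article age.

    Equivalent to base_score * 2 ** (-age_hours / half_life_hours).
    The 6-hour default half-life is an assumed tuning value.
    """
    if age_hours < 0:
        raise ValueError("age_hours must be non-negative")
    decay_rate = math.log(2) / half_life_hours  # lambda in exp(-lambda * t)
    return base_score * math.exp(-decay_rate * age_hours)


if __name__ == "__main__":
    # A fresh but weaker story can outrank an older, stronger one.
    fresh = decay_score(base_score=80.0, age_hours=1.0)
    stale = decay_score(base_score=100.0, age_hours=12.0)
    print(fresh > stale)
```

A shorter half-life favors freshness more aggressively; a longer one lets high-engagement stories stay near the top for longer.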
Key Topics
Adaptive Polling (Valkey Sorted Set)
MinHash + LSH Deduplication
Exponential Decay Ranking
Breaking News Detection (Flink)
Personalized Feed Assembly
Fan-out Architecture
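As a rough sketch of the MinHash + LSH deduplication topic, the following in-memory example builds MinHash signatures over word shingles and uses banding to find candidate duplicates. The parameters (64 hash functions, 16 bands of 4 rows, 3-word shingles, a 0.8 similarity threshold) and all class and function names are assumptions chosen for illustration; a production system would tune them and keep the buckets in a shared store.

```python
import hashlib
from collections import defaultdict

NUM_HASHES = 64               # assumed signature length
BANDS = 16                    # assumed number of LSH bands
ROWS = NUM_HASHES // BANDS    # rows per band (4)


def shingles(text: str, k: int = 3) -> set:
    """Split text into overlapping k-word shingles."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed: int, token: str) -> int:
    """One member of a seeded 64-bit hash family, built on BLAKE2b."""
    digest = hashlib.blake2b(
        token.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(text: str) -> tuple:
    """For each hash function, keep the minimum hash over all shingles."""
    tokens = shingles(text)
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(NUM_HASHES))


def estimated_jaccard(sig_a: tuple, sig_b: tuple) -> float:
    """Fraction of matching signature positions estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


class LSHIndex:
    """Buckets signatures by band so only likely matches are compared."""

    def __init__(self) -> None:
        self.buckets = defaultdict(set)
        self.signatures = {}

    def _band_keys(self, sig: tuple):
        for band in range(BANDS):
            yield (band, sig[band * ROWS:(band + 1) * ROWS])

    def candidates(self, sig: tuple) -> set:
        found = set()
        for key in self._band_keys(sig):
            found |= self.buckets.get(key, set())
        return found

    def add(self, article_id: str, sig: tuple) -> None:
        self.signatures[article_id] = sig
        for key in self._band_keys(sig):
            self.buckets[key].add(article_id)


def find_duplicate(index: LSHIndex, text: str, threshold: float = 0.8):
    """Return the id of an indexed near-duplicate, or None if there is none."""
    sig = minhash_signature(text)
    for candidate_id in index.candidates(sig):
        if estimated_jaccard(sig, index.signatures[candidate_id]) >= threshold:
            return candidate_id
    return None
```

Two articles become candidates if any one band of their signatures matches exactly; the signature comparison then confirms or rejects the pair. More bands with fewer rows each raise recall at the cost of more candidate comparisons.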