Scheduled Callback Executor (HashedWheelTimer)
Schedule callbacks to run at future times. A naive heap-based scheduler scales to thousands of timers; for millions, use a HashedWheelTimer: bucketize callbacks by absolute deadline and sweep one bucket per tick. Used by Netty, the Linux kernel timer wheel, and Kafka.
The problem
Schedule N callbacks, each at a future time. As time advances, run callbacks at their scheduled time. Constraints:
- Add: O(log N) or better.
- Cancel: O(log N) or better.
- Tick: process expired callbacks efficiently.
- Scale: sometimes millions of in-flight timers (Kafka session timeouts, cache TTLs, message retries).
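The heap baseline these constraints describe can be sketched in a few lines. This is a minimal single-threaded version with hypothetical names (`HeapScheduler`, `tick`), where the caller passes time in explicitly; it is a sketch of the approach, not any library's API:

```java
import java.util.*;

// Minimal heap-based scheduler: O(log N) add, O(log N) pop of the earliest deadline.
// Single-threaded sketch; tick() is called from a driving loop.
class HeapScheduler {
    private record Entry(long deadlineMs, Runnable task) {}

    private final PriorityQueue<Entry> heap =
        new PriorityQueue<>(Comparator.comparingLong(Entry::deadlineMs));

    public void schedule(Runnable task, long delayMs, long nowMs) {
        heap.add(new Entry(nowMs + delayMs, task));    // O(log N) insert
    }

    // Run everything whose deadline has passed; returns how many tasks fired.
    public int tick(long nowMs) {
        int fired = 0;
        while (!heap.isEmpty() && heap.peek().deadlineMs() <= nowMs) {
            heap.poll().task().run();                  // O(log N) per expired task
            fired++;
        }
        return fired;
    }
}
```

The heap keeps the earliest deadline at the root, so each tick only inspects tasks that are actually due. The cost the article calls out shows up on every `add`: each insert rebalances the heap.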
What this tests
Algorithm design AND concurrency: picking the right data structure (heap vs. wheel), and coordinating the tick thread with worker dispatch plus safe insert/cancel.
The two-tier architecture
Production scheduler shape
- Tick thread: minimal work, advance time, dispatch expired tasks.
- Worker pool: runs the actual callbacks. Tick thread submits to pool.
Why separate: a long-running callback would otherwise block the next tick. Decoupling keeps timing accurate.
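A minimal sketch of that split, with illustrative names (`TickLoop` and the pool size are assumptions, not from any particular library): the tick loop only hands expired work to a pool and returns immediately, so a slow callback cannot delay the next tick.

```java
import java.util.concurrent.*;

// Two-tier sketch: the tick thread dispatches, the worker pool executes.
class TickLoop {
    private final ExecutorService workers = Executors.newFixedThreadPool(4);

    // One tick: submit expired tasks without running them inline,
    // so a long-running callback never blocks the tick thread.
    void onTick(Iterable<Runnable> expired) {
        for (Runnable r : expired) {
            workers.submit(r);   // hand off; tick thread returns immediately
        }
    }

    void shutdown() { workers.shutdown(); }
}
```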
Where this scale matters
| Service | Scheduled tasks | Why a wheel |
|---|---|---|
| Kafka broker | ~1M (session timeouts per partition) | Heap can't keep up |
| Netty server | 10K-100K (HTTP keep-alive, ping/pong) | Heap works but wheel is faster |
| Linux kernel | Millions (network timers, alarm clocks) | Cascading wheels |
| Cron daemon | Few hundred | Heap or even linear scan is fine |
When NOT to build a custom one
Most apps need at most ~10K timers. ScheduledExecutorService (Java), time.AfterFunc (Go), and loop.call_later (asyncio) all handle that fine. Build a custom wheel only when profiling shows the standard scheduler is the bottleneck.
The senior signal
Knowing when to reach for HashedWheelTimer vs when standard tools suffice. Build complexity has to be justified. "I'd start with ScheduledThreadPoolExecutor; if profiling shows lock contention on the heap, swap to HashedWheelTimer." That's the right answer.
Implementations
Java's ScheduledThreadPoolExecutor is a heap-backed scheduler. Fine up to ~10K scheduled tasks. Beyond that, the heap rebalancing on every insert/cancel becomes the bottleneck.
```java
import java.util.concurrent.*;

class SimpleScheduler {
    private final ScheduledExecutorService scheduler =
        Executors.newScheduledThreadPool(4);

    public ScheduledFuture<?> schedule(Runnable task, long delayMs) {
        return scheduler.schedule(task, delayMs, TimeUnit.MILLISECONDS);
    }

    public ScheduledFuture<?> scheduleAtRate(Runnable task, long initialMs, long periodMs) {
        return scheduler.scheduleAtFixedRate(task, initialMs, periodMs, TimeUnit.MILLISECONDS);
    }

    public void shutdown() throws InterruptedException {
        scheduler.shutdown();
        if (!scheduler.awaitTermination(10, TimeUnit.SECONDS)) {
            scheduler.shutdownNow();
        }
    }
}
```

A hashed wheel is an array of N buckets; the current tick advances one bucket every tickMs. Adding a task maps its deadline to a bucket index. On each tick, all expired tasks in the current bucket are dispatched: O(1) insert, no heap operations.
```java
// Conceptual sketch; the production version is io.netty.util.HashedWheelTimer
import java.util.*;
import java.util.concurrent.*;

class HashedWheelTimer {
    // Mutable holder: rounds counts full wheel rotations remaining before firing.
    // (A record won't work here -- record fields are final, and rounds is decremented.)
    private static final class Task {
        final Runnable runnable;
        long rounds;
        Task(Runnable runnable, long rounds) { this.runnable = runnable; this.rounds = rounds; }
    }

    private final List<List<Task>> wheel;    // size = wheelSize
    private final long tickMs;
    private final int wheelSize;
    private volatile long currentTick = 0;   // written by tick thread, read by schedule()
    private final Thread tickThread;
    private final ExecutorService executor = Executors.newFixedThreadPool(4); // worker pool

    public HashedWheelTimer(long tickMs, int wheelSize) {
        this.tickMs = tickMs;
        this.wheelSize = wheelSize;
        this.wheel = new ArrayList<>(wheelSize);
        for (int i = 0; i < wheelSize; i++) wheel.add(new LinkedList<>());

        tickThread = new Thread(this::run);
        tickThread.setDaemon(true);
        tickThread.start();
    }

    public void schedule(Runnable r, long delayMs) {
        long ticks = delayMs / tickMs;
        long rounds = ticks / wheelSize;     // full rotations before the task is due
        int bucket = (int) ((currentTick + ticks) % wheelSize);
        List<Task> b = wheel.get(bucket);
        synchronized (b) {
            b.add(new Task(r, rounds));
        }
    }

    private void run() {
        while (!Thread.interrupted()) {
            try { Thread.sleep(tickMs); } catch (InterruptedException e) { return; }
            List<Task> bucket = wheel.get((int) (currentTick % wheelSize));
            synchronized (bucket) {          // guard against concurrent schedule()
                Iterator<Task> it = bucket.iterator();
                while (it.hasNext()) {
                    Task t = it.next();
                    if (t.rounds <= 0) {
                        it.remove();
                        executor.submit(t.runnable);  // dispatch to workers, never run inline
                    } else {
                        t.rounds--;
                    }
                }
            }
            currentTick++;
        }
    }
}
```

Key points
- Heap-based: O(log N) insert, O(log N) extract; fine for thousands of timers
- Hashed wheel: O(1) insert, O(N/buckets) work per tick; scales to millions
- Trades precision for throughput: the wheel has tick granularity (~10ms)
- A single tick thread plus a worker pool dispatches expired callbacks
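To make the bucket arithmetic concrete, here is the index/rounds math with hypothetical parameters (512 buckets, 10 ms tick; the class and constant names are illustrative, not from any library):

```java
// Bucket math for a wheel with 512 buckets and a 10 ms tick (illustrative values).
class WheelMath {
    static final long TICK_MS = 10;
    static final int WHEEL_SIZE = 512;

    // Which bucket a delay lands in, given the current tick.
    static int bucket(long currentTick, long delayMs) {
        long ticks = delayMs / TICK_MS;
        return (int) ((currentTick + ticks) % WHEEL_SIZE);
    }

    // How many full wheel rotations pass before the task fires.
    static long rounds(long delayMs) {
        return (delayMs / TICK_MS) / WHEEL_SIZE;
    }
}
```

A 100 ms delay is 10 ticks, so it lands 10 buckets ahead and fires on the first pass (rounds = 0); a 60 s delay is 6000 ticks, which wraps the 512-bucket wheel 11 full times before it is due.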