Cancellation Propagation Across Services

Why this matters

The core question: when a user gives up on a request, does the server keep working on it?

If yes, that's a problem. The user is gone, but the server is still consuming CPU, holding database connections, and calling downstream services. Multiply by request rate and the result is pool exhaustion and cascading failures from work nobody wants.

The answer is cancellation propagation. Every layer of the system has to know how to recognise "this request is dead" and unwind cleanly. Get this right and abandoned requests stop quickly. Get it wrong and a slow downstream silently inflates the in-flight count until something snaps.

Where cancellation comes from

Three sources, in order of frequency:

Client disconnect. The user closed their browser, killed the curl, or the network dropped. The TCP connection closes. The framework can detect this if asked. Go's r.Context() cancels automatically; Starlette has await request.is_disconnected(); Node's req.on('close') fires. Frameworks differ in how visible this signal is.

Deadline exceeded. A deadline is set (request budget, timeout, gRPC grpc-timeout header) and it fires. The handler should treat this exactly like a client disconnect: stop, unwind, return.

Parent abort. A multi-step operation has many sub-tasks; one fails or times out; the rest should be cancelled. Structured concurrency primitives (Java's StructuredTaskScope, Go's errgroup with errgroup.WithContext) do this automatically, provided the sub-tasks actually observe the shared context or interrupt. Without them, the propagation has to be wired by hand and half the cases get forgotten.
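The hand-wired version of parent abort can be sketched with asyncio alone. This is a minimal illustration, not a library API: run_all_or_cancel is a hypothetical helper name, and it assumes at least one coroutine is passed in.

```python
import asyncio


async def run_all_or_cancel(*coros):
    """Run coroutines concurrently; if one fails, cancel the siblings."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    try:
        # Wake up as soon as any task raises, or when all have finished.
        done, pending = await asyncio.wait(
            tasks, return_when=asyncio.FIRST_EXCEPTION
        )
        for task in pending:
            task.cancel()
        # Let cancelled siblings run their cleanup before we return.
        await asyncio.gather(*pending, return_exceptions=True)
        for task in done:
            if not task.cancelled() and task.exception() is not None:
                raise task.exception()
        return [task.result() for task in tasks]
    except asyncio.CancelledError:
        # The parent itself was cancelled: propagate to every child.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
        raise
```

The two cancel loops are exactly the cases that tend to get forgotten: a sibling failing, and the parent being cancelled while children are still running.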

How it propagates across services

Within a single process, propagation is mechanical: pass the cancellation handle (Go context, asyncio task, Java structured scope) through every function call. Every layer that respects it cancels naturally.
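As a rough illustration of in-process propagation, the sketch below uses asyncio, where the cancellation handle is the task itself. The function names (query_database, load_profile, handle_request) and the sleep standing in for I/O are invented for the example.

```python
import asyncio


async def query_database(log):
    try:
        await asyncio.sleep(5)  # stands in for a slow query
        return "rows"
    finally:
        # Runs on success and on cancellation alike.
        log.append("db connection released")


async def load_profile(log):
    try:
        return await query_database(log)
    finally:
        log.append("load_profile unwound")


async def handle_request(log, budget_seconds):
    try:
        # wait_for cancels the inner call chain when the budget runs out.
        return await asyncio.wait_for(load_profile(log), timeout=budget_seconds)
    except asyncio.TimeoutError:
        log.append("deadline exceeded")
        return None
```

With a short budget, the cancellation raised at the innermost await travels outward through each finally block, so every layer releases its resources before the handler reports the timeout.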

Across services, the wire protocol matters.

gRPC: ships grpc-timeout in request headers. The server reads it and uses it as the request context deadline. When the server makes further gRPC calls with that context, the client library calculates the remaining time and sends a smaller grpc-timeout to the next hop. Deadlines propagate end-to-end with almost no application code: the handler only has to pass the incoming context to its outgoing calls (some language bindings do even that implicitly).
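To make the arithmetic concrete, here is a hand-rolled sketch of how a remaining budget could be turned into a grpc-timeout value for the next hop. Real gRPC libraries do this internally; the helper names below are made up for illustration, and the unit letters and 8-digit limit follow the gRPC-over-HTTP/2 wire format.

```python
import time

# Unit suffixes used by grpc-timeout, mapped to seconds.
_UNIT_SECONDS = {"H": 3600.0, "M": 60.0, "S": 1.0, "m": 1e-3, "u": 1e-6, "n": 1e-9}
_MAX_VALUE = 99_999_999  # the numeric part is at most 8 digits


def encode_grpc_timeout(seconds):
    """Format a positive duration as a grpc-timeout value, e.g. 0.5 -> '500m'."""
    if seconds <= 0:
        raise ValueError("deadline already exceeded")
    millis = max(1, int(seconds * 1000))
    if millis <= _MAX_VALUE:
        return f"{millis}m"
    # Too large for milliseconds: fall back to whole seconds.
    return f"{min(int(seconds), _MAX_VALUE)}S"


def decode_grpc_timeout(value):
    """Parse a grpc-timeout value such as '500m' into seconds."""
    return int(value[:-1]) * _UNIT_SECONDS[value[-1]]


def next_hop_timeout(deadline, now=None):
    """Header value for an outgoing call, given an absolute monotonic deadline."""
    now = time.monotonic() if now is None else now
    return encode_grpc_timeout(deadline - now)
```

A server that received 500m and spent 250 ms before calling downstream would send 250m on the next hop.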

HTTP/REST: no standard deadline header. Common workarounds: a custom header like X-Deadline or X-Request-Timeout agreed across services; per-hop timeouts that approximate the budget; rely on connection lifetime (the client closes the socket on its timeout, the server detects it).
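A custom-header convention might look like the sketch below. Everything here is an assumption rather than a standard: the header is taken to carry the remaining budget in milliseconds, the helper names are invented, and headers is a plain dict (a real framework would give case-insensitive lookup).

```python
import time

HEADER = "X-Request-Timeout"  # hypothetical convention: remaining budget in ms


def deadline_from_headers(headers, default_seconds, now=None):
    """Turn the incoming header into an absolute monotonic deadline."""
    now = time.monotonic() if now is None else now
    raw = headers.get(HEADER)
    try:
        budget = int(raw) / 1000.0 if raw is not None else default_seconds
    except ValueError:
        budget = default_seconds  # malformed header: fall back to our own limit
    # Never grant a caller more time than this service's own default.
    return now + min(budget, default_seconds)


def outgoing_headers(deadline, now=None, safety_margin_ms=10):
    """Headers for a downstream call, carrying whatever budget is left."""
    now = time.monotonic() if now is None else now
    remaining_ms = int((deadline - now) * 1000) - safety_margin_ms
    if remaining_ms <= 0:
        raise TimeoutError("request budget exhausted before downstream call")
    return {HEADER: str(remaining_ms)}
```

Sending a relative budget rather than an absolute timestamp avoids depending on synchronised clocks between services; the safety margin leaves a little room for network transit.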

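Message queues: there is no transport-level deadline, so the deadline goes in the message body (or metadata) and the consumer enforces it in application code. Alternatively, rely on message expiry or visibility timeouts so that stale jobs are eventually given up.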
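For queues, an application-level deadline can be as simple as the following sketch. The field name deadline_unix and both helper functions are hypothetical; wall-clock time is used because the message crosses machines, which assumes producer and consumer clocks are reasonably in sync.

```python
import json
import time


def make_message(payload, ttl_seconds, now=None):
    """Producer side: stamp the message with an absolute deadline."""
    now = time.time() if now is None else now
    return json.dumps({"deadline_unix": now + ttl_seconds, "payload": payload})


def handle_message(raw, process, now=None):
    """Consumer side: skip work whose deadline has already passed."""
    now = time.time() if now is None else now
    message = json.loads(raw)
    if now >= message["deadline_unix"]:
        return "dropped"  # nobody is waiting for this result any more
    process(message["payload"])
    return "processed"
```

The check happens before any work starts, so a backlog of expired jobs drains cheaply instead of consuming worker time.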

What goes wrong

Three production failure modes from missing cancellation:

A handler that doesn't pass ctx to downstream HTTP calls. Each request takes 30 seconds even after the client disconnects. Under load: thread pool fills with zombies. Healthy requests can't get a thread. Service goes down.

A long CPU loop with no cancellation checks. Cancellation is cooperative: the runtime won't interrupt running CPU code, and the signal is only observed at I/O calls or explicit checks. So even a perfectly propagated deadline gets ignored once a for loop runs without an if ctx.Err() != nil { return } check.
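The same idea in Python, as a minimal sketch: a CPU-bound loop that checks a deadline every so often. The checksum function, the DeadlineExceeded exception and the check_every parameter are all invented for the example; the clock argument exists only to make the behaviour easy to test.

```python
import time


class DeadlineExceeded(Exception):
    """Raised when CPU-bound work notices its deadline has passed."""


def checksum(data, deadline, check_every=1024, clock=time.monotonic):
    """Sum bytes modulo 65521, checking the deadline every `check_every` bytes."""
    total = 0
    for i, byte in enumerate(data):
        # Checking on every iteration would be wasteful; checking never
        # would make the loop uncancellable. Sample at a fixed stride.
        if i % check_every == 0 and clock() >= deadline:
            raise DeadlineExceeded(f"stopped after {i} bytes")
        total = (total + byte) % 65521
    return total
```

The stride is a trade-off: a smaller check_every reacts faster to cancellation but spends more time reading the clock.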

Deadlines set at the wrong layer. Outer handler has a 5-second budget. Inner downstream call has its own 30-second timeout. The downstream call "wins": the outer deadline fires, but because it is never propagated to the inner call, that call keeps running for up to 30 seconds. Always set deadlines from the outside in, and make sure inner timeouts are smaller than outer.
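One way to enforce "inner smaller than outer" is to derive every inner timeout from the outer deadline rather than hard-coding it. A minimal sketch, with inner_timeout as a hypothetical helper and the outer deadline expressed as an absolute monotonic time:

```python
import time


def inner_timeout(outer_deadline, inner_default, now=None):
    """Timeout for an inner call: its own default, capped by the outer budget."""
    now = time.monotonic() if now is None else now
    remaining = outer_deadline - now
    if remaining <= 0:
        raise TimeoutError("outer deadline already passed")
    return min(inner_default, remaining)
```

With a 5-second outer budget and a 30-second inner default, the inner call gets 5 seconds at most, and less if part of the budget is already spent.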

Key points

Pass the cancellation handle through every layer (ctx in Go, structured scope or interrupt-aware blocking in Java, an explicit ctx parameter or asyncio.wait_for in Python). Detect client disconnect at the edge so the server stops working on requests the user has already abandoned. And use a wire protocol that propagates deadlines: gRPC does it natively; for HTTP, either agree on a custom header or accept per-hop timeouts.

The teams that get this right have services that stay responsive under load and shed gracefully when downstreams are slow. The teams that don't have outages caused by zombies piling up.

Follow-up questions

How is a client disconnect detected?

Depends on the framework. Go's net/http: the r.Context().Done() channel closes. Node.js: req.on('close') fires. Starlette/FastAPI: await request.is_disconnected(). Java Servlet: there is no dedicated disconnect callback; containers typically surface it as AsyncListener.onError or an IOException on the next write (AsyncListener.onTimeout only covers the async timeout). The detection isn't magic; the underlying TCP socket close is what triggers it. If the runtime doesn't expose this, the disconnect isn't detectable, and the only fallback is to enforce a deadline.
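For frameworks that only offer a polling check, the handler's work can be raced against that check. The sketch below is framework-agnostic: run_unless_disconnected is a made-up helper, and is_disconnected is assumed to be any async callable returning a bool (Starlette's request.is_disconnected has that shape).

```python
import asyncio


async def run_unless_disconnected(work, is_disconnected, poll_interval=0.1):
    """Await `work`, but cancel it if the client goes away first."""
    task = asyncio.ensure_future(work)
    try:
        while True:
            # Wait for the work, but wake up periodically to poll the client.
            done, _ = await asyncio.wait({task}, timeout=poll_interval)
            if done:
                return task.result()
            if await is_disconnected():
                task.cancel()
                # Give the work a chance to run its cleanup handlers.
                await asyncio.gather(task, return_exceptions=True)
                return None
    finally:
        # If this coroutine is itself cancelled, don't leak the task.
        if not task.done():
            task.cancel()
```

The poll interval bounds how long abandoned work can keep running after the socket closes.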

How does gRPC propagate deadlines?

On the wire, gRPC sends a grpc-timeout header (e.g., 'grpc-timeout: 500m' for 500ms). Server-side libraries set the request context deadline from this. When the server makes further gRPC calls and passes the ctx, the client library reads the remaining deadline from ctx and sends a smaller grpc-timeout. Each hop subtracts the elapsed time. This is deadline propagation across services, and it's one of the main practical reasons to prefer gRPC over plain REST for inter-service calls.

What about plain HTTP? Is there a deadline header?

Not officially. Some teams use X-Request-Timeout or X-Deadline as a custom convention. Others rely on connection-level timeouts and the client disconnect signal. When both sides are under the team's control, gRPC or a custom header works. Otherwise, the only options are per-hop timeouts and closing the TCP connection so the server sees a disconnect.

What's the difference between cancellation and timeout?

Timeout is a deadline; cancellation is the act of aborting work in progress. A timeout PRODUCES a cancellation when it fires. Cancellation can also come from other sources: parent task cancelled, user pressed cancel, system shutdown. The handler logic is the same: check ctx.Done() / is_cancelled(), unwind cleanly, release resources.
