Backpressure: Strategies and Signals
When the producer is faster than the consumer, something has to give. Backpressure is how the system tells the producer to slow down (or what happens when it can't). Four strategies: block the producer, drop new work, shed load with errors, or propagate the slowdown back through the call chain. The wrong choice turns a slow consumer into a memory blowup or a thundering retry storm.
What it is
Backpressure is what happens when one part of a system can't keep up with another. Producer faster than consumer. Downstream slower than upstream. Without an explicit strategy, the slow side has only bad options: things pile up in memory until the process dies, or work gets silently dropped, or threads block forever.
Picking a backpressure strategy turns "slow consumer crashes the system" into "slow consumer slows the producer."
The basic shape:
Producer -----> Queue -----> Consumer
(fast) (filling) (slow)
As the queue fills, the system has to decide: what happens when it's full?
The four strategies, with pictures
1. BLOCK Queue full -> producer waits.
Producer --> [████████████] --> Consumer
^ full
Producer is blocked, Consumer drains, then producer can push again.
Producer's rate caps at consumer's rate automatically.
2. DROP Queue full -> throw away new (or oldest).
Producer --> [████████████] --> Consumer
^ full, drop new item or evict oldest
New events: silently lost OR oldest events lost.
Counter goes up: "dropped 1,234 events in last minute."
3. SHED Queue full -> reject incoming with an error.
Producer --> (503 Service Unavailable, Retry-After: 5s)
Caller decides: retry, fall back, fail-fast, degrade.
4. PROPAGATE Pass a deadline through every layer.
Producer --> [....] --> Consumer (slow)
    |
    |  "must finish by t+5s" passes through every call
    |
Each layer checks the deadline. If exceeded, fail fast and
propagate the timeout upward.
When each one fits
BLOCK Use when:
- Producer is a worker (not a request handler)
- Capping producer rate at consumer rate is acceptable
- Response latency is not a concern
Examples: batch ETL, log shipping, background workers
DROP Use when:
- The data is fungible (one sample as good as another)
- Stale is worse than missing
Examples: metrics, telemetry, sensor readings, log sampling
ALWAYS instrument the drop count.
SHED Use when:
- The code is at the edge of a service
- Queueing the request buys nothing (it would time out anyway)
- Caller can retry / fall back
Examples: HTTP request handlers under load, API gateways
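The shed strategy can be sketched with a semaphore that never waits: if no permit is available, the handler rejects immediately instead of queueing. The limit of 100 and the handler shape are illustrative assumptions, not a fixed recipe.

```java
import java.util.concurrent.Semaphore;

// Assumed capacity: 100 in-flight requests; tune per service.
Semaphore permits = new Semaphore(100);

String handle(String request) {
    if (!permits.tryAcquire()) {  // no waiting: full means reject now
        return "503 Service Unavailable, Retry-After: 5";
    }
    try {
        return "200 OK";          // stand-in for the real work
    } finally {
        permits.release();        // free the slot even if work throws
    }
}
```

The key property is tryAcquire() rather than acquire(): the handler thread is never parked waiting for capacity that isn't coming.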
PROPAGATE Use when:
- The code crosses service boundaries
- The caller cares about end-to-end latency
- Timeouts should free up resources cheaply
Examples: RPC chains, microservice calls
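One way to sketch deadline propagation (names are illustrative): carry an absolute deadline through every call, and have each hop derive its timeout from the remaining budget instead of starting a fresh one.

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Absolute deadline: a point in time, not a per-hop duration.
long deadlineNanos(Duration total) {
    return System.nanoTime() + total.toNanos();
}

long remainingMillis(long deadline) {
    return TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
}

String callDownstream(long deadline) throws TimeoutException {
    long budget = remainingMillis(deadline);
    if (budget <= 0) {  // deadline already blown: fail fast, free resources
        throw new TimeoutException("deadline exceeded");
    }
    // Pass `budget` (not a fresh timeout) to the next hop, e.g. as an
    // HTTP header or RPC metadata, so the limit shrinks at every layer.
    return "ok with " + budget + " ms left";
}
```

Because the deadline is absolute, it survives any number of hops; each layer only ever sees less time than its caller had.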
Picking the right one
Match the strategy to the workload:
| Scenario | Strategy |
|---|---|
| Batch / ETL pipeline | Block (bounded queue) |
| Telemetry / metrics / sampled data | Drop with metrics |
| User-facing request handler | Shed with 503 + Retry-After |
| RPC chain across services | Propagate via deadline |
| Mixed workload | Combination: bounded queue inside, deadline at the edge |
The most common mistake is using "block" everywhere. Blocking in a request handler ties up threads waiting for capacity that isn't coming. The right pattern at the edge is shed; the right pattern internally is bounded queue or propagate.
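The mixed-workload row can be sketched by combining the two patterns: a bounded queue inside, with the caller's remaining deadline capping how long the producer waits for space. This is a hypothetical sketch, not a complete pipeline.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

BlockingQueue<String> jobs = new ArrayBlockingQueue<>(1000);

// Block for queue space only as long as the caller's budget allows;
// returns false (shed) instead of waiting past the deadline.
boolean submit(String job, long remainingMillis) throws InterruptedException {
    return jobs.offer(job, remainingMillis, TimeUnit.MILLISECONDS);
}
```

The timed offer() is the bridge: inside the deadline it behaves like block, past the deadline it behaves like shed.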
What goes wrong without it
Three production failure modes directly attributable to missing backpressure:
A queue with no bound. Producer faster than consumer for any sustained period. Memory grows. OOM. Postmortem says "the queue should have been bounded."
A bounded queue that drops silently. No metric, no alert. Symptom: customers report missing data weeks later. The queue has been dropping 5% of events for months.
No deadline propagation. Downstream gets slow. Upstream blocks waiting. Threads pile up. Eventually upstream is unreachable too. The "slow downstream" turned into a "service down" because there was no shedding.
Each one is preventable with a few lines of code, provided backpressure was considered when the data flow was designed.
Backpressure is the question every concurrent system must answer: when supply exceeds capacity, what happens? Pick one of the four strategies (block, drop, shed, propagate) per data flow, document it, and instrument it. The system designs that survive production traffic are the ones where someone made this choice deliberately.
Implementations
The most common backpressure mechanism: a bounded BlockingQueue. When full, put() blocks until a consumer takes. The producer naturally slows to match the consumer's rate. Cost: producer threads sit idle while blocked. Acceptable when the producer has nothing better to do.
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

BlockingQueue<Job> q = new ArrayBlockingQueue<>(1000);

// Producer
void produce(Job j) throws InterruptedException {
    q.put(j); // BLOCKS if queue is full
}

// Consumer
Job take() throws InterruptedException {
    return q.take(); // blocks if empty
}

// The bounded size IS the backpressure mechanism. No extra code needed.

For telemetry, log shipping, sampled events: stale data is worse than missing data. offer() returns false instead of blocking; the drop is counted via a metric. Always count drops, or data loss becomes invisible.
import java.util.concurrent.atomic.AtomicLong;

BlockingQueue<Event> q = new ArrayBlockingQueue<>(10_000);
AtomicLong dropped = new AtomicLong();

void emit(Event e) {
    if (!q.offer(e)) {             // returns false if full
        dropped.incrementAndGet(); // never drop silently
    }
}

// For "drop oldest" instead of "drop newest", pre-evict on full:
void emitDropOldest(Event e) {
    while (!q.offer(e)) {
        q.poll();                  // remove oldest, retry
        dropped.incrementAndGet();
    }
}

Key points
- Without backpressure, a slow consumer forces unbounded memory growth or work loss
- Block the producer (bounded queue blocks on full): natural backpressure, default for most pipelines
- Drop incoming (drop newest or drop oldest): for telemetry where stale data is worse than missing data
- Shed load (return 503 / fail fast): for request paths where the client can retry or degrade
- Propagate (the slow downstream signals the upstream caller): for end-to-end flow control across services
Follow-up questions
- What happens without backpressure?
- Block, drop, or shed: how to choose?
- Why is propagation special?
- How does backpressure relate to the bulkhead pattern?
Gotchas
- Unbounded queue = no backpressure = memory blowup under sustained overload
- Drop without metrics = silent data loss (count and alert on drops)
- Shed without Retry-After = client retries immediately, makes things worse
- Block in a request handler = thread starvation; combine with timeouts
- Propagate without timeouts = nothing actually propagates; the deadline needs a cancellation signal the callee actually checks (in Go, the ctx.Done() channel)