Code Review: Auditing an asyncio Service for Blocking Calls
An async FastAPI handler calling a sync DB driver, a sync HTTP client, and time.sleep stalls the entire event loop. Tail latency goes from a 50ms median to a 1500ms p99 across all concurrent requests. Walk through every blocking call and either replace it with an async-aware library or wrap it with asyncio.to_thread.
The setup
A FastAPI service comes in for review. The team's complaint: "P99 latency is 30× the median. Median is 50ms, p99 is 1500ms. We can't figure out why."
The handler uses async def everywhere and looks modern. But the p99 spike is the unmistakable signature of event loop stalls: somewhere in the handler, a synchronous call is blocking the only thread.
The mental model that solves this in 30 seconds
asyncio is single-threaded. The event loop runs one coroutine at a time. Every await is a yield point, control returns to the loop, other coroutines can run. A synchronous call (no await) blocks the entire loop until it finishes.
When a handler has 100 concurrent requests and one of them does time.sleep(0.5), all 100 see a 500ms stall. That's the p99 spike.
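The stall is easy to reproduce with a minimal sketch (illustrative names, stdlib only): five "concurrent" handlers that call time.sleep serialize to the sum of their sleeps, while five that await asyncio.sleep overlap.

```python
import asyncio
import time

async def blocking_handler():
    time.sleep(0.2)  # blocks the ONLY event-loop thread; nothing else runs

async def yielding_handler():
    await asyncio.sleep(0.2)  # yields to the loop; other coroutines run meanwhile

async def measure(handler):
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(5)))
    return time.perf_counter() - start

blocked = asyncio.run(measure(blocking_handler))      # ~1.0s: 5 sleeps serialized
concurrent = asyncio.run(measure(yielding_handler))   # ~0.2s: 5 sleeps overlap
print(f"blocking: {blocked:.2f}s, yielding: {concurrent:.2f}s")
```

Same declared concurrency, 5× difference in wall time: that gap is exactly what shows up as the p99 spike.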
How to read the broken code
Walk through the handler line by line. For each call, ask: is this await-able? Does it touch the network, the disk, or sleep? Every "yes" without an await is a stall.
The five issues in this snippet:
- psycopg2 query: sync DB driver, blocks during the I/O.
- requests.get: sync HTTP, blocks during the round-trip.
- time.sleep: blocks the OS thread.
- threading.Lock: blocks the OS thread while waiting (vs asyncio.Lock, which yields).
- create_task without a reference: the task can be garbage-collected; errors vanish.
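A condensed reconstruction of this kind of handler, with the sync DB and HTTP calls replaced by stdlib stand-ins (sync_db_query and sync_http_get are illustrative, not the real drivers), so all five issues are visible in one place:

```python
import asyncio
import threading
import time

lock = threading.Lock()

def sync_db_query():       # stand-in for a psycopg2 query
    time.sleep(0.05)
    return {"rows": 3}

def sync_http_get():       # stand-in for requests.get
    time.sleep(0.05)
    return 200

async def handler():
    rows = sync_db_query()          # issue 1: sync DB driver, blocks the loop
    status = sync_http_get()        # issue 2: sync HTTP round-trip, blocks the loop
    time.sleep(0.05)                # issue 3: blocks the OS thread outright
    with lock:                      # issue 4: threading.Lock waits without yielding
        pass
    asyncio.create_task(asyncio.sleep(0))  # issue 5: no reference kept to the task
    return rows, status

result = asyncio.run(handler())
print(result)
```

Not one line of this handler ever yields, so every concurrent request queues behind it.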
The diagnostic in production
Enable loop.set_debug(True) (or set the env var PYTHONASYNCIODEBUG=1). asyncio will log every callback that runs longer than 100ms, so the blocking calls jump out. Pair with per-request tracing (OpenTelemetry) to find the slow spans.
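A self-contained sketch of the debug-mode diagnostic; the log-capturing handler is only there to make the warning visible programmatically:

```python
import asyncio
import logging
import time

records = []

class Capture(logging.Handler):
    def emit(self, record):
        records.append(record.getMessage())

# asyncio's slow-callback warnings go to the "asyncio" logger at WARNING level
logging.getLogger("asyncio").addHandler(Capture())
logging.getLogger("asyncio").setLevel(logging.WARNING)

async def main():
    # threshold for "this callback ran too long"; 0.1s is the default
    asyncio.get_running_loop().slow_callback_duration = 0.1
    time.sleep(0.2)  # the blocking call we want flagged

asyncio.run(main(), debug=True)
print(records)  # a warning like "Executing <Task ...> took 0.200 seconds"
```

In production you'd leave the logger wired to your normal log pipeline; any "Executing ... took N seconds" line points straight at a blocking call.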
The fix pattern
For each blocking call, two options:
| Option | When to use |
|---|---|
| Replace with async-aware library | When an async equivalent exists: aiohttp for requests, asyncpg for psycopg2, aiomysql for pymysql, asyncio.sleep for time.sleep |
| Wrap with asyncio.to_thread | When no async version exists. Runs on a thread pool; the event loop stays responsive. |
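Both options in one place, using a stdlib stand-in for the legacy sync call (legacy_db_query is illustrative):

```python
import asyncio
import time

def legacy_db_query():
    # A sync library call with no async equivalent (illustrative stand-in)
    time.sleep(0.05)
    return {"rows": 3}

async def handler():
    # Option 1: replace with the async-native call where one exists
    await asyncio.sleep(0.05)                  # instead of time.sleep(0.05)
    # Option 2: push the unavoidable sync call onto the default thread pool
    rows = await asyncio.to_thread(legacy_db_query)
    return rows

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(*(handler() for _ in range(10)))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
print(f"{len(results)} requests in {elapsed:.2f}s")  # overlap instead of 1.0s serialized
```

Run fully synchronously this would take 10 × 0.1s = 1s; with the awaits in place, the sleeps and thread-pool calls overlap.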
The async-or-thread decision
- Many concurrent I/O ops, all to systems with async libraries → go full async.
- One critical legacy lib that's blocking → wrap with to_thread and keep the rest async.
- CPU-heavy work → ProcessPoolExecutor via loop.run_in_executor.
- Library churn for "modernization" → don't. Migration cost is high; benefits only show at high concurrency.
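The CPU-heavy branch of the decision, sketched with an illustrative workload (cpu_heavy is a stand-in for real computation):

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n: int) -> int:
    # Pure-Python CPU work: a thread pool won't help (GIL), a process pool will
    return sum(i * i for i in range(n))

async def main() -> list[int]:
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # run_in_executor keeps the event loop free while workers compute
        return await asyncio.gather(
            *(loop.run_in_executor(pool, cpu_heavy, 100_000) for _ in range(4))
        )

if __name__ == "__main__":  # required by the spawn/forkserver start methods
    results = asyncio.run(main())
    print(len(results), "chunks done")
```

to_thread would keep the loop responsive here too, but the GIL means the threads wouldn't actually run the Python bytecode in parallel; processes do.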
How to communicate this in code review
What separates senior from junior reviewers:
Junior: "These are blocking calls; they need to be async."
Senior: "Here's the latency impact: each blocking call adds its full duration to every concurrent request's tail latency. That's why p99 is 30× median. The fix is library-by-library, let me walk through each one and the migration cost. The DB switch is the hardest; the rest are small. Here's an order-of-operations that minimizes risk..."
The senior version connects the bug to the metric, sequences the fix, and weighs effort. That's what gets candidates hired in code review interviews.
Beyond this review, the meta-lesson
async/await is not pixie dust
Adding async keywords doesn't make code faster. What makes async code fast is cooperative yielding at every I/O point. If even one piece doesn't yield, the model's value is thrown away. Audit every external call in every async function. The first sync call discovered is the latency bug.
Key points
- async def + a synchronous call inside = silent disaster: the event loop stalls
- All blocking I/O must be either async-native or wrapped in asyncio.to_thread
- asyncio.create_task without a reference can be garbage-collected mid-run
- asyncio.Lock and threading.Lock are NOT interchangeable
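The create_task point deserves code: the event loop holds only a weak reference to tasks, so a fire-and-forget create_task can be garbage-collected mid-run. A minimal sketch of the standard fix (the spawn helper name is ours):

```python
import asyncio

background_tasks: set[asyncio.Task] = set()

def spawn(coro) -> asyncio.Task:
    """Create a task and hold a strong reference until it finishes."""
    task = asyncio.create_task(coro)
    background_tasks.add(task)                        # keep the task alive
    task.add_done_callback(background_tasks.discard)  # release it when done
    return task

async def main():
    task = spawn(asyncio.sleep(0.01))
    await task
    await asyncio.sleep(0)  # let the done-callback run
    return len(background_tasks)

remaining = asyncio.run(main())
print(remaining)
```

On Python 3.11+, asyncio.TaskGroup gives the same lifetime guarantee with structured error handling.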
Follow-up questions
- Why does ONE sync call inside async stall ALL concurrent requests?
- How to detect blocking calls in production async code?
- Is asyncio.to_thread fast?
- Should threads be used instead of asyncio?
- What's the right way to run a synchronous CPU task from asyncio?
Gotchas
- requests.get inside async def → silent stall, no warning
- time.sleep inside async def → freezes everything
- threading.Lock in asyncio code → blocks the event loop instead of yielding
- asyncio.create_task without keeping a reference → task can be GC'd mid-run
- Mixing sync DB drivers with FastAPI → tail latency surge under load
- asyncio.gather with N=1M tasks → memory blow-up; bound with a Semaphore
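The last gotcha's fix, sketched at a small scale: a shared Semaphore caps how many coroutines are doing work at once.

```python
import asyncio

async def fetch(i: int, sem: asyncio.Semaphore) -> int:
    async with sem:              # at most `limit` coroutines inside at a time
        await asyncio.sleep(0)   # stand-in for a real I/O call
        return i

async def main(n: int, limit: int) -> list[int]:
    sem = asyncio.Semaphore(limit)
    return await asyncio.gather(*(fetch(i, sem) for i in range(n)))

results = asyncio.run(main(1000, 50))
print(len(results))
```

Note the semaphore bounds in-flight work, not task objects: gather still creates all N tasks up front. For truly huge N, feed a fixed pool of workers from an asyncio.Queue instead.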
FastAPI services migrating from Flask hit this regularly. Discord, Lyft, and Cloudflare have published postmortems about asyncio + sync library issues. The fix is always 'replace the sync library' or 'wrap with to_thread.'