Code Review: Auditing an asyncio Service for Blocking Calls
An async FastAPI handler calling a sync DB driver, a sync HTTP client, and time.sleep stalls the entire event loop. Tail latency goes from a 50ms median to a 1500ms p99 across all concurrent requests. Walk through every blocking call and either replace it with an async-aware library or wrap it with asyncio.to_thread.
The setup
A FastAPI service comes in for review. The team's complaint: "P99 latency is 30× the median. Median is 50ms, p99 is 1500ms. We can't figure out why."
The handler uses async def everywhere and looks modern. But the p99 spike is the unmistakable signature of event loop stalls: somewhere in the handler, a synchronous call is blocking the only thread.
The mental model that solves this in 30 seconds
asyncio is single-threaded. The event loop runs one coroutine at a time. Every await is a yield point, control returns to the loop, other coroutines can run. A synchronous call (no await) blocks the entire loop until it finishes.
When a handler has 100 concurrent requests and one of them does time.sleep(0.5), all 100 see a 500ms stall. That's the p99 spike.
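The stall is easy to reproduce with a minimal sketch (illustrative names, stdlib only): five "concurrent" handlers that call time.sleep serialize to the sum of their sleeps, while five that await asyncio.sleep overlap.

```python
import asyncio
import time

async def blocking_handler():
    time.sleep(0.2)  # blocks the ONLY event-loop thread; nothing else runs

async def yielding_handler():
    await asyncio.sleep(0.2)  # yields to the loop; other coroutines run meanwhile

async def measure(handler):
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(5)))
    return time.perf_counter() - start

blocked = asyncio.run(measure(blocking_handler))      # ~1.0s: 5 sleeps serialized
concurrent = asyncio.run(measure(yielding_handler))   # ~0.2s: 5 sleeps overlap
print(f"blocking: {blocked:.2f}s, yielding: {concurrent:.2f}s")
```

Same declared concurrency, 5× difference in wall time: that gap is exactly what shows up as the p99 spike.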
How to read the broken code
Walk through the handler line by line. For each call, ask: is this await-able? Does it touch the network, the disk, or sleep? Every "yes" without an await is a stall.
The five issues in this snippet:
- psycopg2 query: sync DB driver, blocks during the I/O.
- requests.get: sync HTTP, blocks during the round-trip.
- time.sleep: blocks the OS thread.
- threading.Lock: blocks the OS thread while waiting (vs asyncio.Lock, which yields).
- create_task without a reference: the task can be garbage-collected; errors vanish.
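A condensed reconstruction of this kind of handler, with the sync DB and HTTP calls replaced by stdlib stand-ins (sync_db_query and sync_http_get are illustrative, not the real drivers), so all five issues are visible in one place:

```python
import asyncio
import threading
import time

lock = threading.Lock()

def sync_db_query():       # stand-in for a psycopg2 query
    time.sleep(0.05)
    return {"rows": 3}

def sync_http_get():       # stand-in for requests.get
    time.sleep(0.05)
    return 200

async def handler():
    rows = sync_db_query()          # issue 1: sync DB driver, blocks the loop
    status = sync_http_get()        # issue 2: sync HTTP round-trip, blocks the loop
    time.sleep(0.05)                # issue 3: blocks the OS thread outright
    with lock:                      # issue 4: threading.Lock waits without yielding
        pass
    asyncio.create_task(asyncio.sleep(0))  # issue 5: no reference kept to the task
    return rows, status

result = asyncio.run(handler())
print(result)
```

Not one line of this handler ever yields, so every concurrent request queues behind it.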
The diagnostic in production
Enable loop.set_debug(True) (or set the env var PYTHONASYNCIODEBUG=1). asyncio will log every callback that runs longer than 100ms, so the blocking calls jump out. Pair with per-request tracing (OpenTelemetry) to find the slow spans.
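A self-contained sketch of the debug-mode diagnostic; the log-capturing handler is only there to make the warning visible programmatically:

```python
import asyncio
import logging
import time

records = []

class Capture(logging.Handler):
    def emit(self, record):
        records.append(record.getMessage())

# asyncio's slow-callback warnings go to the "asyncio" logger at WARNING level
logging.getLogger("asyncio").addHandler(Capture())
logging.getLogger("asyncio").setLevel(logging.WARNING)

async def main():
    # threshold for "this callback ran too long"; 0.1s is the default
    asyncio.get_running_loop().slow_callback_duration = 0.1
    time.sleep(0.2)  # the blocking call we want flagged

asyncio.run(main(), debug=True)
print(records)  # a warning like "Executing <Task ...> took 0.200 seconds"
```

In production you'd leave the logger wired to your normal log pipeline; any "Executing ... took N seconds" line points straight at a blocking call.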
The fix pattern
For each blocking call, two options:
| Option | When to use |
|---|---|
| Replace with async-aware library | When an async equivalent exists: aiohttp for requests, asyncpg for psycopg2, aiomysql for pymysql, asyncio.sleep for time.sleep |
| Wrap with asyncio.to_thread | When no async version exists. Runs on a thread pool; the event loop stays responsive. |
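Both options in one place, using a stdlib stand-in for the legacy sync call (legacy_db_query is illustrative):

```python
import asyncio
import time

def legacy_db_query():
    # A sync library call with no async equivalent (illustrative stand-in)
    time.sleep(0.05)
    return {"rows": 3}

async def handler():
    # Option 1: replace with the async-native call where one exists
    await asyncio.sleep(0.05)                  # instead of time.sleep(0.05)
    # Option 2: push the unavoidable sync call onto the default thread pool
    rows = await asyncio.to_thread(legacy_db_query)
    return rows

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(*(handler() for _ in range(10)))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
print(f"{len(results)} requests in {elapsed:.2f}s")  # overlap instead of 1.0s serialized
```

Run fully synchronously this would take 10 × 0.1s = 1s; with the awaits in place, the sleeps and thread-pool calls overlap.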
The async-or-thread decision
- Many concurrent I/O ops, all to systems with async libraries → go full async.
- One critical legacy lib that's blocking → wrap with to_thread and keep the rest async.
- CPU-heavy work → ProcessPoolExecutor via loop.run_in_executor.
- Library churn for "modernization" → don't. Migration cost is high; benefits only show at high concurrency.
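The CPU-heavy branch of the decision, sketched with an illustrative workload (cpu_heavy is a stand-in for real computation):

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n: int) -> int:
    # Pure-Python CPU work: a thread pool won't help (GIL), a process pool will
    return sum(i * i for i in range(n))

async def main() -> list[int]:
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # run_in_executor keeps the event loop free while workers compute
        return await asyncio.gather(
            *(loop.run_in_executor(pool, cpu_heavy, 100_000) for _ in range(4))
        )

if __name__ == "__main__":  # required by the spawn/forkserver start methods
    results = asyncio.run(main())
    print(len(results), "chunks done")
```

to_thread would keep the loop responsive here too, but the GIL means the threads wouldn't actually run the Python bytecode in parallel; processes do.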
How to communicate this in code review
What separates senior from junior reviewers:
Junior: "These are blocking calls; they need to be async."
Senior: "Here's the latency impact: each blocking call adds its full duration to every concurrent request's tail latency. That's why p99 is 30× median. The fix is library-by-library, let me walk through each one and the migration cost. The DB switch is the hardest; the rest are small. Here's an order-of-operations that minimizes risk..."
The senior version connects the bug to the metric, sequences the fix, and weighs effort. That's what gets candidates hired in code review interviews.
Beyond this review, the meta-lesson
async/await is not pixie dust
Adding async keywords doesn't make code faster. What makes async code fast is cooperative yielding at every I/O point. If even one piece doesn't yield, the model's value is thrown away. Audit every external call in every async function. The first sync call discovered is the latency bug.
Key points
- async def + a synchronous call inside = silent disaster: the event loop stalls
- All blocking I/O must be either async-native or wrapped in asyncio.to_thread
- asyncio.create_task without a reference can be garbage-collected mid-run
- asyncio.Lock and threading.Lock are NOT interchangeable
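The create_task point deserves code: the event loop holds only a weak reference to tasks, so a fire-and-forget create_task can be garbage-collected mid-run. A minimal sketch of the standard fix (the spawn helper name is ours):

```python
import asyncio

background_tasks: set[asyncio.Task] = set()

def spawn(coro) -> asyncio.Task:
    """Create a task and hold a strong reference until it finishes."""
    task = asyncio.create_task(coro)
    background_tasks.add(task)                        # keep the task alive
    task.add_done_callback(background_tasks.discard)  # release it when done
    return task

async def main():
    task = spawn(asyncio.sleep(0.01))
    await task
    await asyncio.sleep(0)  # let the done-callback run
    return len(background_tasks)

remaining = asyncio.run(main())
print(remaining)
```

On Python 3.11+, asyncio.TaskGroup gives the same lifetime guarantee with structured error handling.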
Follow-up questions
- Why does ONE sync call inside async stall ALL concurrent requests?
- How to detect blocking calls in production async code?
- Is asyncio.to_thread fast?
- Should threads be used instead of asyncio?
- What's the right way to run a synchronous CPU task from asyncio?
Gotchas
- requests.get inside async def → silent stall, no warning
- time.sleep inside async def → freezes everything
- threading.Lock in asyncio code → blocks the event loop instead of yielding
- asyncio.create_task without keeping a reference → task can be GC'd mid-run
- Mixing sync DB drivers with FastAPI → tail latency surge under load
- asyncio.gather with N=1M tasks → memory blow-up; bound with a Semaphore
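The last gotcha's fix, sketched at a small scale: a shared Semaphore caps how many coroutines are doing work at once.

```python
import asyncio

async def fetch(i: int, sem: asyncio.Semaphore) -> int:
    async with sem:              # at most `limit` coroutines inside at a time
        await asyncio.sleep(0)   # stand-in for a real I/O call
        return i

async def main(n: int, limit: int) -> list[int]:
    sem = asyncio.Semaphore(limit)
    return await asyncio.gather(*(fetch(i, sem) for i in range(n)))

results = asyncio.run(main(1000, 50))
print(len(results))
```

Note the semaphore bounds in-flight work, not task objects: gather still creates all N tasks up front. For truly huge N, feed a fixed pool of workers from an asyncio.Queue instead.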
FastAPI services migrating from Flask hit this regularly. Discord, Lyft, and Cloudflare have published postmortems about asyncio + sync library issues. The fix is always 'replace the sync library' or 'wrap with to_thread.'