asyncio Fundamentals
asyncio runs many tasks on a single thread by interleaving them at await points. It scales to 100K+ concurrent I/O operations because each task is just a few KB instead of a thread's megabytes, but a single CPU-heavy synchronous call freezes the entire event loop.
[Diagram: the event loop interleaving Task A and Task B at await points]
What it is
asyncio is Python's built-in framework for cooperative single-threaded concurrency. It lets you write code that reads sequentially but interleaves at every await point, running thousands of tasks "concurrently" on a single thread.
The core idea: a coroutine is a function that can pause itself with await. The event loop runs one coroutine at a time; when one pauses, another runs. There's no preemption, no thread switching, no locks needed for shared state, just explicit yield points.
Why it matters
For I/O-heavy workloads (web servers, scrapers, gateways, message brokers), asyncio scales further than threads with much less memory. Discord's Python service handles millions of concurrent voice connections on asyncio. FastAPI services routinely handle 10K+ requests/sec on a single process because each request is a tiny coroutine, not a megabyte-stack thread.
The mental model
Threads = preemptive parallel waiting. asyncio = cooperative interleaved waiting. Both achieve concurrency for I/O. asyncio is cheaper per task; threads are easier to retrofit into existing sync code.
How it works
The diagram above traces it: the loop runs Task A until it hits an await, switches to Task B, comes back to A when its I/O completes.
async def fetch(session, url):
    async with session.get(url) as r:  # await yields here
        return await r.text()          # await yields here
The function pauses at each await. The event loop notes "this task is waiting for the network", switches to another task, and comes back when the network response arrives. The code reads top-to-bottom; to the runtime, it's a state machine that gets resumed across many event-loop iterations.
await is a cooperative yield
The function explicitly tells the runtime "I'm waiting on something, go run other work". Compare to threads, where the OS preempts at unpredictable points. Cooperative scheduling means race conditions are rare (they can only occur at await boundaries), but it also means the code is responsible for yielding: a long synchronous loop never yields, and the event loop stalls, as the sketch below shows.
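A minimal sketch of the stall, assuming an illustrative heartbeat task and loop size:

import asyncio
import time

async def heartbeat():
    # Should tick every 0.5s; watch it go silent while the hog runs
    while True:
        print("tick", time.strftime("%H:%M:%S"))
        await asyncio.sleep(0.5)

async def cpu_hog():
    await asyncio.sleep(1)    # let a few ticks through first
    total = 0
    for i in range(10**8):    # never awaits: the heartbeat is starved
        total += i            # this takes several seconds with zero ticks
    print("hog done:", total)

async def main():
    hb = asyncio.create_task(heartbeat())
    await cpu_hog()
    hb.cancel()

asyncio.run(main())

Sprinkling await asyncio.sleep(0) into the hot loop would hand control back to the event loop each iteration, at some cost in throughput.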
The single biggest footgun
Calling sync code from async stalls the event loop.
async def handler(url):
    return requests.get(url).text  # ← blocks for 500ms
While this 500ms call runs, every other concurrent task (every websocket heartbeat, every other in-flight request) is paused. Tail latency goes from 50ms to 500ms across the board. This is the most common asyncio bug in the wild.
The fix: use an async-aware library (aiohttp instead of requests, asyncpg instead of psycopg2), or bridge with asyncio.to_thread(blocking_call).
async doesn't make code faster
Calling time.sleep(1) inside async def still sleeps the whole thread for 1 second; use await asyncio.sleep(1). Calling requests.get still blocks; use aiohttp. Adding async def to a function doesn't make it cooperative: what's inside must be cooperative too.
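A small timing sketch of the difference (durations are illustrative):

import asyncio
import time

async def blocking_sleep():
    time.sleep(1)             # blocks the whole thread: tasks run serially

async def cooperative_sleep():
    await asyncio.sleep(1)    # yields: tasks overlap their waits

async def timed(coro_fn, label):
    start = time.perf_counter()
    await asyncio.gather(coro_fn(), coro_fn())
    print(f"{label}: {time.perf_counter() - start:.1f}s")

async def main():
    await timed(blocking_sleep, "time.sleep x2")        # ~2.0s, serial
    await timed(cooperative_sleep, "asyncio.sleep x2")  # ~1.0s, concurrent

asyncio.run(main())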
When to reach for asyncio
| Workload | asyncio? |
|---|---|
| Web server handling 10K+ concurrent requests | ✅ Yes |
| Scraping 1M URLs | ✅ Yes (with bounded concurrency) |
| Long-poll / websocket fan-out (chat, notifications) | ✅ Yes |
| Image processing on 1000 files | ❌ No (CPU-bound, use ProcessPoolExecutor) |
| Quick script with 5 HTTP calls | ❌ Overkill (use ThreadPoolExecutor) |
| Mixed sync libraries that can't be replaced | ⚠️ Use asyncio.to_thread to bridge |
The bounded-concurrency pattern
Calling asyncio.gather() over a million coroutines creates a million simultaneously-active tasks and crashes. The fix is a semaphore:
sem = asyncio.Semaphore(100)
async def bounded(coro):
async with sem:
return await coro
await asyncio.gather(*[bounded(work(x)) for x in million_items])
Now at most 100 tasks are in flight at a time. This is the canonical pattern for any large fan-out.
Mixing CPU work into an async service
Real services have hot paths that are CPU-heavy: JSON parsing of huge payloads, image transforms, ML inference. Don't run those on the event loop. Hand off to loop.run_in_executor with a ProcessPoolExecutor:
from concurrent.futures import ProcessPoolExecutor

loop = asyncio.get_running_loop()
with ProcessPoolExecutor() as pool:
    result = await loop.run_in_executor(pool, cpu_heavy, args)
The event loop stays responsive; the CPU work runs on another core; the call is awaitable as usual.
The interview answer that wins
"asyncio gives concurrency without parallelism. For I/O concurrency at scale, it's unbeatable. For CPU work, hand off to a ProcessPoolExecutor. The trap to avoid is running sync libraries inside an event loop; that stalls everything."
Primitives by language
- async def / await
- asyncio.run / create_task / gather / wait
- asyncio.Queue
- asyncio.Lock / Event / Semaphore
- asyncio.to_thread (bridge to blocking code)
- aiohttp / asyncpg / aiokafka (async-aware libraries)
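The synchronization primitives get no example elsewhere in this section, so here is a minimal sketch of asyncio.Lock guarding a read-modify-write that spans an await (the counter and the sleep(0) yield are illustrative):

import asyncio

counter = 0
lock = asyncio.Lock()

async def increment():
    global counter
    async with lock:            # needed only because we await mid-update
        current = counter
        await asyncio.sleep(0)  # yield point: another task could run here
        counter = current + 1   # without the lock, updates could be lost

async def main():
    await asyncio.gather(*[increment() for _ in range(100)])
    print(counter)              # 100; without the lock, often less

asyncio.run(main())

This is the flip side of "no locks needed for shared state": a lock becomes necessary exactly when a critical section contains an await.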
Implementation
async def defines a coroutine. await pauses execution until the awaited thing completes, and during the pause, the event loop runs other tasks. asyncio.run starts a new event loop and waits for the top-level coroutine.
import asyncio

async def hello(name, delay):
    await asyncio.sleep(delay)  # yields control to event loop
    print(f"hello {name}")

async def main():
    # Run two tasks concurrently, total time = max(1, 2) = 2s, not 3s
    await asyncio.gather(
        hello("alice", 1),
        hello("bob", 2),
    )

asyncio.run(main())

aiohttp is the async HTTP client that pairs with asyncio. A semaphore caps concurrent in-flight requests so the server doesn't get DoSed. Without the semaphore, 100K simultaneous connection attempts crash everything.
import asyncio
import aiohttp

async def fetch(session, sem, url):
    async with sem:  # bounded concurrency
        async with session.get(url) as resp:
            return await resp.text()

async def main(urls):
    sem = asyncio.Semaphore(100)  # at most 100 in flight
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, sem, u) for u in urls]
        return await asyncio.gather(*tasks)

urls = [f"https://api.example.com/items/{i}" for i in range(100_000)]
asyncio.run(main(urls))

Same pattern as queue.Queue but await-able. Producers await q.put(item); consumers await q.get(). The sentinel pattern still works: put None to signal end-of-stream.
import asyncio

async def producer(q: asyncio.Queue):
    for i in range(100):
        await q.put(i)
    await q.put(None)  # sentinel

async def consumer(q: asyncio.Queue, name: str):
    while True:
        item = await q.get()
        if item is None:
            await q.put(None)  # propagate to siblings
            break
        await asyncio.sleep(0.01)  # simulate work
        print(f"{name} processed {item}")

async def main():
    q = asyncio.Queue(maxsize=10)
    await asyncio.gather(
        producer(q),
        consumer(q, "A"),
        consumer(q, "B"),
        consumer(q, "C"),
    )

asyncio.run(main())

requests.get is synchronous. Calling it inside async def blocks the entire event loop until the request completes. Every other task, including the heartbeat that keeps a websocket alive, freezes. asyncio.to_thread() runs the blocking call on a worker thread without stalling the loop.
import asyncio
import requests  # SYNCHRONOUS library
import aiohttp   # ASYNC library

# BROKEN, freezes the event loop for the duration of the request
async def bad_fetch(url):
    return requests.get(url).text  # blocks for 500ms; nothing else runs

# FIX 1, use an async-aware library
async def good_fetch(session, url):
    async with session.get(url) as r:
        return await r.text()

# FIX 2, bridge to a thread for unavoidable blocking calls
async def bridged_fetch(url):
    return await asyncio.to_thread(requests.get, url)

asyncio.wait_for raises TimeoutError after the deadline; the task gets a CancelledError raised inside it at its next await. Always either re-raise CancelledError or do brief cleanup and then re-raise; swallowing it breaks shutdown.
import asyncio

async def slow_op():
    try:
        await asyncio.sleep(10)
        return "done"
    except asyncio.CancelledError:
        # cleanup, then re-raise, DO NOT swallow
        print("cancelled, running cleanup")
        raise

async def main():
    try:
        # Wait at most 2 seconds
        result = await asyncio.wait_for(slow_op(), timeout=2)
    except asyncio.TimeoutError:
        print("timed out")

asyncio.run(main())

For CPU-heavy work inside an async service, hand it off to a process pool via loop.run_in_executor. The event loop stays responsive; the CPU work runs in parallel on another core; the result is awaitable.
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    # Bypasses the GIL because it runs in a separate process
    return sum(i * i for i in range(n))

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # Run CPU work without blocking the event loop
        results = await asyncio.gather(*[
            loop.run_in_executor(pool, cpu_heavy, 10_000_000)
            for _ in range(8)
        ])
    print(sum(results))

if __name__ == "__main__":  # guard required by multiprocessing on spawn platforms
    asyncio.run(main())

Key points
- asyncio is single-threaded: concurrency without parallelism
- await is a yield point: control returns to the event loop, other tasks run
- Tasks are cheap (~a few KB); spawning 100K is realistic
- ANY blocking call (requests.get, time.sleep, a sync DB driver) stalls the entire event loop
- asyncio.to_thread() bridges blocking code into the async world
- The library being called must be async-aware; sync libraries don't 'become' async
Tradeoffs
| Option | Pros | Cons | When to use |
|---|---|---|---|
| asyncio (single-threaded) | Tasks cost a few KB; scales to 100K+ concurrent I/O ops; races only at await boundaries | One blocking call stalls everything; needs async-aware libraries | Massive I/O concurrency: web servers, scrapers, message brokers, websocket fan-out |
| threading | Works with existing sync libraries; easy to retrofit | Megabyte stacks per thread; GIL blocks CPU parallelism; preemptive races | Moderate I/O concurrency, mixed sync/async libraries, gradual adoption |
| multiprocessing | True parallelism; bypasses the GIL | Heavy per-process overhead; data serialized between processes | CPU-bound work; keep it separate from the async event loop |
Follow-up questions
- Why is asyncio single-threaded?
- What's the difference between gather and wait?
- How is an asyncio task cancelled?
- Can asyncio run inside a thread?
- Why does requests not work with asyncio?
Gotchas
- Forgetting `await`: calling a coroutine function creates the coroutine but never runs it; Python emits a 'coroutine was never awaited' warning (see the sketch after this list)
- Calling sync DB drivers inside async code: silent event-loop stall, tail latency spikes
- Swallowing CancelledError: breaks asyncio.wait_for and graceful shutdown
- Creating tasks without awaiting or storing them: the task may be garbage-collected mid-run (also in the sketch below)
- Calling asyncio.run() inside an already-running event loop: RuntimeError; use await (or asyncio.get_running_loop()) instead
- Using time.sleep in async code; asyncio.sleep is the right tool (the sync one blocks the loop)
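A minimal sketch of the first and fourth gotchas (the work coroutine is illustrative):

import asyncio

async def work():
    await asyncio.sleep(0.1)
    print("work ran")

async def main():
    work()                              # BUG: creates a coroutine, never runs it
                                        # -> RuntimeWarning: coroutine 'work' was never awaited

    asyncio.create_task(work())         # RISKY: no reference kept; the task can be
                                        # garbage-collected before it finishes

    task = asyncio.create_task(work())  # OK: keep a reference...
    await task                          # ...and await (or gather) it eventually

asyncio.run(main())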
Common pitfalls
- Treating asyncio as 'free parallelism': it's concurrency, not parallelism
- Async-ifying CPU-bound code: adds overhead with no benefit
- Library mismatch: half the dependencies are sync, half are async, and the result is buggy
- Unbounded asyncio.gather() over 1M tasks: memory blows up; use a Semaphore to bound it
Practice problems
- Concurrent port scanner: asyncio + Semaphore to bound concurrency, asyncio.open_connection for each port, gather results
- Rate limiter: token bucket or sliding window using asyncio.Lock + asyncio.sleep (minimal sketch below)
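A minimal token-bucket sketch for the second problem (rate and capacity are illustrative, not a definitive implementation):

import asyncio
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: int):
        self.rate = rate                  # tokens refilled per second
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic()
        self.lock = asyncio.Lock()

    async def acquire(self):
        while True:
            async with self.lock:
                now = time.monotonic()
                # Refill based on elapsed time, capped at capacity
                self.tokens = min(self.capacity,
                                  self.tokens + (now - self.updated) * self.rate)
                self.updated = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                wait = (1 - self.tokens) / self.rate
            await asyncio.sleep(wait)     # sleep outside the lock, then retry

async def main():
    bucket = TokenBucket(rate=5, capacity=5)  # ~5 acquisitions per second
    async def call(i):
        await bucket.acquire()
        print(f"request {i} at {time.monotonic():.2f}")
    await asyncio.gather(*[call(i) for i in range(15)])

asyncio.run(main())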
APIs worth memorising
- asyncio: run, create_task, gather, wait, wait_for, timeout, sleep, Queue, Lock, Event, Semaphore, to_thread
- Async libraries: aiohttp, asyncpg, aiokafka, motor (mongo), httpx
- Bridges: asyncio.to_thread, loop.run_in_executor (for ProcessPoolExecutor)
FastAPI is built on asyncio. Sanic, Starlette, and aiohttp are pure async frameworks. Discord's Python service handles 5M+ concurrent voice connections via asyncio. Most modern Python infrastructure-y libraries (Kafka, Postgres, Redis clients) have async variants.