asyncio Fundamentals
asyncio runs many tasks on a single thread by interleaving them at await points. It scales to 100K+ concurrent I/O operations because each task is just a few KB instead of a thread's megabytes, but a single CPU-heavy synchronous call freezes the entire event loop.
[Diagram: the event loop interleaving Task A and Task B at await points]
What it is
asyncio is Python's built-in framework for cooperative single-threaded concurrency. It lets you write code that reads sequentially but interleaves at every await point, running thousands of tasks "concurrently" on a single thread.
The core idea: a coroutine is a function that can pause itself with await. The event loop runs one coroutine at a time; when one pauses, another runs. There's no preemption, no thread switching, no locks needed for shared state, just explicit yield points.
Why it matters
For I/O-heavy workloads (web servers, scrapers, gateways, message brokers), asyncio scales further than threads with much less memory. Discord's Python service handles millions of concurrent voice connections on asyncio. FastAPI services routinely handle 10K+ requests/sec on a single process because each request is a tiny coroutine, not a megabyte-stack thread.
The mental model
Threads = preemptive parallel waiting. asyncio = cooperative interleaved waiting. Both achieve concurrency for I/O. asyncio is cheaper per task; threads are easier to retrofit into existing sync code.
How it works
The diagram above traces it: the loop runs Task A until it hits an await, switches to Task B, comes back to A when its I/O completes.
async def fetch(session, url):
    async with session.get(url) as r:  # await yields here
        return await r.text()          # await yields here
The function pauses at each await. The event loop notes "this task is waiting for the network", switches to another task, and comes back when the network response arrives. The code reads top-to-bottom; to the runtime, it's a state machine that gets resumed across many event-loop iterations.
await is a cooperative yield
The function explicitly tells the runtime "I'm waiting on something, go run other work". Compare to threads, where the OS preempts at unpredictable points. Cooperative scheduling means race conditions are rare (they can only occur at await boundaries), but it also means the code is responsible for yielding: a long synchronous loop never yields, and the event loop stalls, as the sketch below shows.
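A minimal sketch of the stall, assuming an illustrative heartbeat task and loop size:

import asyncio
import time

async def heartbeat():
    # Should tick every 0.5s; watch it go silent while the hog runs
    while True:
        print("tick", time.strftime("%H:%M:%S"))
        await asyncio.sleep(0.5)

async def cpu_hog():
    await asyncio.sleep(1)    # let a few ticks through first
    total = 0
    for i in range(10**8):    # never awaits: the heartbeat is starved
        total += i            # this takes several seconds with zero ticks
    print("hog done:", total)

async def main():
    hb = asyncio.create_task(heartbeat())
    await cpu_hog()
    hb.cancel()

asyncio.run(main())

Sprinkling await asyncio.sleep(0) into the hot loop would hand control back to the event loop each iteration, at some cost in throughput.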
The single biggest footgun
Calling sync code from async stalls the event loop.
async def handler(url):
    return requests.get(url).text  # ← blocks for 500ms
While this 500ms call runs, every other concurrent task (every websocket heartbeat, every other in-flight request) is paused. Tail latency goes from 50ms to 500ms across the board. This is the most common asyncio bug in the wild.
The fix: use an async-aware library (aiohttp instead of requests, asyncpg instead of psycopg2), or bridge with asyncio.to_thread(blocking_call).
async doesn't make code faster
Calling time.sleep(1) inside async def still sleeps the whole thread for 1 second; use await asyncio.sleep(1). Calling requests.get still blocks; use aiohttp. Adding async def to a function doesn't make it cooperative: what's inside must be cooperative too.
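A small timing sketch of the difference (durations are illustrative):

import asyncio
import time

async def blocking_sleep():
    time.sleep(1)             # blocks the whole thread: tasks run serially

async def cooperative_sleep():
    await asyncio.sleep(1)    # yields: tasks overlap their waits

async def timed(coro_fn, label):
    start = time.perf_counter()
    await asyncio.gather(coro_fn(), coro_fn())
    print(f"{label}: {time.perf_counter() - start:.1f}s")

async def main():
    await timed(blocking_sleep, "time.sleep x2")        # ~2.0s, serial
    await timed(cooperative_sleep, "asyncio.sleep x2")  # ~1.0s, concurrent

asyncio.run(main())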
When to reach for asyncio
| Workload | asyncio? |
|---|---|
| Web server handling 10K+ concurrent requests | ✅ Yes |
| Scraping 1M URLs | ✅ Yes (with bounded concurrency) |
| Long-poll / websocket fan-out (chat, notifications) | ✅ Yes |
| Image processing on 1000 files | ❌ No (CPU-bound, use ProcessPoolExecutor) |
| Quick script with 5 HTTP calls | ❌ Overkill (use ThreadPoolExecutor) |
| Mixed sync libraries that can't be replaced | ⚠️ Use asyncio.to_thread to bridge |
The bounded-concurrency pattern
Calling asyncio.gather() over a million coroutines creates a million simultaneously-active tasks and crashes. The fix is a semaphore:
sem = asyncio.Semaphore(100)
async def bounded(coro):
async with sem:
return await coro
await asyncio.gather(*[bounded(work(x)) for x in million_items])
Now at most 100 tasks are in flight at a time. This is the canonical pattern for any large fan-out.
Mixing CPU work into an async service
Real services have hot paths that are CPU-heavy: JSON parsing of huge payloads, image transforms, ML inference. Don't run those on the event loop. Hand off to loop.run_in_executor with a ProcessPoolExecutor:
from concurrent.futures import ProcessPoolExecutor

loop = asyncio.get_running_loop()
with ProcessPoolExecutor() as pool:
    result = await loop.run_in_executor(pool, cpu_heavy, args)
The event loop stays responsive; the CPU work runs on another core; the call is awaitable as usual.
The interview answer that wins
"asyncio gives concurrency without parallelism. For I/O concurrency at scale, it's unbeatable. For CPU work, hand off to a ProcessPoolExecutor. The trap to avoid is running sync libraries inside an event loop; that stalls everything."
Primitives by language
- async def / await
- asyncio.run / create_task / gather / wait
- asyncio.Queue
- asyncio.Lock / Event / Semaphore
- asyncio.to_thread (bridge to blocking code)
- aiohttp / asyncpg / aiokafka (async-aware libraries)
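The synchronization primitives get no example elsewhere in this section, so here is a minimal sketch of asyncio.Lock guarding a read-modify-write that spans an await (the counter and the sleep(0) yield are illustrative):

import asyncio

counter = 0
lock = asyncio.Lock()

async def increment():
    global counter
    async with lock:            # needed only because we await mid-update
        current = counter
        await asyncio.sleep(0)  # yield point: another task could run here
        counter = current + 1   # without the lock, updates could be lost

async def main():
    await asyncio.gather(*[increment() for _ in range(100)])
    print(counter)              # 100; without the lock, often less

asyncio.run(main())

This is the flip side of "no locks needed for shared state": a lock becomes necessary exactly when a critical section contains an await.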
Implementation
async def defines a coroutine. await pauses execution until the awaited thing completes, and during the pause, the event loop runs other tasks. asyncio.run starts a new event loop and waits for the top-level coroutine.
import asyncio

async def hello(name, delay):
    await asyncio.sleep(delay)  # yields control to event loop
    print(f"hello {name}")

async def main():
    # Run two tasks concurrently, total time = max(1, 2) = 2s, not 3s
    await asyncio.gather(
        hello("alice", 1),
        hello("bob", 2),
    )

asyncio.run(main())

aiohttp is the async HTTP client that pairs with asyncio. A semaphore caps concurrent in-flight requests so the server doesn't get DoSed. Without the semaphore, 100K simultaneous connection attempts crash everything.
import asyncio
import aiohttp

async def fetch(session, sem, url):
    async with sem:  # bounded concurrency
        async with session.get(url) as resp:
            return await resp.text()

async def main(urls):
    sem = asyncio.Semaphore(100)  # at most 100 in flight
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, sem, u) for u in urls]
        return await asyncio.gather(*tasks)

urls = [f"https://api.example.com/items/{i}" for i in range(100_000)]
asyncio.run(main(urls))

Same pattern as queue.Queue but await-able. Producers await q.put(item); consumers await q.get(). The sentinel pattern still works: put None to signal end-of-stream.
import asyncio

async def producer(q: asyncio.Queue):
    for i in range(100):
        await q.put(i)
    await q.put(None)  # sentinel

async def consumer(q: asyncio.Queue, name: str):
    while True:
        item = await q.get()
        if item is None:
            await q.put(None)  # propagate to siblings
            break
        await asyncio.sleep(0.01)  # simulate work
        print(f"{name} processed {item}")

async def main():
    q = asyncio.Queue(maxsize=10)
    await asyncio.gather(
        producer(q),
        consumer(q, "A"),
        consumer(q, "B"),
        consumer(q, "C"),
    )

asyncio.run(main())

requests.get is synchronous. Calling it inside async def blocks the entire event loop until the request completes. Every other task, including the heartbeat that keeps a websocket alive, freezes. asyncio.to_thread() runs the blocking call on a worker thread without stalling the loop.
import asyncio
import requests  # SYNCHRONOUS library
import aiohttp   # ASYNC library

# BROKEN, freezes the event loop for the duration of the request
async def bad_fetch(url):
    return requests.get(url).text  # blocks for 500ms; nothing else runs

# FIX 1, use an async-aware library
async def good_fetch(session, url):
    async with session.get(url) as r:
        return await r.text()

# FIX 2, bridge to a thread for unavoidable blocking calls
async def bridged_fetch(url):
    return await asyncio.to_thread(requests.get, url)

asyncio.wait_for raises TimeoutError after the deadline; the task gets a CancelledError raised inside it at its next await. Always either re-raise CancelledError or do brief cleanup and then re-raise; swallowing it breaks shutdown.
import asyncio

async def slow_op():
    try:
        await asyncio.sleep(10)
        return "done"
    except asyncio.CancelledError:
        # cleanup, then re-raise, DO NOT swallow
        print("cancelled, running cleanup")
        raise

async def main():
    try:
        # Wait at most 2 seconds
        result = await asyncio.wait_for(slow_op(), timeout=2)
    except asyncio.TimeoutError:
        print("timed out")

asyncio.run(main())

For CPU-heavy work inside an async service, hand it off to a process pool via loop.run_in_executor. The event loop stays responsive; the CPU work runs in parallel on another core; the result is awaitable.
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    # Bypasses the GIL because it runs in a separate process
    return sum(i * i for i in range(n))

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # Run CPU work without blocking the event loop
        results = await asyncio.gather(*[
            loop.run_in_executor(pool, cpu_heavy, 10_000_000)
            for _ in range(8)
        ])
    print(sum(results))

if __name__ == "__main__":  # guard required by multiprocessing on spawn platforms
    asyncio.run(main())

Key points
- asyncio is single-threaded: concurrency without parallelism
- await is a yield point: control returns to the event loop, other tasks run
- Tasks are cheap (~a few KB); spawning 100K is realistic
- ANY blocking call (requests.get, time.sleep, a sync DB driver) stalls the entire event loop
- asyncio.to_thread() bridges blocking code into the async world
- The library being called must be async-aware; sync libraries don't 'become' async
Tradeoffs
| Option | Pros | Cons | When to use |
|---|---|---|---|
| asyncio (single-threaded) | Tasks cost a few KB; scales to 100K+ concurrent I/O ops; races only at await boundaries | One blocking call stalls everything; needs async-aware libraries | Massive I/O concurrency: web servers, scrapers, message brokers, websocket fan-out |
| threading | Works with existing sync libraries; easy to retrofit | Megabyte stacks per thread; GIL blocks CPU parallelism; preemptive races | Moderate I/O concurrency, mixed sync/async libraries, gradual adoption |
| multiprocessing | True parallelism; bypasses the GIL | Heavy per-process overhead; data serialized between processes | CPU-bound work; keep it separate from the async event loop |
Follow-up questions
- Why is asyncio single-threaded?
- What's the difference between gather and wait?
- How is an asyncio task cancelled?
- Can asyncio run inside a thread?
- Why does requests not work with asyncio?
Gotchas
- Forgetting `await`: calling a coroutine function creates the coroutine but never runs it; Python emits a 'coroutine was never awaited' warning (see the sketch after this list)
- Calling sync DB drivers inside async code: silent event-loop stall, tail latency spikes
- Swallowing CancelledError: breaks asyncio.wait_for and graceful shutdown
- Creating tasks without awaiting or storing them: the task may be garbage-collected mid-run (also in the sketch below)
- Calling asyncio.run() inside an already-running event loop: RuntimeError; use await (or asyncio.get_running_loop()) instead
- Using time.sleep in async code; asyncio.sleep is the right tool (the sync one blocks the loop)
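A minimal sketch of the first and fourth gotchas (the work coroutine is illustrative):

import asyncio

async def work():
    await asyncio.sleep(0.1)
    print("work ran")

async def main():
    work()                              # BUG: creates a coroutine, never runs it
                                        # -> RuntimeWarning: coroutine 'work' was never awaited

    asyncio.create_task(work())         # RISKY: no reference kept; the task can be
                                        # garbage-collected before it finishes

    task = asyncio.create_task(work())  # OK: keep a reference...
    await task                          # ...and await (or gather) it eventually

asyncio.run(main())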
Common pitfalls
- Treating asyncio as 'free parallelism': it's concurrency, not parallelism
- Async-ifying CPU-bound code: adds overhead with no benefit
- Library mismatch: half the dependencies are sync, half are async, and the result is buggy
- Unbounded asyncio.gather() over 1M tasks: memory blows up; use a Semaphore to bound it
Practice problems
- Concurrent port scanner: asyncio + Semaphore to bound concurrency, asyncio.open_connection for each port, gather results
- Rate limiter: token bucket or sliding window using asyncio.Lock + asyncio.sleep (minimal sketch below)
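A minimal token-bucket sketch for the second problem (rate and capacity are illustrative, not a definitive implementation):

import asyncio
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: int):
        self.rate = rate                  # tokens refilled per second
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic()
        self.lock = asyncio.Lock()

    async def acquire(self):
        while True:
            async with self.lock:
                now = time.monotonic()
                # Refill based on elapsed time, capped at capacity
                self.tokens = min(self.capacity,
                                  self.tokens + (now - self.updated) * self.rate)
                self.updated = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                wait = (1 - self.tokens) / self.rate
            await asyncio.sleep(wait)     # sleep outside the lock, then retry

async def main():
    bucket = TokenBucket(rate=5, capacity=5)  # ~5 acquisitions per second
    async def call(i):
        await bucket.acquire()
        print(f"request {i} at {time.monotonic():.2f}")
    await asyncio.gather(*[call(i) for i in range(15)])

asyncio.run(main())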
APIs worth memorising
- asyncio: run, create_task, gather, wait, wait_for, timeout, sleep, Queue, Lock, Event, Semaphore, to_thread
- Async libraries: aiohttp, asyncpg, aiokafka, motor (mongo), httpx
- Bridges: asyncio.to_thread, loop.run_in_executor (for ProcessPoolExecutor)
FastAPI is built on asyncio. Sanic, Starlette, and aiohttp are pure async frameworks. Discord's Python service handles 5M+ concurrent voice connections via asyncio. Most modern Python infrastructure-y libraries (Kafka, Postgres, Redis clients) have async variants.