When NOT to Use Concurrency
The best concurrent code is usually no concurrent code. Every added thread or task costs memory, makes debugging harder, and opens the door to race conditions. Reach for concurrency only after measuring a real bottleneck whose nature (I/O wait, CPU parallelism, latency hiding) actually benefits from it.
The antidote to the rest of the curriculum
Concurrency is a tool with a real cost: more memory, more bugs, harder debugging, and more complex code that every future contributor pays for. The best concurrent code is often no concurrent code: a single thread doing one thing at a time.
This lesson is the design-thinking layer that separates engineers who reach for concurrency reflexively from those who reach for it deliberately.
The concurrency tax
Every spawned thread costs:
| Cost | Detail |
|---|---|
| Memory | ~1 MB per OS thread, ~2 KB per goroutine, plus runtime bookkeeping |
| Cognitive | Shared state, locks, ordering, cancellation, shutdown |
| Debugging | Races don't reproduce on demand; production hangs are 10x harder than serial bugs |
| Maintenance | Every future contributor pays the complexity tax forever |
Pay this tax only when the benefit is measured and meaningful.
When NOT to add concurrency
Common false alarms
- "This script is slow." → Profile first. The fix is often an algorithm change or a missing index, not threads.
- "This service might scale." → Run it. Horizontal scaling (more processes/instances) is almost always simpler than in-process concurrency.
- "Async is the modern way." → Async is right for specific workloads (10K+ concurrent connections). For most CRUD apps, sync is faster to write and debug, and just as fast at runtime.
- "This Python script is CPU-heavy, let's thread it." → The GIL means threads won't help. Either use multiprocessing or rewrite the hot path.
- "Future-proofing for scale." → Future scale is unknown. Build for the load that exists, scale when it actually arrives.
When SHOULD concurrency be added
The honest checklist:
- The bottleneck is measured. Profiler in hand, not "it feels slow" (see the profiling sketch after this list).
- The bottleneck is concurrency-fixable. If the bottleneck is the database, faster app-server threads don't help.
- The workload nature matches the tool. I/O-bound + many concurrent ops → threads/async. CPU-bound + parallelizable → processes/goroutines.
- The cost is articulable. Memory, complexity, lock strategy, shutdown plan.
- Scaling out has been considered. Adding a second process is often simpler than adding threads to one process.
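What "profiler in hand" can look like, as a minimal sketch using Python's standard-library profiler; `slow_path` is a hypothetical stand-in for the code under investigation:

```python
import cProfile
import pstats

def slow_path():
    # Hypothetical stand-in for the real workload being investigated.
    return sum(i * i for i in range(1_000_000))

profiler = cProfile.Profile()
profiler.enable()
slow_path()
profiler.disable()

# Read the top offenders BEFORE deciding whether threads would help:
# if the time is in one hot function, fix the algorithm, not the threading.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```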
The "single-core rule"
If it can run on one core, run it on one core. Redis serves 100K+ ops/sec on a single thread. Node.js handles millions of users on one event loop per process. SQLite is single-writer and powers more apps than every distributed database combined.
The skill of writing scaling-friendly single-threaded code (efficient algorithms, batching, avoiding redundant work) pays off more than mastering concurrency primitives. When the time to scale out arrives, scaling means adding more single-threaded processes, not threading the existing one.
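A minimal sketch of the batching point, using Python's bundled SQLite; the table and row count are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER, email TEXT)")
rows = [(i, f"user{i}@example.com") for i in range(10_000)]

# One batched statement instead of 10,000 single-row INSERTs:
# still a single thread, a fraction of the overhead, no locks to reason about.
conn.executemany("INSERT INTO contacts (id, email) VALUES (?, ?)", rows)
conn.commit()
```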
The pattern that works at almost every scale
Process per CPU core (gunicorn, Puma, Unicorn, Fastify cluster mode)
    ↓
Each process: single-threaded event loop OR small thread pool
    ↓
External services for fan-out (DB connection pool, message queue, cache)
This is what powers Instagram (Django), GitHub (Rails), Shopify (Rails), Stripe (Ruby), Stack Overflow (.NET). None of them solved scaling by making in-process code more concurrent. They solved it by running more processes and offloading concurrency-heavy work to specialized infrastructure (Kafka, Redis, ElasticSearch).
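One concrete way this pattern is commonly expressed is a gunicorn config file; this is a sketch, and the values are starting points to tune, not recommendations:

```python
# gunicorn.conf.py
import multiprocessing

workers = multiprocessing.cpu_count()  # one worker process per core
threads = 1                            # each worker stays single-threaded
worker_class = "sync"                  # plain synchronous workers; no async needed
```

gunicorn loads it via `gunicorn -c gunicorn.conf.py myapp.wsgi` (module path assumed); Puma and Fastify's cluster mode express the same shape in their own configs.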
The interview answer that wins
When asked "how can this be made faster?", don't lead with "parallelize it"; lead with "profile it." Then articulate: what's the bottleneck, is it amenable to concurrency, and what's the cheapest fix that addresses it. Engineers who reflexively reach for concurrency get marked down. Engineers who measure first and apply the right tool get hired.
A useful self-check
Before adding concurrency to existing code, answer these in writing:
- What metric will improve, and by how much?
- What's the simplest single-threaded version, and have its optimizations been exhausted?
- What new failure modes does this introduce (races, deadlocks, partial failures)?
- What's the shutdown plan?
- Is there a way to scale out (more processes/instances) instead of scaling up (more concurrency in-process)?
If the first two answers aren't crisp, the concurrency probably isn't worth it.
Implementations
Reaching for ExecutorService to handle 10 background tasks per minute is overkill. The complexity (shutdown coordination, task tracking, exception handling) outweighs the benefit. A single dedicated thread, or a single-threaded ScheduledExecutorService for recurring work, is simpler and easier to reason about.
```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// OVERKILL for occasional background work
ExecutorService pool = Executors.newFixedThreadPool(8);
pool.submit(() -> sendDailySummaryEmail());
// ... shutdown logic, task tracking, exception handling, monitoring ...

// SIMPLER for low-frequency work
new Thread(() -> sendDailySummaryEmail(), "daily-summary").start();

// OR, for scheduled work
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
scheduler.scheduleAtFixedRate(() -> sendDailySummaryEmail(),
        0, 24, TimeUnit.HOURS);
```

Key points
- Concurrency is a cost; pay it only when the benefit is measured, not assumed
- Premature concurrency is worse than premature optimization; it can't be taken back without rewriting
- If the bottleneck is the database, adding threads in the app server doesn't help
- Under 1000 simultaneous users, async/await is rarely needed
- Single-threaded event-loop services (Redis, Node.js) prove single-threaded scales further than people think
- The single-core rule: if it can be solved on one core, do it on one core
Tradeoffs
| Option | Pros | Cons | When to use |
|---|---|---|---|
| Single-threaded / synchronous | Simplest to write, debug, and reason about; no races, no locks | Uses one core; a slow I/O call blocks everything behind it | Most code. Until a bottleneck is measured, this is the right answer. |
| Multi-threaded / multi-process | Real parallelism for CPU-bound work; uses every core | Memory per thread/process; races, locks, and debugging far harder than serial code | Measured CPU-bound bottleneck; tasks are coarse enough to amortize overhead |
| Async / event-loop | Handles thousands of concurrent connections with little memory per connection | Infects the whole call stack; CPU-bound work stalls the loop | Measured >1K concurrent connections OR connection memory is the bottleneck |
Follow-up questions
- When should code become concurrent?
- If concurrency is so risky, why does any production code use it?
- What's the cheapest 'concurrency' that's actually free?
- Is async worth using even for a small service?
- When is concurrency NOT a choice, i.e., truly required?
Gotchas
- threading.Thread() in a script that runs <1 minute → complexity cost paid for nothing
- asyncio.gather on 10 items → not enough concurrency to matter; just loop (see the sketch below)
- Adding sharding before measuring single-shard limits → premature, costs cohesion
- Concurrent code reads as 'sophisticated' in code reviews; bias toward simplicity instead
- 'It might scale' is not a reason to add concurrency; measure first
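The asyncio.gather gotcha above, sketched with a stand-in coroutine; `fetch` just sleeps here, standing in for a short HTTP or DB call:

```python
import asyncio

async def fetch(i: int) -> int:
    await asyncio.sleep(0.01)  # stand-in for one short I/O call
    return i

async def main():
    # Plain sequential awaits: trivially correct, errors surface one at a time.
    results = [await fetch(i) for i in range(10)]

    # gather would save some wall-clock time at n=10, but buys extra
    # failure-handling complexity that rarely pays off at this scale:
    # results = await asyncio.gather(*(fetch(i) for i in range(10)))
    print(results)

asyncio.run(main())
```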
Common pitfalls
- Premature concurrency, added before there's a measured need
- Over-engineered thread pools for sequential work
- Async migration projects that take quarters and don't move user-facing metrics
- Microservices that turn an in-process call into a distributed concurrency problem
Redis is single-threaded; 100K+ ops/sec on one core proves how far simple scales. Django and Flask power some of the web's largest sites via process-per-core, not thread-per-request. Most YC-stage startups don't need async, threads, or distributed systems; they need a faster database query.