When NOT to Use Concurrency
The best concurrent code is usually no concurrent code. Every added thread or task costs memory, makes debugging harder, and opens the door to race conditions. Reach for concurrency only after measuring a real bottleneck whose nature (I/O wait, CPU parallelism, latency hiding) actually benefits from it.
The antidote to the rest of the curriculum
Concurrency is a tool with a real cost: more memory, more bugs, harder debugging, and more complex code that every future contributor pays for. The best concurrent code is often no concurrent code: a single thread doing one thing at a time.
This lesson is the design-thinking layer that separates engineers who reach for concurrency reflexively from those who reach for it deliberately.
The concurrency tax
Every spawned thread costs:
| Cost | Detail |
|---|---|
| Memory | ~1 MB per OS thread, ~2 KB per goroutine, plus runtime bookkeeping |
| Cognitive | Shared state, locks, ordering, cancellation, shutdown |
| Debugging | Races don't reproduce on demand; production hangs are 10x harder than serial bugs |
| Maintenance | Every future contributor pays the complexity tax forever |
Pay this tax only when the benefit is measured and meaningful.
When NOT to add concurrency
Common false alarms
- "This script is slow." → Profile first. The fix is often an algorithm change or a missing index, not threads.
- "This service might scale." → Run it. Horizontal scaling (more processes/instances) is almost always simpler than in-process concurrency.
- "Async is the modern way." → Async is right for specific workloads (10K+ concurrent connections). For most CRUD apps, sync is faster to write and debug, and just as fast at runtime.
- "This Python script is CPU-heavy, let's thread it." → The GIL means threads won't help. Either use multiprocessing or rewrite the hot path.
- "Future-proofing for scale." → Future scale is unknown. Build for the load that exists, scale when it actually arrives.
When SHOULD concurrency be added
The honest checklist:
- The bottleneck is measured. Profiler in hand, not "it feels slow" (see the profiling sketch after this list).
- The bottleneck is concurrency-fixable. If the bottleneck is the database, faster app-server threads don't help.
- The workload nature matches the tool. I/O-bound + many concurrent ops → threads/async. CPU-bound + parallelizable → processes/goroutines.
- The cost is articulable. Memory, complexity, lock strategy, shutdown plan.
- Scaling out has been considered. Adding a second process is often simpler than adding threads to one process.
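What "profiler in hand" can look like, as a minimal sketch using Python's standard-library profiler; `slow_path` is a hypothetical stand-in for the code under investigation:

```python
import cProfile
import pstats

def slow_path():
    # Hypothetical stand-in for the real workload being investigated.
    return sum(i * i for i in range(1_000_000))

profiler = cProfile.Profile()
profiler.enable()
slow_path()
profiler.disable()

# Read the top offenders BEFORE deciding whether threads would help:
# if the time is in one hot function, fix the algorithm, not the threading.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```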
The "single-core rule"
If it can run on one core, run it on one core. Redis serves 100K+ ops/sec on a single thread. Node.js handles millions of users on one event loop per process. SQLite is single-writer and powers more apps than every distributed database combined.
The skill of writing scaling-friendly single-threaded code (efficient algorithms, batching, avoiding redundant work) pays off more than mastering concurrency primitives. When the time to scale out arrives, scaling means adding more single-threaded processes, not threading the existing one.
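A minimal sketch of the batching point, using Python's bundled SQLite; the table and row count are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER, email TEXT)")
rows = [(i, f"user{i}@example.com") for i in range(10_000)]

# One batched statement instead of 10,000 single-row INSERTs:
# still a single thread, a fraction of the overhead, no locks to reason about.
conn.executemany("INSERT INTO contacts (id, email) VALUES (?, ?)", rows)
conn.commit()
```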
The pattern that works at almost every scale
Process per CPU core (gunicorn, Puma, Unicorn, Fastify cluster mode)
    ↓
Each process: single-threaded event loop OR small thread pool
    ↓
External services for fan-out (DB connection pool, message queue, cache)
This is what powers Instagram (Django), GitHub (Rails), Shopify (Rails), Stripe (Ruby), Stack Overflow (.NET). None of them solved scaling by making in-process code more concurrent. They solved it by running more processes and offloading concurrency-heavy work to specialized infrastructure (Kafka, Redis, ElasticSearch).
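One concrete way this pattern is commonly expressed is a gunicorn config file; this is a sketch, and the values are starting points to tune, not recommendations:

```python
# gunicorn.conf.py
import multiprocessing

workers = multiprocessing.cpu_count()  # one worker process per core
threads = 1                            # each worker stays single-threaded
worker_class = "sync"                  # plain synchronous workers; no async needed
```

gunicorn loads it via `gunicorn -c gunicorn.conf.py myapp.wsgi` (module path assumed); Puma and Fastify's cluster mode express the same shape in their own configs.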
The interview answer that wins
When asked "how can this be made faster?", don't lead with "parallelize it"; lead with "profile it." Then articulate: what's the bottleneck, is it amenable to concurrency, and what's the cheapest fix that addresses it. Engineers who reflexively reach for concurrency get marked down. Engineers who measure first and apply the right tool get hired.
A useful self-check
Before adding concurrency to existing code, answer these in writing:
- What metric will improve, and by how much?
- What's the simplest single-threaded version, and have its optimizations been exhausted?
- What new failure modes does this introduce (races, deadlocks, partial failures)?
- What's the shutdown plan?
- Is there a way to scale out (more processes/instances) instead of scaling up (more concurrency in-process)?
If the first two answers aren't crisp, the concurrency probably isn't worth it.
Implementations
Reaching for ExecutorService to handle 10 background tasks per minute is overkill. The complexity (shutdown coordination, task tracking, exception handling) outweighs the benefit. A single dedicated thread, or a single-threaded ScheduledExecutorService for recurring work, is simpler and easier to reason about.
```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// OVERKILL for occasional background work
ExecutorService pool = Executors.newFixedThreadPool(8);
pool.submit(() -> sendDailySummaryEmail());
// ... shutdown logic, task tracking, exception handling, monitoring ...

// SIMPLER for low-frequency work
new Thread(() -> sendDailySummaryEmail(), "daily-summary").start();

// OR, for scheduled work
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
scheduler.scheduleAtFixedRate(() -> sendDailySummaryEmail(),
        0, 24, TimeUnit.HOURS);
```

Key points
- Concurrency is a cost; pay it only when the benefit is measured, not assumed
- Premature concurrency is worse than premature optimization; it can't be taken back without rewriting
- If the bottleneck is the database, adding threads in the app server doesn't help
- Under 1000 simultaneous users, async/await is rarely needed
- Single-threaded event-loop services (Redis, Node.js) prove single-threaded scales further than people think
- The single-core rule: if it can be solved on one core, do it on one core
Tradeoffs
| Option | Pros | Cons | When to use |
|---|---|---|---|
| Single-threaded / synchronous | Simplest to write, debug, and reason about; no races, no locks | Uses one core; a slow I/O call blocks everything behind it | Most code. Until a bottleneck is measured, this is the right answer. |
| Multi-threaded / multi-process | Real parallelism for CPU-bound work; uses every core | Memory per thread/process; races, locks, and debugging far harder than serial code | Measured CPU-bound bottleneck; tasks are coarse enough to amortize overhead |
| Async / event-loop | Handles thousands of concurrent connections with little memory per connection | Infects the whole call stack; CPU-bound work stalls the loop | Measured >1K concurrent connections OR connection memory is the bottleneck |
Follow-up questions
- When should code become concurrent?
- If concurrency is so risky, why does any production code use it?
- What's the cheapest 'concurrency' that's actually free?
- Is async worth using even for a small service?
- When is concurrency NOT a choice, i.e., truly required?
Gotchas
- threading.Thread() in a script that runs <1 minute → complexity cost paid for nothing
- asyncio.gather on 10 items → not enough concurrency to matter; just loop (see the sketch below)
- Adding sharding before measuring single-shard limits → premature, costs cohesion
- Concurrent code reads as 'sophisticated' in code reviews; bias toward simplicity instead
- 'It might scale' is not a reason to add concurrency; measure first
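The asyncio.gather gotcha above, sketched with a stand-in coroutine; `fetch` just sleeps here, standing in for a short HTTP or DB call:

```python
import asyncio

async def fetch(i: int) -> int:
    await asyncio.sleep(0.01)  # stand-in for one short I/O call
    return i

async def main():
    # Plain sequential awaits: trivially correct, errors surface one at a time.
    results = [await fetch(i) for i in range(10)]

    # gather would save some wall-clock time at n=10, but buys extra
    # failure-handling complexity that rarely pays off at this scale:
    # results = await asyncio.gather(*(fetch(i) for i in range(10)))
    print(results)

asyncio.run(main())
```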
Common pitfalls
- Premature concurrency, added before there's a measured need
- Over-engineered thread pools for sequential work
- Async migration projects that take quarters and don't move user-facing metrics
- Microservices that turn an in-process call into a distributed concurrency problem
Redis is single-threaded; 100K+ ops/sec on one core proves how far simple scales. Django and Flask power some of the web's largest sites via process-per-core, not thread-per-request. Most YC-stage startups don't need async, threads, or distributed systems; they need a faster database query.