Concurrency vs Parallelism
Concurrency is structuring a program to deal with many things at once; parallelism is actually doing many things at the same time. One can exist without the other.
Diagram
Two words, two different things
CONCURRENCY                         PARALLELISM
-----------                         -----------
About STRUCTURE                     About EXECUTION
"How my program is organized"       "What's actually running right now"
Tasks can take turns on 1 core      Tasks really run at the same instant
                                    on different hardware
A program can be:
- Concurrent on one core (tasks take turns; asyncio, Python with the GIL).
- Parallel without much concurrency (a SIMD matrix multiply).
- Both at once (a multi-threaded web server on multiple cores).
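The first case can be sketched with asyncio — a minimal, hypothetical three-task example where every task runs on one thread and each `await` hands the core to the next task:

```python
import asyncio

async def task(name: str, log: list) -> None:
    # Each await hands control back to the event loop,
    # so the three tasks interleave on a single thread.
    for step in range(3):
        log.append(f"{name}{step}")
        await asyncio.sleep(0)  # yield to the other tasks

async def main() -> list:
    log = []
    await asyncio.gather(task("A", log), task("B", log), task("C", log))
    return log

log = asyncio.run(main())
print(log)  # steps from A, B, C interleaved, not grouped by task
```

Only one task ever executes at a given instant, yet all three make progress: concurrency without parallelism.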
A picture
In each picture below, read each vertical column top to bottom: that column shows what is running on every core at that wall-clock instant.
ONE CORE: Concurrency only
Wall clock -------------------------------------->
Task A: ███ ░░░ ░░░ ███ ░░░ ░░░ ███ ░░░ ░░░
Task B: ░░░ ███ ░░░ ░░░ ███ ░░░ ░░░ ███ ░░░
Task C: ░░░ ░░░ ███ ░░░ ░░░ ███ ░░░ ░░░ ███
███ = running on the core
░░░ = waiting its turn
At every column, exactly ONE task has ███. The OS rotates the
three tasks through the single core. The user feels concurrency,
but only 1 task is actually executing at any wall-clock instant.
FOUR CORES: Concurrency + Parallelism
Wall clock -------------------------------------->
Core 0: ███ ███ ███ ███ ███ ███ ███ ███ ███ (Task A)
Core 1: ███ ███ ███ ███ ███ ███ ███ ███ ███ (Task B)
Core 2: ███ ███ ███ ███ ███ ███ ███ ███ ███ (Task C)
Core 3: ███ ███ ███ ███ ███ ███ ███ ███ ███ (Task D)
At every column, FOUR tasks have ███. They really do run at
the same wall-clock instant on different physical cores.
That's parallelism.
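The four-core picture corresponds to process-based parallelism in Python. A minimal sketch with `multiprocessing.Pool` (the function and inputs are illustrative):

```python
from multiprocessing import Pool

def square(n: int) -> int:
    # CPU work; each worker process has its own interpreter and GIL,
    # so chunks can execute at the same wall-clock instant.
    return n * n

if __name__ == "__main__":
    with Pool(processes=4) as pool:  # up to 4 tasks truly in parallel
        results = pool.map(square, range(8))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The `__main__` guard matters: on platforms that spawn workers by re-importing the module, omitting it causes infinite process creation.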
The kitchen analogy: one chef switching between three dishes is concurrent. Three chefs each cooking one dish is parallel. Three chefs each rotating across three dishes is both.
Why people confuse them and pay for it
Most "make this faster with threads" instincts are wrong because the developer mixed up the two. Adding threads to CPU-bound Python code makes it slower (the GIL serializes execution; the result is concurrency overhead with no parallel speedup). Spinning up 10,000 goroutines for CPU-bound work doesn't help past the core count. And async/await on a single-threaded event loop provides concurrency but zero parallel speedup.
Rob Pike's line sums it up: "Concurrency is not parallelism, but it enables it." Concurrency is the design: independent tasks, well-defined coordination. Parallelism is what emerges when the runtime maps those tasks onto multiple cores.
The single test
Ask: "At any given instant, how many tasks are executing CPU instructions?"
- One = concurrency only.
- More than one = parallelism. (And if the code is structured for it, concurrency is present too.)
When to reach for what
| Workload | Right choice |
|---|---|
| 10K I/O-bound connections, low CPU | Concurrency (asyncio, goroutines, virtual threads) |
| Bulk numerical compute, single dataset | Parallelism (multiprocessing, ForkJoinPool, GPU) |
| Web server: many requests, each does CPU + I/O | Both (threads/goroutines on multi-core) |
| Stream processing pipeline | Both, with explicit stages |
The trap
"I'll use threads to make this faster." If the work is CPU-bound in Python, threads make it slower: the GIL forces serialization while context-switch cost adds up. Reach for multiprocessing instead, or rewrite the hot path.
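The trap in code — a hypothetical CPU-bound function fanned out over a thread pool. The results are correct, but on CPython the GIL means the threads take turns rather than running in parallel:

```python
from concurrent.futures import ThreadPoolExecutor

def busy_sum(n: int) -> int:
    # Pure Python CPU work: holds the GIL the whole time it runs
    return sum(range(n))

with ThreadPoolExecutor(max_workers=4) as pool:
    # Four threads, but only one executes Python bytecode at a time;
    # this adds context-switch overhead with no parallel speedup.
    results = list(pool.map(busy_sum, [10_000] * 4))

print(results)  # four identical sums; correct, just not faster
```

Swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` (plus a `__main__` guard) is the usual fix for this shape of workload.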
How this changes the code
- Concurrency forces designs around coordination: shared state, channels, locks, cancellation.
- Parallelism forces designs around data dependencies: what must happen before what; can the work be split into independent shards.
The two design conversations are different. A good interviewer probes both, "how do these tasks share state?" (concurrency) and "what's the speedup with more cores?" (parallelism).
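The concurrency conversation ("how do these tasks share state?") often resolves to message passing rather than shared mutable state. A sketch using a thread-safe queue as a channel (the producer/consumer names and payloads are illustrative):

```python
import queue
import threading

def producer(q: queue.Queue) -> None:
    for item in range(5):
        q.put(item)   # hand work to the consumer; no shared mutable state
    q.put(None)       # sentinel: signals completion

def consumer(q: queue.Queue, out: list) -> None:
    while True:
        item = q.get()
        if item is None:  # sentinel seen: stop
            break
        out.append(item * 10)

q: queue.Queue = queue.Queue()
out: list = []
t1 = threading.Thread(target=producer, args=(q,))
t2 = threading.Thread(target=consumer, args=(q, out))
t1.start(); t2.start()
t1.join(); t2.join()
print(out)  # [0, 10, 20, 30, 40]
```

The queue handles the locking internally, so the coordination question is answered by the structure rather than by manual locks.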
Key points
- Concurrency = composition of independently executing tasks (a structure)
- Parallelism = simultaneous execution of multiple tasks (an outcome)
- Single-core CPU + threads = concurrency without parallelism (interleaved)
- GPU compute or multi-core SIMD = parallelism without classical concurrency
- Web server handling 10K connections = concurrency; matrix multiplication on 8 cores = parallelism
- Async/event loops give concurrency without threads: single-threaded but interleaved
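The last point can be checked directly: every task in an asyncio program reports the same OS thread id. A small sketch:

```python
import asyncio
import threading

async def report() -> int:
    await asyncio.sleep(0)        # yield, proving we resumed via the loop
    return threading.get_ident()  # OS thread running this coroutine

async def main() -> set:
    ids = await asyncio.gather(*(report() for _ in range(10)))
    return set(ids)

thread_ids = asyncio.run(main())
print(len(thread_ids))  # 1 — ten concurrent tasks, one thread
```

Ten tasks, one thread: concurrency is real, parallelism is absent.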
Tradeoffs
| Option | Pros | Cons | When to use |
|---|---|---|---|
| Concurrency without parallelism (asyncio, single-threaded event loop) | Cheap tasks, low memory per connection, no data races between awaits | One CPU-bound task blocks everything; no multi-core speedup | I/O-bound workloads, web servers, message brokers, network proxies |
| Parallelism without concurrency (data-parallel) | Scales with core count; simple mental model (split, compute, merge) | No help for I/O-bound work; process/IPC and memory-copy overhead | CPU-bound batch jobs, image processing, numerical compute, bulk transforms |
| Concurrency with parallelism (threads/goroutines on multi-core) | Uses all cores and overlaps I/O | Shared-state bugs: races, deadlocks, lock contention | Most production servers: handle many requests, use all cores |
Follow-up questions
- If Python's GIL prevents parallel threads, why use threading at all?
- Is async/await concurrent, parallel, or both?
- Can a single-threaded program be concurrent?
- Why does Rob Pike say "concurrency is not parallelism"?
Gotchas
- More threads != more parallelism: once thread count exceeds core count, the cost is context-switching without throughput gain
- asyncio is single-threaded: a CPU-bound task in an async function blocks the entire event loop
- Thread pools sized for I/O are wrong for CPU-bound work, and vice versa
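The asyncio gotcha above has a standard escape hatch: push the CPU-bound call off the event loop into a process pool. A sketch using `loop.run_in_executor` (the `crunch` function is illustrative):

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def crunch(n: int) -> int:
    # CPU-bound work; would block the event loop if awaited inline
    return sum(i * i for i in range(n))

async def main() -> int:
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # Runs in a worker process: the event loop stays free
        # to service other tasks while this computes.
        return await loop.run_in_executor(pool, crunch, 1000)

if __name__ == "__main__":
    result = asyncio.run(main())
    print(result)  # 332833500
```

For blocking I/O (rather than CPU work) inside async code, `asyncio.to_thread` is the lighter-weight alternative, since the GIL is released during the syscall.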
Common pitfalls
- Adding threads to a CPU-bound Python program: the GIL serializes them and performance gets worse
- Spinning up goroutines in a tight loop without bounded concurrency: a context-switch storm
Every production system makes this distinction. nginx is concurrent + parallel (worker processes per core, async I/O within each). Redis is concurrent but mostly single-threaded (single event loop, separate threads only for I/O syscalls).