Process vs Thread vs Goroutine
A process owns memory and resources; threads share the process's address space and are scheduled by the OS; goroutines are user-space tasks multiplexed onto a small pool of OS threads by the Go runtime.
Diagram
Three units, three cost profiles
The diagram above shows the containment chain: a process holds threads; a thread can run many goroutines or virtual threads. The cost picture in numbers:
| | Process | Thread | Goroutine / Virtual Thread |
|---|---|---|---|
| Memory per unit | ~10 MB | ~1 MB | ~2 KB |
| Spawn cost | ~ms | ~μs | ~μs |
| Switch cost | ~10 μs | ~5 μs | ~200 ns |
| Memory isolation | full | shares with siblings | shares with siblings |
| Max practical | thousands | ~10K | millions |
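The cost difference is the whole story. A web server that spawns one OS thread per request tops out at roughly 10K concurrent connections, because stack memory and kernel scheduling overhead grow with every thread. The same server using goroutines or virtual threads can hold on the order of a million mostly idle connections on the same hardware.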
Why it matters
Picking the wrong unit costs orders of magnitude:
- A Python script using threads for CPU-bound work runs slower than the single-threaded version (GIL serialises bytecode).
- A Java service that wraps every request in a platform thread runs out of memory at ~10K concurrent requests.
- A Go service that spawns goroutines unboundedly under load can OOM (cheap to spawn ≠ free; the rate must still be bounded).
Interviewers ask about this because the answer reveals whether the candidate understands the runtime in use or just memorised API names.
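The third bullet carries over to Java: virtual threads are cheap, but unbounded submission still piles up work in memory. A minimal sketch of bounding it, assuming a hypothetical BoundedSpawn class and a 5 ms sleep standing in for real I/O (Java 21+). The Semaphore makes the submitter block once the limit is reached, which is the backpressure.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: cap in-flight tasks even though each one is cheap to spawn.
class BoundedSpawn {

    // Submits `tasks` short blocking jobs but admits at most `limit` at a time.
    // Returns the highest number of jobs observed running concurrently.
    static int runBounded(int tasks, int limit) {
        Semaphore permits = new Semaphore(limit);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                permits.acquireUninterruptibly(); // submitter waits here when `limit` jobs are running
                executor.execute(() -> {
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(Duration.ofMillis(5)); // stand-in for real I/O
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release(); // free the slot only after the job is done
                    }
                });
            }
        } // close() waits for the remaining jobs
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + runBounded(200, 10));
    }
}
```

The same shape works in Go with a buffered channel used as a semaphore; the point is that the limit lives at the spawn site, not inside the task.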
The M:N trick
Goroutines and virtual threads use M:N scheduling: M user-space tasks run on top of N OS threads. Whenever a task blocks (waiting on a channel, a lock, a syscall), the runtime parks it and reuses the OS thread to run another task. When the blocked task becomes runnable again, the runtime picks any available OS thread to resume it.
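The parking behaviour is easy to observe from Java. The sketch below, with a hypothetical ParkingDemo class and arbitrary task counts, runs many blocking sleeps on virtual threads and measures wall-clock time (Java 21+). If each sleep held an OS thread for its full duration on a small pool, the run would take far longer than a single sleep.

```java
import java.time.Duration;
import java.util.concurrent.Executors;

// Hypothetical demo: blocking calls park the virtual thread, not the carrier.
class ParkingDemo {

    // Runs `tasks` blocking sleeps of `sleepMillis` each, one virtual thread per task,
    // and returns the elapsed wall-clock time in milliseconds.
    static long runBlockingTasks(int tasks, long sleepMillis) {
        long start = System.nanoTime();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    // The virtual thread unmounts here; its carrier is free to run other tasks.
                    Thread.sleep(Duration.ofMillis(sleepMillis));
                    return null;
                });
            }
        } // close() waits for every submitted task to finish
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long elapsed = runBlockingTasks(1_000, 100);
        System.out.println("1000 x 100 ms sleeps took " + elapsed + " ms");
    }
}
```

Run sequentially, 1,000 sleeps of 100 ms would take 100 seconds; the exact figure printed here depends on the machine, so none is quoted.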
| Without M:N (thread-per-request) | With M:N (goroutine-per-request) |
|---|---|
| 10K requests = 10K OS threads | 10K requests = 10K goroutines on ~8 OS threads |
| ~10 GB of RAM just for stacks | ~20 MB of RAM |
| Most threads parked in kernel | Goroutines parked in user space |
| Kernel scheduler thrashes | Runtime keeps cores busy |
That's why "thread per request" is back in fashion via virtual threads (Java 21+). The model is the same; the cost is finally manageable.
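The RAM figures in the comparison are simple multiplication. As a back-of-the-envelope check, using the approximate per-unit stack sizes quoted earlier (these are round numbers, not measurements; the class and constant names are illustrative):

```java
// Back-of-the-envelope stack budget using the approximate sizes from the cost table.
class StackBudget {
    static final long THREAD_STACK_BYTES = 1_000_000; // ~1 MB per OS thread (approximation)
    static final long GOROUTINE_STACK_BYTES = 2_000;  // ~2 KB initial goroutine stack (approximation)

    static long totalBytes(long tasks, long bytesPerTask) {
        return tasks * bytesPerTask;
    }

    public static void main(String[] args) {
        long tasks = 10_000;
        // 10,000 x 1 MB = 10,000 MB, i.e. ~10 GB
        System.out.println("OS threads: " + totalBytes(tasks, THREAD_STACK_BYTES) / 1_000_000 + " MB");
        // 10,000 x 2 KB = 20 MB
        System.out.println("Goroutines: " + totalBytes(tasks, GOROUTINE_STACK_BYTES) / 1_000_000 + " MB");
    }
}
```

Real numbers vary: thread stacks are reserved virtual memory that is only partly committed, and goroutine stacks grow on demand, so treat this as an upper-level estimate of the gap rather than a prediction.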
Python is different
The CPython interpreter has a Global Interpreter Lock. Only one Python thread runs Python code at a time, regardless of core count. Threads still help for I/O (the GIL is released during blocking syscalls, NumPy ops, etc.), but CPU-bound parallelism requires separate processes (multiprocessing) or a rewrite of the hot path in C/Rust. asyncio is the third option, single-threaded cooperative concurrency.
When to reach for what
- CPU-bound work, Java/Go: pool of OS threads / goroutines sized to CPU count.
- I/O-bound work, Java: virtual threads (or async for older Java).
- I/O-bound work, Python: asyncio for high concurrency, threading for moderate.
- CPU-bound work, Python: multiprocessing or rewrite hot path in C/Rust.
- Hard isolation (crash containment, security boundary): separate processes.
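For the CPU-bound Java case in the list above, a minimal sketch of a pool sized to the core count. The CpuBoundPool class and the summation workload are illustrative assumptions; any pure-compute task splits the same way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: CPU-bound work on a platform-thread pool sized to the core count.
class CpuBoundPool {

    // Sums 1..n by splitting the range into one chunk per worker thread.
    static long parallelSum(long n) {
        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            long chunk = (n + workers - 1) / workers; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                long from = w * chunk + 1;
                long to = Math.min(n, (w + 1) * chunk);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (long i = from; i <= to; i++) sum += i; // empty when from > to
                    return sum;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) total += part.get();
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel sum failed", e);
        } finally {
            pool.shutdown(); // every pool needs an owner that shuts it down
        }
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(1_000_000)); // prints 500000500000
    }
}
```

More threads than cores would add switch cost without adding throughput, which is why this case uses a fixed pool rather than a thread per task.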
The most expensive bug
A leaked goroutine or thread doesn't crash the program; it slowly drains memory and file descriptors until something else does. Always know how every spawned concurrent task will exit.
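One way to make the exit path explicit in Java is to treat interruption as the shutdown signal and have the owner join the thread. A minimal sketch, assuming a hypothetical StoppableWorker class with a queue-draining loop as the stand-in workload:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical worker with a defined exit path: interrupt it, then join it.
class StoppableWorker {
    private final BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
    private final AtomicInteger processed = new AtomicInteger();
    private final Thread thread = new Thread(this::run, "stoppable-worker");

    void start() { thread.start(); }

    void send(String message) { inbox.add(message); }

    int processed() { return processed.get(); }

    private void run() {
        try {
            while (true) {
                inbox.take();                // blocks until a message arrives or the thread is interrupted
                processed.incrementAndGet(); // stand-in for real work
            }
        } catch (InterruptedException e) {
            // Interrupt is the shutdown signal: leave the loop and let the thread end.
        }
    }

    // Signals the worker to exit and waits up to timeoutMillis. Returns true if it has terminated.
    boolean stop(long timeoutMillis) {
        thread.interrupt();
        try {
            thread.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        StoppableWorker worker = new StoppableWorker();
        worker.start();
        worker.send("hello");
        System.out.println("terminated: " + worker.stop(1_000));
    }
}
```

The Go equivalent is a goroutine that selects on ctx.Done(). In both cases the leak-free version is the one where the code that starts the task also owns a way to stop it and confirms that it stopped.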
Primitives by language
- java.lang.Thread (platform thread, ~1 MB stack)
- Thread.ofVirtual() (Java 21+, M:N scheduled)
- Runnable / Callable
- Executors.newVirtualThreadPerTaskExecutor()
Implementations
A platform thread is a thin wrapper around an OS thread (~1 MB stack, kernel-scheduled). A virtual thread is a JVM-managed user-space thread that is mounted on one of a small pool of carrier threads while it runs and unmounted when it blocks. The API is the same and the cost is far lower, so spawning a million is realistic.
```java
class ThreadKinds {
    public static void main(String[] args) throws InterruptedException {
        // Platform thread, backed by an OS thread
        Thread platform = new Thread(() -> {
            System.out.println("Platform: " + Thread.currentThread());
        });
        platform.start();
        platform.join();

        // Virtual thread, JVM-managed (Java 21+)
        Thread virtual = Thread.ofVirtual().start(() -> {
            System.out.println("Virtual: " + Thread.currentThread());
        });
        virtual.join();
    }
}
```

newVirtualThreadPerTaskExecutor() makes the thread-per-request model viable again. Each task gets its own virtual thread; blocking calls park the virtual thread without blocking a carrier.
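```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.stream.IntStream;

class MillionSleepers {
    public static void main(String[] args) {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            IntStream.range(0, 1_000_000).forEach(i ->
                executor.submit(() -> {
                    Thread.sleep(Duration.ofSeconds(1)); // parks the virtual thread only
                    return i;
                })
            );
        } // auto-closed: waits for all tasks to finish
    }
}
```

Key points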
- Process: isolated address space, expensive context switch (~tens of μs), IPC required to share data
- OS thread: ~1 MB stack, context switch ~1–10 μs, kernel-scheduled
- Goroutine: ~2 KB initial stack (grows), context switch ~hundreds of ns, runtime-scheduled
- Java virtual threads (21+): JVM-managed, M:N model, millions per JVM
- Python threads share the GIL, only one executes Python bytecode at a time
- Prefer processes for CPU-bound parallelism in Python; threads for I/O
Tradeoffs
| Option | When to use |
|---|---|
| OS Thread (Java platform / Python threading.Thread) | CPU-bound work in Java; I/O-bound work in Python |
| Virtual Thread (Java 21+) | I/O-bound, high concurrency, thread-per-request servers |
| Goroutine | Default unit of concurrency in Go |
| asyncio.Task / Process | asyncio for I/O-heavy services; multiprocessing for CPU-bound Python |
Follow-up questions
- How many goroutines can a Go program run?
- What's the GIL and why does it matter?
- Difference between virtual threads and goroutines?
- Why does Go limit OS threads but not goroutines?
- Can a goroutine outlive the function that started it?
Gotchas
- Goroutines started in a loop without context cancellation are the #1 source of leaks in production Go services
- On Java 21 to 23, virtual threads pin to their carrier inside synchronized blocks (JDK 24 lifts this restriction); on those versions prefer ReentrantLock for I/O-heavy code
- Python threads do NOT give CPU parallelism; measure, and switch to multiprocessing if compute-bound
- Mixing asyncio and threading is a footgun; use asyncio.to_thread() for blocking calls
- runtime.NumGoroutine() growing unbounded over time = leak; alert on it
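The ReentrantLock advice from the list above looks like this in practice. This is a sketch, with a hypothetical LockedCounter class; the critical section here is trivial, and the pinning concern only bites when the code inside the lock blocks (Java 21+ for the virtual-thread executor).

```java
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: guard shared state with ReentrantLock instead of synchronized,
// so a virtual thread that blocks inside the critical section can unmount from its carrier.
class LockedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private long value;

    void increment() {
        lock.lock();
        try {
            value++; // real code might do blocking I/O here
        } finally {
            lock.unlock(); // always release in finally
        }
    }

    long get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    // Increments one shared counter from `tasks` virtual threads and returns the final value.
    static long countWithVirtualThreads(int tasks) {
        LockedCounter counter = new LockedCounter();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.execute(counter::increment);
            }
        } // close() waits for all increments
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithVirtualThreads(1_000)); // prints 1000
    }
}
```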
Common pitfalls
- Confusing concurrency (multiple tasks in progress) with parallelism (multiple tasks running simultaneously)
- Assuming Python threads parallelize CPU work. They don't; the GIL prevents it
- Using multiprocessing for I/O, where process overhead dwarfs the I/O wait
APIs worth memorising
- Java: Thread.ofVirtual(), Executors.newVirtualThreadPerTaskExecutor()
- Python: threading.Thread, multiprocessing.Process, asyncio.create_task(), concurrent.futures
- Go: go keyword, runtime.GOMAXPROCS, runtime.NumGoroutine, context.Context
These primitives show up in every modern Java/Go service. Spring Boot 3.2+ supports virtual threads via spring.threads.virtual.enabled=true (opt-in, not default). Go's net/http spawns a goroutine per connection. Python web servers (FastAPI on uvicorn, Django on gunicorn-async) use asyncio for I/O concurrency.