Goroutines, Spawning, Scheduling, Lifecycle

In one line

Goroutines are user-space tasks that the Go runtime's M:P:G scheduler multiplexes onto a small pool of OS threads. Spawn one with `go func() {...}()`. Each starts with a stack of about 2KB, so creation is cheap and millions are practical given enough memory. There is no public state machine; a goroutine's lifetime is 'until the function returns.'

What goroutines actually are

A goroutine is a lightweight task, not an OS thread. The Go runtime maintains a small pool of OS threads (M's) and multiplexes goroutines (G's) onto them via logical processors (P's). When a goroutine blocks on a channel or lock, the runtime parks it and schedules another goroutine on the same OS thread. When a goroutine blocks in a syscall, its thread blocks with it, so the runtime hands that thread's P to another thread and the remaining goroutines keep running.

Note

Why this matters: One OS thread = one kernel-scheduled entity, one large stack, one expensive context switch. One goroutine = one task struct, one tiny stack, one cheap user-space switch. Spawning 100K goroutines is normal in production Go services. Spawning 100K threads would exhaust memory or hit thread limits on most systems.
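To make the scale claim concrete, here is a minimal sketch that starts 100,000 goroutines and waits for all of them. The helper name spawnAndCount and the count are illustrative choices, not part of any library.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// spawnAndCount is a hypothetical helper: it starts n goroutines, each of
// which increments a shared counter once, and returns the final count.
func spawnAndCount(n int) int64 {
	var (
		wg    sync.WaitGroup
		count int64
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&count, 1) // atomic: many goroutines write concurrently
		}()
	}
	wg.Wait() // block until every goroutine has called Done
	return atomic.LoadInt64(&count)
}

func main() {
	fmt.Println("completed:", spawnAndCount(100_000))
}
```

Each goroutine here finishes almost immediately, so few are alive at once; the point is only that creating this many is unremarkable.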

The M:P:G mental model

G (goroutine): created by each go statement. Has its own stack, instruction pointer, and function context.
P (processor): a logical processor. Holds a local run queue of G's. Number of P's = GOMAXPROCS (by default, the number of logical CPUs).
M (machine): an OS thread. Acquires a P and runs G's from that P's queue. When a G blocks in a syscall, its M may release the P to another M.

Work-stealing: an idle P steals half the queue from a busy P. Keeps load balanced even with imbalanced workloads.

Tip

What this means in practice

CPU-bound: GOMAXPROCS = NumCPU. Spawning more goroutines than P's just adds scheduling overhead.
I/O-bound: spawn lots; they park on the I/O, freeing M's for other goroutines. This is where Go's concurrency model shines.
Mixed: same as I/O-bound, let the runtime juggle.

The lifecycle truth

A goroutine has no public state that a caller can query. Internally it is running, runnable, or parked. From the caller's perspective:

Started: go f(), runs eventually (no guarantee of immediate execution).
Running: doing work. May be parked-and-resumed many times.
Done: f returns. Goroutine vanishes. There is no way to get a return value directly; send the result over a channel. sync.WaitGroup only signals completion and carries no value.
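Since a goroutine cannot return a value to its caller, a common approach is to hand the result back on a channel. The sketch below assumes a hypothetical helper named sumAsync; the buffered channel is a design choice so the goroutine can finish even if nobody ever receives.

```go
package main

import "fmt"

// sumAsync is a hypothetical helper that computes a sum in a goroutine
// and delivers the result on a channel.
func sumAsync(nums []int) <-chan int {
	out := make(chan int, 1) // buffer of 1: the send never blocks
	go func() {
		total := 0
		for _, n := range nums {
			total += n
		}
		out <- total
	}()
	return out
}

func main() {
	result := <-sumAsync([]int{1, 2, 3, 4}) // blocks until the goroutine sends
	fmt.Println("sum:", result)             // prints "sum: 10"
}
```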

Important: a goroutine cannot be stopped from outside. The goroutine must voluntarily return. The cancellation pattern: pass a context.Context, have the goroutine select on ctx.Done(), and return when cancelled.

Warning

The leak that bites everyone: A goroutine blocked on a channel send/receive with no cancellation path lives forever. A runtime.NumGoroutine() count that keeps growing under load is the telltale sign. The fix: every goroutine must have an exit story, whether that is context cancellation, a channel close, or natural completion. Always know how every spawned goroutine will exit.

Primitives by language

go func() { ... }()
runtime.GOMAXPROCS / NumGoroutine / Gosched
sync.WaitGroup (joining)
context.Context (cancellation)

Implementation

Basics, go keyword + WaitGroup for join

go func() {...}() launches a goroutine that runs concurrently with the caller. To wait for it, use sync.WaitGroup: call Add before the go statement, Done inside the goroutine, and Wait to block until all are finished.

package main

import (
    "fmt"
    "sync"
)

func main() {
    var wg sync.WaitGroup
    for i := 0; i < 5; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            fmt.Printf("worker %d running\n", id)
        }(i)                     // pass i as an argument to avoid the closure-capture trap
    }
    wg.Wait()                    // block until all 5 finish
}

Closure capture pitfall

A classic trap: launching goroutines in a loop without passing the loop variable as an argument. Before Go 1.22, all goroutines share a single i and most print the same final value. Go 1.22+ gives each iteration its own variable (for modules that declare go 1.22 or later in go.mod), but explicit capture remains the portable habit.

package main

import (
    "fmt"
    "sync"
)

func main() {
    // BROKEN (pre-Go 1.22): all goroutines share i
    for i := 0; i < 5; i++ {
        go func() {
            fmt.Println(i)         // probably prints 5,5,5,5,5
        }()
    }

    // FIXED: pass the loop variable as an argument
    var wg sync.WaitGroup
    for i := 0; i < 5; i++ {
        wg.Add(1)
        go func(id int) {           // id is per-goroutine
            defer wg.Done()
            fmt.Println(id)        // 0,1,2,3,4 (in some order)
        }(i)
    }
    wg.Wait()
}

Inspecting the runtime

runtime.NumGoroutine() is the first diagnostic for leaks. GOMAXPROCS is the parallelism cap (default = number of logical CPUs). Gosched() voluntarily yields the processor. It is rarely needed since Go 1.14 added asynchronous preemption, but on older versions it was useful in tight loops without blocking calls.

package main

import (
    "fmt"
    "runtime"
)

func main() {
    fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))   // 0 queries without changing the value
    fmt.Println("NumCPU:", runtime.NumCPU())
    fmt.Println("Goroutines:", runtime.NumGoroutine())   // typically 1 (main)

    go func() {
        for i := 0; i < 1_000_000; i++ {
            // tight loop with no blocking operations: before Go 1.14 the scheduler could not preempt it
            if i%100_000 == 0 {
                runtime.Gosched()                          // explicit yield
            }
        }
    }()
    fmt.Println("After spawn:", runtime.NumGoroutine())   // typically 2
}

Concurrent fan-out, bounded parallelism

Spawning a goroutine per item works for small N. For large N, bound concurrency with a buffered channel as a semaphore.

package main

import "sync"

// process is a placeholder for the real per-item work.
func process(i int) {}

func processAll(items []int, maxConcurrent int) {
    sem := make(chan struct{}, maxConcurrent)
    var wg sync.WaitGroup

    for _, item := range items {
        wg.Add(1)
        sem <- struct{}{}             // acquire a slot; blocks while maxConcurrent are in flight
        go func(i int) {
            defer wg.Done()
            defer func() { <-sem }()  // release the slot
            process(i)
        }(item)
    }
    wg.Wait()
}

func main() {
    processAll([]int{1, 2, 3, 4, 5}, 2)   // example call: at most 2 items in flight
}

Key points

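•Goroutines are NOT OS threads; they are multiplexed onto OS threads, of which at most GOMAXPROCS execute Go code at once (threads blocked in syscalls do not count toward that limit)
•Stack starts at ~2KB and grows on demand, up to a default limit of 1GB on 64-bit systems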
•M:P:G model: M = OS thread, P = logical processor, G = goroutine
•Work stealing: idle P's steal goroutines from busy P's run queues
•No stop/kill, goroutines exit only when their function returns

Follow-up questions

▸What's M:P:G scheduling?

M = machine (OS thread). P = processor (logical, count = GOMAXPROCS). G = goroutine. Each P has a run queue of G's. An M holding a P picks a G from that P's queue and executes it. When a P's queue is empty, it steals G's from other P's. The runtime parks G's that block on channels or locks without blocking the underlying M. A blocking syscall does block its M, so the runtime hands the P to another M to keep the other G's running.

▸How are goroutines cheaper than OS threads?

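(1) Stack: about 2KB initially vs typically 1MB or more reserved per OS thread; (2) Context switch: user-space with no kernel transition, on the order of hundreds of ns vs several μs; (3) Scheduling: the Go scheduler is optimized for many goroutines, while the OS scheduler is optimized for comparatively few threads. Net result: 100K goroutines is fine; 100K threads would exhaust memory or hit thread limits on most systems.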

▸Can a goroutine outlive main?

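No. Other goroutines run only until main returns. Once main returns, the program exits and all remaining goroutines are terminated abruptly (no cleanup, and their deferred functions do not run). Use sync.WaitGroup or a done channel to wait for them before main returns; context.Context signals cancellation but does not wait.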

▸How is a goroutine stopped?

Not from outside. The goroutine itself must return. Pass a `context.Context` and have the goroutine select on `ctx.Done()`; when the context is cancelled, the goroutine exits voluntarily. There's no `goroutine.Stop()`.

Gotchas

!Closure capture in loops (pre-Go 1.22): always pass loop vars as args
!wg.Add() must be BEFORE 'go ...' or it races with wg.Wait()
!Spawning unbounded goroutines = potential OOM under load
!Tight pure-CPU loops without yields could starve the scheduler before Go 1.14
