Sync Utilities: Latch, Barrier, Semaphore, Phaser, Exchanger
Five coordination primitives in java.util.concurrent. CountDownLatch: one-shot start signal. CyclicBarrier: rendezvous N threads, reusable. Semaphore: limit concurrent access to N permits. Phaser: dynamic-party barrier. Exchanger: hand off a value between two threads.
Five primitives, five different shapes
The java.util.concurrent package ships five coordination utilities beyond the basic locks and atomics. Each one solves a different shape of "wait for something". Knowing which one fits a given problem avoids hand-rolling with wait and notify, which is where the bugs live.
| Primitive | Shape it solves |
|---|---|
| CountDownLatch | One-shot countdown to zero. Threads wait until the count hits zero, then it stays there forever. |
| CyclicBarrier | N threads rendezvous, an optional action runs, then the barrier resets and everyone goes again. |
| Semaphore | Pool of N permits. Take a permit to enter, release a permit when done. |
| Phaser | Multi-phase barrier where the number of parties can change between phases. |
| Exchanger | Two threads meet and swap values. |
CountDownLatch: wait until ready
A counter that starts at N and counts down. Any thread that calls await() blocks until the count reaches zero. Once at zero, the latch stays there forever; it cannot be reset.
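The one-shot behaviour can be checked directly. The following is a minimal sketch; the class name LatchOneShotDemo and the helper remainingAfter are made up for illustration, and only CountDownLatch itself comes from the JDK.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class LatchOneShotDemo {
    /** Counts a fresh latch down `countDowns` times and returns what is left. */
    static long remainingAfter(int initial, int countDowns) {
        CountDownLatch latch = new CountDownLatch(initial);
        for (int i = 0; i < countDowns; i++) {
            latch.countDown();   // a no-op once the count is already zero
        }
        return latch.getCount();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(remainingAfter(3, 1));   // 2
        System.out.println(remainingAfter(2, 5));   // 0: the count never goes negative

        CountDownLatch open = new CountDownLatch(1);
        open.countDown();
        // On an open latch, await returns immediately, however often it is called.
        System.out.println(open.await(0, TimeUnit.MILLISECONDS));   // true
        System.out.println(open.await(0, TimeUnit.MILLISECONDS));   // true
    }
}
```

Extra countDown calls are harmless, which is why a done gate can safely count down in a finally block.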
Two patterns cover almost every use:
- Start gate. Latch at 1. Worker threads await on it. A setup thread does the warm-up work and calls countDown exactly once. Every waiting worker unblocks at the same moment. Useful for "wait until services are warm before opening traffic".
- Done gate. Latch at N. Each of N workers calls countDown when finished. The main thread calls await and continues only when all workers are done.
When the work is already represented as CompletableFuture instances, CompletableFuture.allOf(...).join() is cleaner than a done-gate latch. Reach for CountDownLatch only when futures are not the natural shape.
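As a rough illustration of the future-based alternative, the sketch below waits for several asynchronous tasks without any latch. The class AllOfDemo and the method sumOfSquares are hypothetical names; the tasks run on the default executor used by supplyAsync.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class AllOfDemo {
    /** Squares 1..n asynchronously and returns the sum once every task is done. */
    static int sumOfSquares(int n) {
        List<CompletableFuture<Integer>> futures = IntStream.rangeClosed(1, n)
                .mapToObj(i -> CompletableFuture.supplyAsync(() -> i * i))
                .collect(Collectors.toList());

        // allOf completes when every future has completed; join() blocks until then.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

        // Every future is already complete, so these join() calls return at once.
        return futures.stream().mapToInt(CompletableFuture::join).sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(4));   // 1 + 4 + 9 + 16 = 30
    }
}
```

Unlike a done-gate latch, this version needs no manual counting, and a task that fails surfaces as an exception from join() instead of going unnoticed.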
CyclicBarrier: all of us, then go, then again
Like a latch, but reusable. N threads each call await(). Once all N are present, an optional barrier action runs once, then the barrier resets to its starting count and every thread continues. The next call to await from any party starts the next round.
This is the right primitive for iterative parallel algorithms: each thread computes its slice, everyone meets at the barrier, the merge step runs, the barrier resets, the loop continues. The barrier action sees the result of every thread's slice, so it is the natural place to combine results.
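If any waiting thread is interrupted or times out, or the barrier action itself throws an exception, the barrier "breaks": every other thread that was waiting gets a BrokenBarrierException, and the barrier stays broken until reset() is called. Production code that uses a barrier should catch that exception and decide whether to abort the computation or reset and retry.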
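The failure mode is easy to reproduce with a timed await. This is a small sketch under stated assumptions: BrokenBarrierDemo and awaitAlone are illustrative names, and the 50 ms timeout is arbitrary.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BrokenBarrierDemo {
    /** Waits on the barrier with a short timeout and reports how the wait ended. */
    static String awaitAlone(CyclicBarrier barrier) {
        try {
            barrier.await(50, TimeUnit.MILLISECONDS);
            return "tripped";
        } catch (TimeoutException e) {
            return "timeout";       // this thread gave up, which breaks the barrier
        } catch (BrokenBarrierException e) {
            return "broken";        // the barrier was broken by someone else, or earlier
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        CyclicBarrier barrier = new CyclicBarrier(2);   // second party never shows up
        System.out.println(awaitAlone(barrier));        // timeout
        System.out.println(barrier.isBroken());         // true
        System.out.println(awaitAlone(barrier));        // broken: fails immediately
        barrier.reset();                                // back to a usable state
        System.out.println(barrier.isBroken());         // false
    }
}
```

Calling reset() is only safe once every participant has stopped using the old generation, so it usually belongs in a single coordinating thread.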
Semaphore: N permits in a pool
A counter of permits. acquire() blocks until a permit is available, release() returns one. Fair mode preserves arrival order at the cost of more context switches; unfair (the default) is faster but can let a steady stream of new arrivals starve a long-waiting thread.
Three common uses:
- Pool of N resources. A connection pool, a worker pool. Acquire to borrow, release in a finally block.
- Bounded concurrent execution. "No more than ten outbound HTTP calls in flight at once."
- Signal style. A semaphore initialised at zero. One thread releases when an event happens; another thread is parked on acquire waiting for it. Other languages have a dedicated Event primitive for this; Java does not, so the semaphore-as-event idiom shows up.
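The signal-style use is the least obvious of the three, so here is a minimal sketch of it. The names SemaphoreEventDemo and waitForEvent and the payload string are invented for the example.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicReference;

class SemaphoreEventDemo {
    /** Blocks until a background thread signals, then returns what it published. */
    static String waitForEvent() {
        Semaphore event = new Semaphore(0);   // zero permits: the first acquire parks
        AtomicReference<String> payload = new AtomicReference<>();

        Thread signaller = new Thread(() -> {
            payload.set("config loaded");     // publish the data first
            event.release();                  // then signal: adds the one permit
        });
        signaller.start();

        // Parks until release() has run. acquire() is the interruptible variant.
        event.acquireUninterruptibly();
        return payload.get();
    }

    public static void main(String[] args) {
        System.out.println(waitForEvent());   // config loaded
    }
}
```

Because release() happens-before the matching acquire returns, the waiting thread is guaranteed to see everything the signaller wrote beforehand.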
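Always pair acquire with release inside a try/finally. A semaphore does not track which thread holds which permit, so it cannot reclaim one automatically: a permit leaked on an exception path stays missing until some thread calls release() to put it back, and in practice nothing knows that it should.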
Phaser: barrier with a changing number of parties
The flexible cousin of CyclicBarrier. Parties can register() and arriveAndDeregister() between phases, so the number of threads participating can grow or shrink as the algorithm progresses. The Phaser also exposes a phase number, so threads can ask "which phase are we on" and run different code per phase.
This is the right primitive when worker counts vary by phase. A common case is tree-recursive parallel work that spawns helpers for one phase and reduces to one thread in the next. For most workloads where the count is fixed up front, CyclicBarrier is simpler and just as good.
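To show how the party count and the phase number move, the sketch below drives a Phaser from a single thread using the non-blocking arrive calls. PhaserPartiesDemo and run are hypothetical names; real code would have each party on its own thread.

```java
import java.util.concurrent.Phaser;

class PhaserPartiesDemo {
    /** Returns {phase after round one, parties after round one, parties after a join}. */
    static int[] run() {
        Phaser phaser = new Phaser(3);        // three registered parties, phase 0

        phaser.arrive();                      // party 1 arrives without waiting
        phaser.arrive();                      // party 2 arrives
        phaser.arriveAndDeregister();         // party 3 arrives and leaves: phase 0 ends

        int phase = phaser.getPhase();                      // now 1
        int partiesAfterLeave = phaser.getRegisteredParties();   // now 2

        phaser.register();                    // a new helper joins for phase 1
        int partiesAfterJoin = phaser.getRegisteredParties();    // now 3

        return new int[] {phase, partiesAfterLeave, partiesAfterJoin};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("phase=" + r[0] + " afterLeave=" + r[1] + " afterJoin=" + r[2]);
        // phase=1 afterLeave=2 afterJoin=3
    }
}
```

A CyclicBarrier has no equivalent of the register and deregister calls; its party count is fixed by the constructor.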
Most teams never need Phaser. CyclicBarrier plus CompletableFuture covers almost every coordination shape that comes up in normal application code.
Exchanger: two threads, swap values
Two threads both call exchange(value), and each receives the other's value. That is essentially the entire API; the only other method is an overload of exchange that takes a timeout. Whichever thread arrives first parks until the other one shows up; once both are present, the swap happens atomically and both threads continue.
The most useful application is double-buffering. The producer fills one buffer while the consumer drains another, and when both are done they trade: the producer gets back an empty buffer to refill, the consumer gets a freshly-filled buffer to drain. The two threads never wait for each other in the middle of the work, only at the swap point at the end of each cycle.
A worked example with two buffers, A and B, swapping back and forth.
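One way to make the cycle concrete is a small runnable sketch with a fixed number of rounds. DoubleBufferDemo, run, and the integer payload are assumptions made for the example, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Exchanger;

class DoubleBufferDemo {
    /** Runs `rounds` fill/drain cycles and returns everything the consumer saw, in order. */
    static List<Integer> run(int rounds, int batchSize) {
        Exchanger<List<Integer>> exchanger = new Exchanger<>();
        List<Integer> consumed = new ArrayList<>();   // touched only by the consumer

        Thread producer = new Thread(() -> {
            List<Integer> buffer = new ArrayList<>();
            int next = 0;
            try {
                for (int r = 0; r < rounds; r++) {
                    for (int i = 0; i < batchSize; i++) buffer.add(next++);
                    buffer = exchanger.exchange(buffer);   // give full, get empty
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            List<Integer> buffer = new ArrayList<>();
            try {
                for (int r = 0; r < rounds; r++) {
                    buffer = exchanger.exchange(buffer);   // give empty, get full
                    consumed.addAll(buffer);
                    buffer.clear();                        // empty it before the next trade
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();   // join makes the consumer's writes visible here
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return consumed;
    }

    public static void main(String[] args) {
        System.out.println(run(3, 2));   // [0, 1, 2, 3, 4, 5]
    }
}
```

Only two list instances ever exist; they simply change owner at each exchange, which is what keeps allocation flat in a long-running pipeline.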
The trick is that the work between exchanges runs in parallel: while the producer fills B, the consumer drains A, and neither blocks the other until both have finished the cycle and want to trade again. That is the win over a BlockingQueue for this specific shape; with a queue the producer can only ever hand items to the consumer, never trade buffers back. Outside double-buffering, BlockingQueue is almost always a cleaner option for one-way handoff.
How to choose
The decision is almost always settled by stripping the problem down to its coordination shape and matching it to the table at the top.
| Problem in plain words | Right primitive |
|---|---|
| "Wait once for N things to finish" | CountDownLatch (done gate) |
| "Wait until something is ready, then everyone runs" | CountDownLatch (start gate) |
| "Loop forever with a synchronization point each iteration" | CyclicBarrier |
| "Limit concurrent access to a resource to at most N" | Semaphore |
| "Coordinate threads across phases where the count changes" | Phaser |
| "Two threads, hand off a value each way" | Exchanger |
When none of those fit cleanly, the answer is usually CompletableFuture (for "compute and combine") or BlockingQueue (for "one direction handoff with backpressure"). These five primitives are the sharp tools; the higher-level utilities in java.util.concurrent cover most application code more clearly.
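For the "one direction handoff with backpressure" case, a bounded queue is usually enough. The following is a sketch only: BoundedHandoffDemo, run, and the POISON end-of-stream marker are conventions invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedHandoffDemo {
    private static final int POISON = -1;   // demo-only marker for "no more items"

    /** Sends 0..items-1 through a queue of the given capacity and returns what arrived. */
    static List<Integer> run(int items, int capacity) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        List<Integer> received = new ArrayList<>();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < items; i++) {
                    queue.put(i);            // blocks while the queue is full: backpressure
                }
                queue.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        try {
            // take() blocks while the queue is empty.
            for (int v = queue.take(); v != POISON; v = queue.take()) {
                received.add(v);
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run(5, 2));   // [0, 1, 2, 3, 4]
    }
}
```

The capacity is the backpressure knob: a fast producer can get at most that many items ahead of the consumer before put() makes it wait.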
Primitives by language
- CountDownLatch (one-shot countdown to zero)
- CyclicBarrier (N parties rendezvous, then continue)
- Semaphore (N permits, fair or unfair)
- Phaser (multi-phase, dynamically registered parties)
- Exchanger (paired thread value swap)
Implementation
Worker threads block on the latch until setup is done. The setup thread counts down once and every worker unblocks. This is a common pattern for "wait until services have warmed up" before opening traffic.
```java
import java.util.concurrent.CountDownLatch;

CountDownLatch ready = new CountDownLatch(1);

// Setup thread
new Thread(() -> {
    loadConfig();
    warmCache();
    openConnections();
    ready.countDown();          // signal: setup done
}).start();

// Worker threads
for (int i = 0; i < 8; i++) {
    new Thread(() -> {
        try {
            ready.await();      // wait until the count hits 0
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // restore the flag and give up
            return;
        }
        processRequests();
    }).start();
}

// Variant: wait for N workers to finish
CountDownLatch done = new CountDownLatch(8);
for (int i = 0; i < 8; i++) {
    executor.submit(() -> {
        try { work(); } finally { done.countDown(); }
    });
}
done.await();   // main blocks until all 8 are done; throws InterruptedException
```

Each iteration, all worker threads compute their share and hit the barrier, the optional barrier action runs once, then everyone goes around again. Used for parallel iterative algorithms (BFS, gradient descent, simulation steps).
```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

int parties = 4;
CyclicBarrier barrier = new CyclicBarrier(parties, () -> {
    // Runs once after all 4 arrive, before any is released
    mergeIntoGlobalGrid();
    System.out.println("Step complete");
});

for (int i = 0; i < parties; i++) {
    int worker = i;
    new Thread(() -> {
        for (int step = 0; step < 100; step++) {
            computePartition(worker, step);
            try {
                barrier.await();   // wait for the other 3
            } catch (InterruptedException | BrokenBarrierException e) {
                return;            // barrier is broken: stop this worker
            }
        }
    }).start();
}
```

Limit how many threads can be in a section at once. Connection pools, rate limiters, "no more than 10 outbound HTTP calls in flight". Fair semaphore preserves arrival order; unfair is faster but can starve.
```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

Semaphore connections = new Semaphore(10, true);   // 10 permits, fair

void useDb() {
    connections.acquireUninterruptibly();
    try {
        try (Connection c = pool.borrow()) {
            query(c);
        }
    } finally {
        connections.release();   // always return the permit
    }
}

// Try-acquire with timeout for latency-sensitive code
void tryUseDb() throws InterruptedException {
    if (connections.tryAcquire(50, TimeUnit.MILLISECONDS)) {
        try { /* ... */ } finally { connections.release(); }
    } else {
        throw new RejectedExecutionException("DB busy");
    }
}
```

Like CyclicBarrier but parties can register and deregister between phases. Useful when worker count changes between steps (e.g., one task spawns subtasks that join the next phase).
```java
import java.util.concurrent.Phaser;

Phaser phaser = new Phaser(1);   // main registers itself as the first party

for (int i = 0; i < 4; i++) {
    phaser.register();           // each worker joins before its thread starts
    new Thread(() -> {
        for (int step = 0; step < 5; step++) {
            doWork(step);
            phaser.arriveAndAwaitAdvance();   // wait for the other parties
        }
        phaser.arriveAndDeregister();         // leave for good
    }).start();
}

phaser.arriveAndDeregister();    // main leaves; only the 4 workers remain
```

Two threads meet at exchange(). Each gets the other's value. Useful for double-buffering: producer fills buffer A while consumer drains buffer B, swap when both done.
```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Exchanger;

Exchanger<List<String>> exchanger = new Exchanger<>();

Thread producer = new Thread(() -> {
    List<String> buffer = new ArrayList<>();
    while (true) {
        fill(buffer);
        try {
            buffer = exchanger.exchange(buffer);   // swap with consumer
        } catch (InterruptedException e) { return; }
    }
});

Thread consumer = new Thread(() -> {
    List<String> buffer = new ArrayList<>();
    while (true) {
        try {
            buffer = exchanger.exchange(buffer);   // get producer's full buffer
        } catch (InterruptedException e) { return; }
        drain(buffer);                             // must leave the buffer empty
    }
});

producer.start();
consumer.start();
```

Key points
- CountDownLatch is one-shot. Once it hits zero, it stays zero. Cannot reset.
- CyclicBarrier is reusable. After N threads arrive, it triggers an optional action and resets.
- Semaphore.acquire() blocks for a permit. Use it for connection pools, rate limiters, bounded concurrent execution.
- Phaser is the flexible cousin of CyclicBarrier: parties can register and deregister mid-flight.
- Exchanger pairs exactly two threads. Each calls exchange(value) and gets the other's value.
Follow-up questions
- When to use CountDownLatch vs CyclicBarrier?
- What does 'fair' mean for Semaphore?
- Phaser vs CyclicBarrier?
- Are these primitives ever the wrong tool?
Gotchas
- CountDownLatch cannot be reset. For a reusable variant, use CyclicBarrier or Phaser.
- Forgetting to release a Semaphore permit (no try/finally) leaks it: the semaphore never reclaims permits on its own.
- If a CyclicBarrier's barrier action throws, the barrier breaks and every other waiting thread gets a BrokenBarrierException.
- Unfair Semaphore can starve a thread under sustained contention.
- A Phaser does not track which thread owns a registration. Calling arriveAndDeregister twice for the same party removes two parties and desyncs the count.