ConcurrentHashMap
Java's general-purpose concurrent map. Lock-free reads, CAS-based and per-bin locking on writes. Use computeIfAbsent and merge for atomic compound operations. Far better than synchronized HashMap or Collections.synchronizedMap for everything except trivial workloads.
What it is
ConcurrentHashMap (CHM) is the default thread-safe map in Java. Other options exist (Collections.synchronizedMap, Hashtable, hand-rolled sharded maps), but in modern code the answer is almost always CHM.
A picture of the internals
A hash table is a row of buckets. Each bucket holds the entries whose hash lands there. In a ConcurrentHashMap every bucket has its own tiny lock that gets taken only when a write actually touches that bucket and released immediately after. Reads do not lock anything at all.
Pre-Java-8 CHM used a fixed number of segments (16 by default), each with one lock that covered many buckets at once. Java 8 replaced that with one lock per bucket, which is far finer-grained. The segment-level coarse lock is gone; the only contention is between two writes that hash to the same bucket.
Two practical consequences fall out of this design.
| Operation | Locking | Scaling |
|---|---|---|
| Reads (get, containsKey) | No locks at all | Scale linearly with cores; readers never block each other or block writers |
| Writes to different buckets | Each takes its own bucket lock | Run truly in parallel; the locks are independent |
| Writes to the same bucket | Both contend on that one bucket lock | Serialise against each other only; everything else continues |
This is why CHM scales the way it does in practice. A write to "alice" does not block a write to "bob" because they live in different buckets. Reads of either key do not block anything at all. Contention only appears when two writers hash to the same bucket, which on a well-sized table is rare.
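That independence is easy to exercise: many threads inserting disjoint keys contend on almost no shared locks, and every insert survives. A minimal sketch (class and method names are illustrative):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelWrites {
    // Each thread inserts its own range of keys; with disjoint keys the
    // per-bucket locks rarely collide, so the puts run mostly in parallel.
    static ConcurrentHashMap<Integer, Integer> fill(int threads, int perThread)
            throws InterruptedException {
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) map.put(base + i, base + i);
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return map;
    }

    public static void main(String[] args) throws InterruptedException {
        // 8 threads x 10_000 disjoint keys: no lost updates, no corruption.
        System.out.println(fill(8, 10_000).size()); // 80000
    }
}
```

A plain HashMap under the same load can lose entries or corrupt its table; CHM absorbs it without a single coarse lock.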
The essentials
Three method families matter:
Single-call atomics: get, put, remove, containsKey. Each is thread-safe on its own. Compound operations (check-then-put) are not.
Atomic compounds: putIfAbsent, computeIfAbsent, computeIfPresent, compute, merge, replace. These are the methods to reach for instead of if (!map.containsKey(k)) map.put(...). Same atomicity, half the lines, no race.
Bulk operations: forEach, reduce, search with a parallelism threshold. Useful for monitoring scans across very large maps.
The two big patterns
Per-key counters via merge:

```java
counts.merge(key, 1L, Long::sum);
```

Atomic; no AtomicLong wrapper needed for plain long counters.
Lazy per-key initialisation via computeIfAbsent:

```java
sessions.computeIfAbsent(userId, id -> openSession(id));
```

The loader runs at most once per miss; concurrent callers block on the bin lock and then get the cached value.
Together, those two cover most "I need a thread-safe map" use cases.
What to watch out for
computeIfAbsent holds the bin's lock while the function runs. If the function is slow (a network call, a heavy compute), everything on that bin serialises. For slow loaders, store a CompletableFuture in the map instead, so the bin lock is released immediately and concurrent waiters share the in-flight future.
Iterators are weakly consistent. For a snapshot, copy the map first.
size() is approximate. An exact count would require locking the whole map (which CHM does not expose), so the API discourages depending on exact size.
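Both caveats fit in a few lines: copying into a plain HashMap yields a true point-in-time snapshot, and mappingCount() is the long-returning alternative to size():

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SnapshotDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Long> live = new ConcurrentHashMap<>();
        live.put("alice", 1L);
        live.put("bob", 2L);

        // The copy is a point-in-time snapshot; later mutations don't touch it.
        Map<String, Long> snapshot = new HashMap<>(live);
        live.put("carol", 3L);

        System.out.println(snapshot.size());     // 2: unaffected by the later put
        System.out.println(live.mappingCount()); // 3, as a long
    }
}
```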
When to reach for something else
CHM is the right choice for almost every concurrent-map use case in application code. The exceptions:
With only one writer thread, a regular HashMap plus a volatile reference to it (swapped on update) can be faster.
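A sketch of that single-writer pattern (the class name is illustrative): the writer builds a fresh copy and publishes it through a volatile field, so readers see either the old map or the new one, never a half-built table:

```java
import java.util.HashMap;
import java.util.Map;

public class SingleWriterMap<K, V> {
    // volatile publish: readers always see a fully constructed map.
    private volatile Map<K, V> current = new HashMap<>();

    // Called only from the single writer thread.
    public void put(K key, V value) {
        Map<K, V> next = new HashMap<>(current); // copy the old map
        next.put(key, value);
        current = next;                          // atomic reference swap
    }

    // Safe from any thread; no locking on the read path.
    public V get(K key) {
        return current.get(key);
    }
}
```

Reads are plain HashMap lookups with no synchronisation beyond the volatile load; each write pays a full copy, which is why this only wins with a single, infrequent writer.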
For prefix queries, a tree fits better (CHM is not ordered). ConcurrentSkipListMap is the lock-free ordered alternative.
If reads vastly outnumber writes and writes are batchable, copy-on-write (CopyOnWriteArrayList-style) avoids any synchronisation on the read path. CHM is still very fast at reads, so the win is marginal.
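The copy-on-write trade is easy to see with CopyOnWriteArrayList: an iterator walks the array that existed when it was created, so concurrent writes never throw ConcurrentModificationException, but every write copies the whole backing array:

```java
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

public class CowDemo {
    public static void main(String[] args) {
        CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<>();
        list.add("a");
        list.add("b");

        Iterator<String> it = list.iterator(); // pins the current array
        list.add("c");                         // write makes a fresh copy

        int seen = 0;
        while (it.hasNext()) { it.next(); seen++; }
        System.out.println(seen);        // 2: the iterator never sees "c"
        System.out.println(list.size()); // 3
    }
}
```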
For everything else, the answer is CHM.
Primitives (Java)
- ConcurrentHashMap.put / get / remove (atomic per call)
- computeIfAbsent / computeIfPresent / compute / merge (atomic compound)
- putIfAbsent / replace / remove(key, value) (atomic checks)
- newKeySet (concurrent set view backed by CHM)
- forEach / reduce / search (parallel bulk operations)
Implementation
computeIfAbsent is the single most important CHM method. It runs the function only if the key is missing, atomically. Use it instead of containsKey + put.
```java
ConcurrentHashMap<String, AtomicLong> counts = new ConcurrentHashMap<>();

// BAD: check-then-act, two threads can both insert
if (!counts.containsKey(key)) {
    counts.put(key, new AtomicLong());
}
counts.get(key).incrementAndGet();

// GOOD: atomic compound
counts.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();

// Even cleaner with merge for plain counters:
ConcurrentHashMap<String, Long> simple = new ConcurrentHashMap<>();
simple.merge(key, 1L, Long::sum);
```

Concurrent calls for the same missing key only run the loader once. The function holds the bin's lock for the duration, so the loader must be fast. For slow loaders, use a separate single-flight library.
```java
ConcurrentHashMap<String, User> userCache = new ConcurrentHashMap<>();

User getUser(String id) {
    return userCache.computeIfAbsent(id, k -> userRepo.fetch(k));
}

// WARNING: keep the function fast. While it runs, other puts to the
// same bin block. For multi-second loaders, do this instead:

ConcurrentHashMap<String, CompletableFuture<User>> inflight = new ConcurrentHashMap<>();

CompletableFuture<User> getUserFast(String id) {
    return inflight.computeIfAbsent(id, k ->
        CompletableFuture.supplyAsync(() -> userRepo.fetch(k))
    );
}
```

ConcurrentHashMap.newKeySet() is the standard concurrent hash set; the classic use is the visited set in a parallel BFS or web crawler. add(x) returns true if x was new, atomic by construction.
```java
Set<String> visited = ConcurrentHashMap.newKeySet();

void crawl(String url) {
    if (visited.add(url)) { // returns true only the first time
        for (String link : fetch(url)) {
            pool.submit(() -> crawl(link));
        }
    }
}
```

forEach, search, reduce come in parallel variants that take a parallelism threshold. Below the threshold, they run sequentially. Useful for maps with millions of entries that need a parallel scan without setting up streams.
```java
ConcurrentHashMap<String, Long> traffic = new ConcurrentHashMap<>();

// Run in parallel when the map has at least 10K entries.
// reduce() takes no basis argument; the primitive variant reduceToLong does.
long total = traffic.reduceToLong(10_000L, (k, v) -> v, 0L, Long::sum);

// Parallel forEach with side effects (use with care)
traffic.forEach(10_000L, (k, v) -> {
    if (v > THRESHOLD) alert(k, v);
});
```

Key points
- Reads are completely lock-free. Writes lock per-bin (the linked list or tree at one hash bucket).
- Java 8+ replaced segment locks with per-bin synchronisation + CAS. Resize is incremental and cooperative: threads that touch the map help transfer bins.
- containsKey + put is racy. Use putIfAbsent or computeIfAbsent.
- size() is approximate under concurrent mutation. mappingCount() returns long for huge maps.
- Iterators are weakly consistent: they reflect the state at some point during iteration, not a snapshot.
Follow-up questions
▸ How is CHM different from Collections.synchronizedMap?
▸ Why is size() approximate?
▸ Can CHM be used as a concurrent set?
▸ What is wrong with iterating a CHM?
Gotchas
- computeIfAbsent's function should be fast and side-effect-free. While it runs, the bin is locked.
- Recursive update inside computeIfAbsent (mutating the same map from within the function) can deadlock or throw IllegalStateException.
- size() is approximate under concurrent mutation and not free; cache it when called often in hot paths.
- Default initial capacity is 16; the old concurrencyLevel parameter survives in Java 8+ only as a sizing hint.
- Null keys and null values are forbidden. Both throw NullPointerException.