Bug Hunt: Why Does My RWMutex Deadlock When I Re-enter?
RWMutex is not reentrant. A goroutine holding the read lock that calls a method which also tries to acquire the read lock can deadlock if a writer is waiting in between. Java's ReentrantReadWriteLock is reentrant by design; sync.RWMutex is intentionally not. The fix is to restructure so locks aren't reentered.
The puzzle
A cache that supports get(key) and a getCount() method that totals all values. Both methods take the read lock. Looks fine. Tests pass. Under low load, production behaves. Then a write operation happens during a getCount() and the whole service hangs.
What's special about the read-then-read-during-write interleaving?
The single-thread surprise
The deadlock isn't between two different threads competing for two locks. It's a single thread re-acquiring the same lock with a writer waiting in between. People don't expect a deadlock in single-threaded re-entry, but the writer-preference rule of RWMutex makes it happen.
What to look for in the broken code
Read the language tab. The suspicious pattern: a function takes the lock, then calls another method on the same struct that also takes the lock. Without writers, this is fine on Go's RWMutex (multiple readers allowed). With writers in the mix:
- Goroutine A calls getCount() → acquires RLock #1.
- Goroutine B calls Set() → calls Lock() → blocks because RLock #1 is held.
- Goroutine A's loop iterates → calls getOrLoad() → tries RLock #2.
- RLock #2 blocks because a writer is queued (writer preference).
- Both goroutines are now blocked. Deadlock.
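This interleaving can be reproduced on the JVM without hanging the process, because StampedLock also queues new readers behind a waiting writer. A minimal sketch (the class and method names are illustrative, and the timed tryReadLock stands in for the blocking re-entrant read so the demo fails fast instead of deadlocking):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.StampedLock;

public class ReentryDemo {
    // Returns true if the second (re-entrant) read attempt was blocked
    // by a queued writer, i.e. the deadlock condition was reached.
    static boolean secondReadBlocks() throws InterruptedException {
        StampedLock lock = new StampedLock();

        long first = lock.readLock();         // step 1: this thread holds RLock #1

        Thread writer = new Thread(() -> {
            long w = lock.writeLock();        // step 2: writer blocks and queues
            lock.unlockWrite(w);
        });
        writer.start();
        Thread.sleep(100);                    // give the writer time to enqueue

        // step 3: the same thread tries RLock #2; with a writer queued, the
        // timed attempt waits out the deadline and returns 0 (failure)
        long second = lock.tryReadLock(200, TimeUnit.MILLISECONDS);
        boolean blocked = (second == 0L);
        if (!blocked) lock.unlockRead(second);

        lock.unlockRead(first);               // release so the writer can finish
        writer.join();
        return blocked;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(secondReadBlocks() ? "re-entry blocked" : "re-entry succeeded");
    }
}
```

A blocking readLock() at step 3 would hang exactly like the Go version; the timeout only makes the condition observable.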
Why writer-pref exists
If new readers could acquire while a writer is queued, writers could starve forever (a steady stream of readers means the writer never gets the lock). Writer preference breaks the cycle by saying: once a writer waits, no new readers may enter. It also creates this re-entry deadlock as a side effect.
The fix patterns
| Approach | Notes |
|---|---|
| Use a reentrant lock | Java's ReentrantReadWriteLock, Python's RLock, easiest. Small overhead. |
| Internal "Locked" helpers | Method named getXLocked() that assumes the caller holds the lock. No re-entry. |
| Snapshot + unlock | Acquire lock, copy data, release, then process. Best when the inner work is heavy. |
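The snapshot + unlock row is the only pattern not shown in the Implementations section below, so here is a sketch (class and method names are illustrative; StampedLock stands in for any non-reentrant RW lock):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.StampedLock;

class SnapshotCache {
    private final StampedLock lock = new StampedLock();
    private final Map<String, Integer> data = new HashMap<>();

    void set(String key, int value) {
        long stamp = lock.writeLock();
        try { data.put(key, value); }
        finally { lock.unlockWrite(stamp); }
    }

    int getCount() {
        Map<String, Integer> snapshot;
        long stamp = lock.readLock();
        try {
            snapshot = new HashMap<>(data);   // copy under the lock...
        } finally { lock.unlockRead(stamp); } // ...then release immediately

        int total = 0;                        // the heavy work runs lock-free,
        for (int v : snapshot.values()) {     // so no re-entry is possible
            total += v;
        }
        return total;
    }
}
```

The copy costs memory proportional to the map, so this pays off when the per-entry work is expensive relative to the copy, and when a slightly stale total is acceptable.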
Best practice
Prefer the internal helper pattern. It makes the lock contract explicit at every call site (xxxLocked means "the caller must hold the lock"). Reentrant locks paper over the design issue without fixing it.
How to spot this in code review
The smell
When one method takes a lock and calls another method on the same object that also takes a lock, pause. Ask:
- Is the lock reentrant? (Go RWMutex: no. Python Lock: no. Java ReentrantLock: yes.)
- If non-reentrant, can a writer arrive between the two reads? (Almost always yes in production.)
- Can the code be refactored to avoid re-entry?
The deadlock is invisible until the right load arrives. Catch it in review, not at 3 a.m.
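The first question on that checklist can even be answered empirically: a non-blocking try-acquire from a thread that already holds the primitive reveals whether it is reentrant. A sketch (the class name is illustrative; Semaphore and ReentrantLock are just two examples):

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.locks.ReentrantLock;

public class ReentrancyCheck {
    static boolean reentrantLockIsReentrant() {
        ReentrantLock rl = new ReentrantLock();
        rl.lock();
        boolean again = rl.tryLock(); // true: the owning thread may re-acquire
        if (again) rl.unlock();
        rl.unlock();
        return again;
    }

    static boolean semaphoreIsReentrant() {
        Semaphore sem = new Semaphore(1);
        sem.acquireUninterruptibly();
        boolean again = sem.tryAcquire(); // false: a Semaphore has no owner concept
        if (again) sem.release();
        sem.release();
        return again;
    }

    public static void main(String[] args) {
        System.out.println("ReentrantLock reentrant: " + reentrantLockIsReentrant());
        System.out.println("Semaphore reentrant:     " + semaphoreIsReentrant());
    }
}
```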
Implementations
In Java, ReentrantReadWriteLock IS reentrant by design (the name says so), and many engineers conclude that all Java locks are reentrant. That's wrong: Semaphore.acquire() is NOT reentrant, and StampedLock is NOT reentrant. Same trap, different primitive.
```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.StampedLock;

class Cache {
    private final StampedLock lock = new StampedLock(); // NOT reentrant
    private final Map<String, Integer> data = new HashMap<>();

    int getOrLoad(String key) {
        long stamp = lock.readLock();       // ← second readLock
        try {
            return data.getOrDefault(key, 0);
        } finally { lock.unlockRead(stamp); }
    }

    int getCount() {
        long stamp = lock.readLock();       // ← first readLock
        try {
            int total = 0;
            for (String k : data.keySet()) {
                total += getOrLoad(k);      // ← StampedLock NOT reentrant:
            }                               //   deadlocks under a waiting writer
            return total;
        } finally { lock.unlockRead(stamp); }
    }
}
```

The fix: choose a reentrant lock if the pattern is needed, or refactor to lock once. ReentrantReadWriteLock allows the same thread to re-acquire, at a small overhead cost. StampedLock is faster but doesn't allow re-entry. Pick based on whether the code needs reentrant semantics.
```java
// Fix #1: use ReentrantReadWriteLock (reentrant by design)
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class Cache {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, Integer> data = new HashMap<>();

    int getOrLoad(String key) {
        lock.readLock().lock();
        try {
            return data.getOrDefault(key, 0);
        } finally { lock.readLock().unlock(); }
    }

    int getCount() {
        lock.readLock().lock();
        try {
            int total = 0;
            for (String k : data.keySet()) {
                total += getOrLoad(k); // OK: reentrant
            }
            return total;
        } finally { lock.readLock().unlock(); }
    }
}
```

```java
// Fix #2: refactor with an internal "already-locked" helper
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.StampedLock;

class Cache {
    private final StampedLock lock = new StampedLock();
    private final Map<String, Integer> data = new HashMap<>();

    private int getOrLoadLocked(String key) { // caller must hold the read lock
        return data.getOrDefault(key, 0);
    }

    int getCount() {
        long stamp = lock.readLock();
        try {
            int total = 0;
            for (String k : data.keySet()) total += getOrLoadLocked(k);
            return total;
        } finally { lock.unlockRead(stamp); }
    }
}
```

Key points
- sync.RWMutex (Go): NOT reentrant; RLock-then-Lock or RLock-then-RLock can deadlock
- ReentrantReadWriteLock (Java): IS reentrant; the same thread can re-acquire its own lock
- Common trap: a reader calls a method that also reads; a writer arrives between them; deadlock
- Fix: restructure to acquire the lock once at the outer scope, or pass an already-locked context
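The RLock-then-Lock variant (a lock upgrade) fails the same way: the write request waits on a read lock held by the requesting thread itself. On StampedLock a naive upgrade can never succeed, but the class does offer tryConvertToWriteLock as a supported path. A sketch (the class name is illustrative):

```java
import java.util.concurrent.locks.StampedLock;

public class UpgradeDemo {
    // Requesting the write lock while holding our own read lock can never
    // succeed: tryWriteLock() fails immediately (a blocking writeLock()
    // here would deadlock against ourselves).
    static boolean naiveUpgradeFails() {
        StampedLock lock = new StampedLock();
        long r = lock.readLock();
        long w = lock.tryWriteLock();  // 0: a read lock (ours!) is held
        lock.unlockRead(r);
        return w == 0L;
    }

    // The supported path: convert the read stamp to a write stamp.
    // Succeeds here because we are the only reader.
    static boolean conversionSucceeds() {
        StampedLock lock = new StampedLock();
        long stamp = lock.readLock();
        long w = lock.tryConvertToWriteLock(stamp);
        if (w != 0L) { lock.unlockWrite(w); return true; }
        lock.unlockRead(stamp);
        return false;
    }
}
```

Go's sync.RWMutex has no conversion API at all; there the only options are the restructuring patterns above.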
Follow-up questions
- Why is sync.RWMutex (Go) intentionally non-reentrant?
- When does the RWMutex deadlock actually fire?
- Should ReentrantLock be the default?
- How does this differ from a normal deadlock?
Gotchas
- Go: sync.RWMutex is writer-preferring; once Lock() is called, new RLocks block
- Java: StampedLock and Semaphore are NOT reentrant; ReentrantLock and ReentrantReadWriteLock are
- Python: threading.Lock is NOT reentrant; threading.RLock IS
- Calling external code (callbacks, listeners) while holding a lock can introduce re-entry
- ReentrantLock is fine for nested calls; using one to call into completely unknown code is still risky (it might block on something else)
Cache implementations are the #1 victim. Anything where a "getter" calls another "getter" on the same struct, both protected by RW locks, eventually hits this. The Go documentation for sync.RWMutex explicitly prohibits recursive read locking for exactly this reason.