Distributed Lock: Redis or ZooKeeper
Mutual exclusion across processes/machines. Three implementations: SETNX with TTL on Redis (cheap, mostly correct), Redlock (Redis cluster, controversial safety), ZooKeeper ephemeral nodes (strongly correct, more ops). All require a TTL/lease so a crashed holder doesn't block forever. Always design for: 'what if the holder thinks it has the lock but actually doesn't?'
What it is
A distributed lock provides mutual exclusion across processes, hosts, or even data centres. Same idea as a thread mutex; harder to implement because the participants don't share memory or even the same machine.
The use cases: scheduled jobs that should only run once across a fleet (cron); rate limiters whose state lives in shared storage; long-running batch operations that shouldn't overlap; leader election (which is mutual exclusion among potential leaders).
The simple Redis recipe
SET key value NX EX seconds is the one-liner. Atomic. Returns OK if it set the key, nil if the key already existed. The value is a unique token (a UUID); the EX is the TTL.
To release: compare the stored value against your token and delete only if it matches, atomically in a single Lua script. A plain GET followed by DEL has a race: the key can expire and be re-acquired by someone else between the two commands, and you would delete their lock.
This is enough for most "I want only one of these to run at a time" cases. It is fast (one round trip), simple, well-understood.
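A minimal sketch of that token-checked release, assuming a Jedis-style client held in redis whose eval(script, keys, args) runs the Lua; the helper name releaseSafely is the one the implementation sketch at the end of this section reuses (imports omitted, as elsewhere).

// Compare-and-delete atomically in Lua: the GET and the DEL cannot be interleaved
// with an expiry plus a re-acquire by another process, so we never delete a lock
// that is no longer ours.
static final String RELEASE_LUA =
    "if redis.call('get', KEYS[1]) == ARGV[1] then " +
    "  return redis.call('del', KEYS[1]) " +
    "else return 0 end";

void releaseSafely(String key, String token) {
    redis.eval(RELEASE_LUA, List.of(key), List.of(token));
}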
The TTL trade-off
A TTL is required because the holder might crash. Without it, a crashed holder blocks forever.
A too-short TTL is also dangerous because the work might exceed it. If a job takes 30s and the TTL is 10s, the lock expires mid-work, someone else takes it, two of them run concurrently.
The fix: pick TTL slightly longer than expected work, OR keep work short, OR add a heartbeat that extends the lease as long as the holder is still working. The heartbeat approach is robust to slow work but adds complexity (background scheduler, careful ordering on release).
The "zombie holder" failure mode
This is the bug that makes distributed locking subtle. It has the same shape every time:
- Process A acquires the lock with a TTL of 15 seconds.
- Process A starts work.
- Something pauses Process A for longer than the TTL: a 30-second GC pause, a paused VM, a network partition, a long syscall.
- The lock TTL expires while A is paused.
- Process B acquires the lock (it looks free now).
- Process A wakes up, still believes it holds the lock, and writes to the resource.
- Process B also writes to the resource.
Both "lock holders" wrote. Data corruption. A is the zombie holder; it thinks it's alive, but the lock is already gone.
real time --------------------------------------------------------------->

Process A:  acquire -> work -> PAUSED (30s GC) -----------> resumes -> WRITE  <-- wrong!
                                   |                           |
                                   | lock expires at T+15      | A still thinks it holds
                                   | (TTL was 15s)             | the lock, writes anyway
                                   v
Lock TTL:   ---------------------> (expires)

Process B:                          acquire -> WRITE -> release
                                       ^
                                       +-- B holds it legitimately

now A and B both wrote
Fencing tokens fix it
A fencing token is a monotonically increasing number handed out with each lock acquisition. The holder passes the token along with every write to the protected resource. The resource keeps track of the largest token it has ever seen and rejects writes with smaller tokens.
Acquisitions get increasing tokens:
Process A acquires lock      -> fencing token = 33
  (A pauses)
Lock TTL expires             -> (no write happens)
Process B acquires lock      -> fencing token = 34
Process B writes (token=34)  -> resource accepts: 34 >= max_seen (was 33)
                                resource updates max_seen = 34
Process A resumes
Process A writes (token=33)  -> resource REJECTS: 33 < max_seen (=34)
                                zombie write blocked
The resource itself becomes the source of truth. Even if two processes think they hold the lock, only one of them can actually mutate state.
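A minimal sketch of the resource-side check; the class and method names are illustrative, and a real resource would persist max_seen alongside the data rather than keep it in memory.

class FencedResource {
    private long maxTokenSeen = -1;          // highest fencing token ever seen

    // Apply a write only if its token is at least as new as any token seen before.
    synchronized boolean write(long fencingToken, byte[] payload) {
        if (fencingToken < maxTokenSeen) {
            return false;                    // stale token: a zombie holder, reject the write
        }
        maxTokenSeen = fencingToken;
        apply(payload);                      // the actual mutation (illustrative)
        return true;
    }

    private void apply(byte[] payload) { /* ... */ }
}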
ZooKeeper and etcd provide fencing tokens naturally (sequence numbers on ephemeral nodes, modification revisions). Redis does not; fencing on Redis requires managing the monotonic counter manually, with care.
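One way to manage it (a sketch, not a hardened design): mint the token from a Redis counter inside the same Lua script that acquires the lock, so the token is issued atomically with the acquisition. Key names are illustrative, and the counter stays monotonic only as long as Redis itself doesn't lose writes on failover.

// KEYS[1] = lock key, KEYS[2] = counter key; ARGV[1] = holder token, ARGV[2] = TTL in ms.
// Returns the fencing token on success, nil if someone else holds the lock.
static final String ACQUIRE_WITH_FENCING_LUA =
    "if redis.call('set', KEYS[1], ARGV[1], 'NX', 'PX', ARGV[2]) then " +
    "  return redis.call('incr', KEYS[2]) " +
    "else return nil end";

Long acquireWithFencingToken(String lockKey, String counterKey, String token, long ttlMillis) {
    Object reply = redis.eval(ACQUIRE_WITH_FENCING_LUA,
            List.of(lockKey, counterKey), List.of(token, String.valueOf(ttlMillis)));
    return (Long) reply;                     // null means the lock was not acquired
}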
ZooKeeper / etcd / Consul
Stronger guarantees, more operational cost.
ZK ephemeral sequential nodes are the textbook distributed lock: the smallest sequence number holds the lock; others watch the next-smallest. On holder disconnect, the ephemeral node is auto-deleted; the next-smallest is notified. Fencing tokens are the sequence numbers themselves.
etcd's lease-based primitives are similar. Consul's sessions and locks are equivalent.
The trade-off: a ZK ensemble (or etcd cluster) must be run, and apps need a client library that handles connection drops, watch re-establishment, etc. For services that already run ZK/etcd for other reasons (Kubernetes uses etcd), distributed locks come for free. Otherwise, the operational cost is real.
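On the JVM, Apache Curator is the usual client library here and handles reconnects and watch re-establishment; below is a minimal sketch of its InterProcessMutex recipe, which is the ephemeral-sequential-node recipe described above. The connection string, lock path, and runNightlyReport() are illustrative.

void runNightlyReportOnce() throws Exception {
    CuratorFramework client = CuratorFrameworkFactory.newClient(
            "zk1:2181,zk2:2181,zk3:2181", new ExponentialBackoffRetry(1000, 3));
    client.start();

    InterProcessMutex lock = new InterProcessMutex(client, "/locks/nightly-report");
    if (lock.acquire(10, TimeUnit.SECONDS)) {    // wait up to 10s to hold the lowest sequence number
        try {
            runNightlyReport();                  // protected work
        } finally {
            lock.release();                      // deletes our znode; the next waiter is notified
        }
    }
}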
Choosing
For low-stakes mutual exclusion (only one cron job runs, only one node reports metrics), Redis SETNX with TTL is fine. The cost of double-execution is tolerable.
For high-stakes (only one leader writes, only one process upgrades schema, only one job processes a payment), use ZK/etcd or implement fencing tokens carefully on Redis.
For "I just need scheduling deduplication", consider a database row with a unique constraint instead of a lock. Simpler, idempotent, no expiry to manage.
Implementations
Long-running work that exceeds the lock's TTL is dangerous: the lock expires, someone else takes it, both holders run concurrently. Either keep the work short, or extend the lease periodically.
public boolean runWithLock(String key, Duration leaseTtl, Runnable work) {
    String token = UUID.randomUUID().toString();
    // Assumes a client wrapper whose set(...) returns true only when the key was
    // newly set, i.e. SET key token NX PX <ms> succeeded.
    if (!redis.set(key, token, "NX", "PX", leaseTtl.toMillis())) {
        return false;
    }

    // Heartbeat: every leaseTtl/3, extend the lease while the work is still running.
    ScheduledFuture<?> heartbeat = scheduler.scheduleAtFixedRate(() -> {
        // EXTEND_LUA_SHA must be the SHA1 returned by SCRIPT LOAD; or use
        // redis.eval(EXTEND_LUA_SOURCE, ...) on first call and cache the SHA.
        redis.evalsha(EXTEND_LUA_SHA, List.of(key),
            List.of(token, String.valueOf(leaseTtl.toMillis())));
    }, leaseTtl.toMillis() / 3, leaseTtl.toMillis() / 3, TimeUnit.MILLISECONDS);

    try {
        work.run();
        return true;
    } finally {
        heartbeat.cancel(true);               // stop extending before releasing
        releaseSafely(key, token);            // compare-and-delete (see the release script above)
    }
}

// EXTEND_LUA: only extend if we still hold it
// if redis.call("get", KEYS[1]) == ARGV[1] then
//   return redis.call("pexpire", KEYS[1], ARGV[2])
// else return 0 end
Key points
- Redis SET key value NX EX ttl is the simplest distributed lock. Atomic, TTL-based, fast.
- Lock holder must store a unique token (UUID) and check it on release; otherwise release-by-someone-else is possible.
- TTL is required: if the holder crashes, the lock auto-expires. Pick TTL > expected critical-section duration.
- Fencing tokens (monotonic ID) protect against zombies: 'I had the lock, then GC paused me, lock expired, someone else took it, then I came back and acted on stale belief'.
- ZooKeeper / etcd / Consul give stronger guarantees (linearisable, sequential ephemeral nodes) at higher operational cost.
Follow-up questions
- Is Redis SETNX really safe?
- What is a fencing token?
- Why does TTL matter so much?
- When should I reach for ZooKeeper instead of Redis?
Gotchas
- No TTL: crashed holder blocks the lock forever
- No token check on release: someone else's release deletes the lock
- Long-running work without lease extension: TTL expires mid-work, two holders concurrent
- Treating SETNX as linearisable: it is not under all failure modes
- No fencing token in critical paths: zombies can corrupt data after lock expiry