Distributed Lock: Redis or ZooKeeper
Mutual exclusion across processes/machines. Three implementations: SETNX with TTL on Redis (cheap, mostly correct), Redlock (Redis cluster, controversial safety), ZooKeeper ephemeral nodes (strongly correct, more ops). All require a TTL/lease so a crashed holder doesn't block forever. Always design for: 'what if the holder thinks it has the lock but actually doesn't?'
What it is
A distributed lock provides mutual exclusion across processes, hosts, or even data centres. Same idea as a thread mutex; harder to implement because the participants don't share memory or even the same machine.
The use cases: scheduled jobs that should only run once across a fleet (cron); rate limiters whose state lives in shared storage; long-running batch operations that shouldn't overlap; leader election (which is mutual exclusion among potential leaders).
The simple Redis recipe
SET key value NX EX seconds is the one-liner. Atomic. Returns OK if it set the key, nil if the key already existed. The value is a unique token (a UUID); the EX is the TTL.
To release: compare the stored value against your token and delete only if it matches, atomically in a single Lua script. A plain GET followed by DEL has a race: the key can expire and be re-acquired by someone else between the two commands, and you would delete their lock.
This is enough for most "I want only one of these to run at a time" cases. It is fast (one round trip), simple, well-understood.
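A minimal sketch of that token-checked release, assuming a Jedis-style client held in redis whose eval(script, keys, args) runs the Lua; the helper name releaseSafely is the one the implementation sketch at the end of this section reuses (imports omitted, as elsewhere).

// Compare-and-delete atomically in Lua: the GET and the DEL cannot be interleaved
// with an expiry plus a re-acquire by another process, so we never delete a lock
// that is no longer ours.
static final String RELEASE_LUA =
    "if redis.call('get', KEYS[1]) == ARGV[1] then " +
    "  return redis.call('del', KEYS[1]) " +
    "else return 0 end";

void releaseSafely(String key, String token) {
    redis.eval(RELEASE_LUA, List.of(key), List.of(token));
}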
The TTL trade-off
A TTL is required because the holder might crash. Without it, a crashed holder blocks forever.
A too-short TTL is also dangerous because the work might exceed it. If a job takes 30s and the TTL is 10s, the lock expires mid-work, someone else takes it, two of them run concurrently.
The fix: pick TTL slightly longer than expected work, OR keep work short, OR add a heartbeat that extends the lease as long as the holder is still working. The heartbeat approach is robust to slow work but adds complexity (background scheduler, careful ordering on release).
The "zombie holder" failure mode
This is the bug that makes distributed locking subtle. It has the same shape every time:
- Process A acquires the lock with a TTL of 15 seconds.
- Process A starts work.
- Something pauses Process A for longer than the TTL: a 30-second GC pause, a paused VM, a network partition, a long syscall.
- The lock TTL expires while A is paused.
- Process B acquires the lock (it looks free now).
- Process A wakes up, still believes it holds the lock, and writes to the resource.
- Process B also writes to the resource.
Both "lock holders" wrote. Data corruption. A is the zombie holder; it thinks it's alive, but the lock is already gone.
real time --------------------------------------------------------------->

Process A:  acquire -> work -> PAUSED (30s GC) -----------> resumes -> WRITE  <-- wrong!
                                   |                           |
                                   | lock expires at T+15      | A still thinks it holds
                                   | (TTL was 15s)             | the lock, writes anyway
                                   v
Lock TTL:   ---------------------> (expires)

Process B:                          acquire -> WRITE -> release
                                       ^
                                       +-- B holds it legitimately

now A and B both wrote
Fencing tokens fix it
A fencing token is a monotonically increasing number handed out with each lock acquisition. The holder passes the token along with every write to the protected resource. The resource keeps track of the largest token it has ever seen and rejects writes with smaller tokens.
Acquisitions get increasing tokens:
Process A acquires lock      -> fencing token = 33
  (A pauses)
Lock TTL expires             -> (no write happens)
Process B acquires lock      -> fencing token = 34
Process B writes (token=34)  -> resource accepts: 34 >= max_seen (was 33)
                                resource updates max_seen = 34
Process A resumes
Process A writes (token=33)  -> resource REJECTS: 33 < max_seen (=34)
                                zombie write blocked
The resource itself becomes the source of truth. Even if two processes think they hold the lock, only one of them can actually mutate state.
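A minimal sketch of the resource-side check; the class and method names are illustrative, and a real resource would persist max_seen alongside the data rather than keep it in memory.

class FencedResource {
    private long maxTokenSeen = -1;          // highest fencing token ever seen

    // Apply a write only if its token is at least as new as any token seen before.
    synchronized boolean write(long fencingToken, byte[] payload) {
        if (fencingToken < maxTokenSeen) {
            return false;                    // stale token: a zombie holder, reject the write
        }
        maxTokenSeen = fencingToken;
        apply(payload);                      // the actual mutation (illustrative)
        return true;
    }

    private void apply(byte[] payload) { /* ... */ }
}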
ZooKeeper and etcd provide fencing tokens naturally (sequence numbers on ephemeral nodes, modification revisions). Redis does not; fencing on Redis requires managing the monotonic counter manually, with care.
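One way to manage it (a sketch, not a hardened design): mint the token from a Redis counter inside the same Lua script that acquires the lock, so the token is issued atomically with the acquisition. Key names are illustrative, and the counter stays monotonic only as long as Redis itself doesn't lose writes on failover.

// KEYS[1] = lock key, KEYS[2] = counter key; ARGV[1] = holder token, ARGV[2] = TTL in ms.
// Returns the fencing token on success, nil if someone else holds the lock.
static final String ACQUIRE_WITH_FENCING_LUA =
    "if redis.call('set', KEYS[1], ARGV[1], 'NX', 'PX', ARGV[2]) then " +
    "  return redis.call('incr', KEYS[2]) " +
    "else return nil end";

Long acquireWithFencingToken(String lockKey, String counterKey, String token, long ttlMillis) {
    Object reply = redis.eval(ACQUIRE_WITH_FENCING_LUA,
            List.of(lockKey, counterKey), List.of(token, String.valueOf(ttlMillis)));
    return (Long) reply;                     // null means the lock was not acquired
}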
ZooKeeper / etcd / Consul
Stronger guarantees, more operational cost.
ZK ephemeral sequential nodes are the textbook distributed lock: the smallest sequence number holds the lock; others watch the next-smallest. On holder disconnect, the ephemeral node is auto-deleted; the next-smallest is notified. Fencing tokens are the sequence numbers themselves.
etcd's lease-based primitives are similar. Consul's sessions and locks are equivalent.
The trade-off: a ZK ensemble (or etcd cluster) must be run, and apps need a client library that handles connection drops, watch re-establishment, etc. For services that already run ZK/etcd for other reasons (Kubernetes uses etcd), distributed locks come for free. Otherwise, the operational cost is real.
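On the JVM, Apache Curator is the usual client library here and handles reconnects and watch re-establishment; below is a minimal sketch of its InterProcessMutex recipe, which is the ephemeral-sequential-node recipe described above. The connection string, lock path, and runNightlyReport() are illustrative.

void runNightlyReportOnce() throws Exception {
    CuratorFramework client = CuratorFrameworkFactory.newClient(
            "zk1:2181,zk2:2181,zk3:2181", new ExponentialBackoffRetry(1000, 3));
    client.start();

    InterProcessMutex lock = new InterProcessMutex(client, "/locks/nightly-report");
    if (lock.acquire(10, TimeUnit.SECONDS)) {    // wait up to 10s to hold the lowest sequence number
        try {
            runNightlyReport();                  // protected work
        } finally {
            lock.release();                      // deletes our znode; the next waiter is notified
        }
    }
}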
Choosing
For low-stakes mutual exclusion (only one cron job runs, only one node reports metrics), Redis SETNX with TTL is fine. The cost of double-execution is tolerable.
For high-stakes (only one leader writes, only one process upgrades schema, only one job processes a payment), use ZK/etcd or implement fencing tokens carefully on Redis.
For "I just need scheduling deduplication", consider a database row with a unique constraint instead of a lock. Simpler, idempotent, no expiry to manage.
Implementations
Long-running work that exceeds the lock's TTL is dangerous: the lock expires, someone else takes it, both holders run concurrently. Either keep the work short, or extend the lease periodically.
public boolean runWithLock(String key, Duration leaseTtl, Runnable work) {
    String token = UUID.randomUUID().toString();
    // Assumes a client wrapper whose set(...) returns true only when the key was
    // newly set, i.e. SET key token NX PX <ms> succeeded.
    if (!redis.set(key, token, "NX", "PX", leaseTtl.toMillis())) {
        return false;
    }

    // Heartbeat: every leaseTtl/3, extend the lease while the work is still running.
    ScheduledFuture<?> heartbeat = scheduler.scheduleAtFixedRate(() -> {
        // EXTEND_LUA_SHA must be the SHA1 returned by SCRIPT LOAD; or use
        // redis.eval(EXTEND_LUA_SOURCE, ...) on first call and cache the SHA.
        redis.evalsha(EXTEND_LUA_SHA, List.of(key),
            List.of(token, String.valueOf(leaseTtl.toMillis())));
    }, leaseTtl.toMillis() / 3, leaseTtl.toMillis() / 3, TimeUnit.MILLISECONDS);

    try {
        work.run();
        return true;
    } finally {
        heartbeat.cancel(true);               // stop extending before releasing
        releaseSafely(key, token);            // compare-and-delete (see the release script above)
    }
}

// EXTEND_LUA: only extend if we still hold it
// if redis.call("get", KEYS[1]) == ARGV[1] then
//   return redis.call("pexpire", KEYS[1], ARGV[2])
// else return 0 end
Key points
- Redis SET key value NX EX ttl is the simplest distributed lock. Atomic, TTL-based, fast.
- Lock holder must store a unique token (UUID) and check it on release; otherwise release-by-someone-else is possible.
- TTL is required: if the holder crashes, the lock auto-expires. Pick TTL > expected critical-section duration.
- Fencing tokens (monotonic ID) protect against zombies: 'I had the lock, then GC paused me, lock expired, someone else took it, then I came back and acted on stale belief'.
- ZooKeeper / etcd / Consul give stronger guarantees (linearisable, sequential ephemeral nodes) at higher operational cost.
Follow-up questions
- Is Redis SETNX really safe?
- What is a fencing token?
- Why does TTL matter so much?
- When should I reach for ZooKeeper instead of Redis?
Gotchas
- No TTL: crashed holder blocks the lock forever
- No token check on release: someone else's release deletes the lock
- Long-running work without lease extension: TTL expires mid-work, two holders concurrent
- Treating SETNX as linearisable: it is not under all failure modes
- No fencing token in critical paths: zombies can corrupt data after lock expiry