Idempotency Key: Stripe-Style Implementation
Server-side dedup of retries on non-idempotent operations. Client sends Idempotency-Key header (unique per logical operation). Server: check cache → if cached, replay response → else lock on key → process and cache → release lock. TTL on cache (24h typical). Detect key collisions via request fingerprint.
What it is
An idempotency-key implementation makes non-idempotent operations safe to retry. The contract: the client sends an Idempotency-Key header (a UUID per logical operation). The server, on first receipt, processes the request and caches the response keyed by the idempotency key. On any subsequent retry with the same key, the server replays the cached response without re-doing the work.
This is the pattern Stripe uses (and has documented in detail), and it is now standard across most modern payment, messaging, and provisioning APIs.
Why this matters
Without idempotency, every network blip during a non-idempotent call risks a duplicate side effect. Customer's network drops; client retries; customer is charged twice. Multiply by the millions of payments per day across the industry, and the failure mode is real money.
With idempotency, a retry of an already-processed request is a no-op (returns the cached response). The client can retry aggressively without fear of duplicate side effects.
The server-side flow
For each request with an Idempotency-Key:
1. Compute a fingerprint of the request body (a hash). Used to detect key collisions.
2. Check the cache. If a cached response exists for this key:
   - If the fingerprint matches: replay the cached response. Return.
   - If the fingerprint differs: 422 Unprocessable Entity ("idempotency key reused with different body").
3. Acquire a lock on the key. If locked by another request: 409 Conflict ("request in flight").
4. Process the request through the normal handler.
5. Store the response (body + status + relevant headers + fingerprint) keyed by the idempotency key, with TTL.
6. Release the lock.
7. Return the response.
Steps 2 and 3 are why concurrent retries don't both execute. Step 1 protects against accidental key reuse for different operations.
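The flow above can be sketched in-memory before worrying about Redis. This is a minimal illustration, not a library API: `ConcurrentHashMap` stands in for the shared cache and lock store, and the class and method names are invented for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// In-memory sketch of the server-side flow. A real deployment needs a
// shared store (Redis, a database) so all instances see the same cache.
public class IdempotencyStore {
    public record Cached(int status, String body, String fingerprint) {}

    private final Map<String, Cached> cache = new ConcurrentHashMap<>();
    private final Map<String, Boolean> locks = new ConcurrentHashMap<>();

    /** Returns the response to send: cached replay, conflict, or fresh result. */
    public Cached handle(String key, String fingerprint, Supplier<Cached> handler) {
        Cached cached = cache.get(key);                      // step 2: check cache
        if (cached != null) {
            if (!cached.fingerprint().equals(fingerprint))   // step 1's fingerprint check
                return new Cached(422, "key reused with different body", fingerprint);
            return cached;                                   // replay, no re-processing
        }
        if (locks.putIfAbsent(key, Boolean.TRUE) != null)    // step 3: acquire lock
            return new Cached(409, "request in flight", fingerprint);
        try {
            Cached result = handler.get();                   // step 4: process
            cache.put(key, result);                          // step 5: store before unlock
            return result;
        } finally {
            locks.remove(key);                               // step 6: release
        }
    }
}
```

Calling `handle` twice with the same key and fingerprint runs the handler once; the second call replays the stored response.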
The client side
Three rules:
- One key per logical operation, not per HTTP attempt. Generate the key once when the user clicks "submit", reuse it on every retry.
- Persist the key for the duration of retries. If the client process crashes mid-retry, the next attempt should use the same key. Local storage, sessionStorage, server-side draft state, whatever fits the platform.
- Send the key on every state-changing request that might retry. POST, PATCH, sometimes PUT. GET is naturally idempotent and doesn't need it.
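The three client rules can be sketched with the JDK's built-in `HttpClient` (Java 11+). The endpoint URL is a placeholder and the class and method names are illustrative; the point is that the key is generated once, outside the retry loop, and every attempt carries it.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.UUID;

public class IdempotentClient {
    // One HTTP attempt. Every attempt for the same logical operation
    // passes the SAME key; only the caller generates keys.
    static HttpRequest buildAttempt(String url, String key, String body) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Idempotency-Key", key)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // Generate the key once, before the first attempt. Persist it if the
    // retry loop can outlive the process (crash mid-retry).
    public static HttpResponse<String> postWithRetries(HttpClient http, String url,
                                                       String body, int maxAttempts)
            throws Exception {
        String key = UUID.randomUUID().toString();   // ONE key per logical operation
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return http.send(buildAttempt(url, key, body),
                                 HttpResponse.BodyHandlers.ofString());
            } catch (IOException e) {
                last = e;                            // network blip: retry with same key
            }
        }
        throw last;
    }
}
```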
Cache TTL
How long to keep cached responses?
The lower bound: longer than the maximum retry window. If clients can retry over 24 hours (long-poll, async workflow), TTL must be 24h+.
The upper bound: not too long, or storage grows. 24 hours to 7 days is typical. After that, the same key would be treated as a new request; this is acceptable because nobody is realistically retrying after a week.
Storage and scale
Per-key storage: a few hundred bytes (response body, status, fingerprint, metadata). For a service doing 1000 req/sec with a 24h TTL, that's 1000 × 86,400 ≈ 86.4M live keys; at a few hundred bytes each plus Redis per-key overhead, roughly 50GB of Redis. Manageable. For higher scale, shard or compress; or use a database with TTL semantics.
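The back-of-envelope arithmetic is easy to check. The ~500 bytes per entry below is an assumed average; Redis per-key overhead pushes the raw total toward the ~50GB figure in the text.

```java
// Sizing check for the idempotency cache: keys alive at any moment is
// (request rate) x (TTL in seconds), since entries expire after the TTL.
public class IdemSizing {
    public static long keysAlive(long reqPerSec, long ttlSeconds) {
        return reqPerSec * ttlSeconds;
    }

    public static long bytesNeeded(long keys, long bytesPerKey) {
        return keys * bytesPerKey;
    }

    public static void main(String[] args) {
        long keys = keysAlive(1000, 86_400);                 // 86,400,000 keys
        long gb = bytesNeeded(keys, 500) / 1_000_000_000L;   // ~43 GB before overhead
        System.out.println(keys + " keys, ~" + gb + " GB");
    }
}
```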
Edge cases
The first request fails after the side effect. Worker charged the card; about to write the response to cache; process killed. Lock has a TTL, so it expires. Client retries; a new processor takes the lock, sees no cached response, runs the work again, charges the card again. Writing the cache before unlocking narrows the window but can't close it (the crash happened before the cache write). The real fix: make the work itself idempotent, e.g. record the outcome in the same transaction as the side effect (the handler checks "did I already process this key?" via a processed-records table before acting).
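A sketch of that processed-records guard, with a map standing in for a database table keyed by idempotency key. The class and method names are invented for the example; in production the insert into the table and the charge would share one database transaction.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Handler-level idempotency: even if the filter's cache write was lost,
// re-running the handler with the same key refuses to repeat the charge.
public class ChargeHandler {
    private final Map<String, String> processed = new ConcurrentHashMap<>(); // key -> chargeId
    private int chargeCounter = 0;

    public synchronized String charge(String idempotencyKey, long amountCents) {
        String existing = processed.get(idempotencyKey);
        if (existing != null) {
            return existing;                  // already charged: return the original id
        }
        // The side effect (amountCents would go to the payment gateway here)...
        String chargeId = "ch_" + (++chargeCounter);
        // ...recorded together with the outcome. In a real system these two
        // steps are one transaction, so a crash can't separate them.
        processed.put(idempotencyKey, chargeId);
        return chargeId;
    }
}
```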
The cached response has gone stale. Possible if the cache contains a body referencing data that has since changed. Usually fine for create-style operations (the cached response is the resource just created). Tricky for query-style operations (use shorter TTL).
Different requests with the same key (collision). Detect via fingerprint. Reject with 422. Better than silently returning the wrong response.
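One way to compute the fingerprint, sketched here as SHA-256 of the raw body. Hashing method + path + body as well is stricter (it also catches the same body sent to a different endpoint); this minimal version hashes the body only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Request fingerprint for collision detection: same body -> same hex digest.
public class Fingerprint {
    public static String of(String body) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(body.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available on the JVM
        }
    }
}
```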
Implementation (Java servlet filter)
Wrap the controller in a filter. Same logic: check cache, lock, process, store. The lock ensures concurrent retries with the same key serialise; the cache ensures completed work isn't re-done.
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

import jakarta.servlet.Filter;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.ServletRequest;
import jakarta.servlet.ServletResponse;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;

import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Component;
import org.springframework.web.util.ContentCachingResponseWrapper;

import com.fasterxml.jackson.databind.ObjectMapper;

@Component
public class IdempotencyFilter implements Filter {
    private final RedisTemplate<String, String> redis;
    private final ObjectMapper json;

    public IdempotencyFilter(RedisTemplate<String, String> redis, ObjectMapper json) {
        this.redis = redis;
        this.json = json;
    }

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest hreq = (HttpServletRequest) req;
        String key = hreq.getHeader("Idempotency-Key");
        if (key == null || hreq.getMethod().equals("GET")) {
            chain.doFilter(req, res);
            return;
        }

        // hashBody must buffer the body (e.g. via a request wrapper) so the
        // downstream handler can still read it.
        String fp = hashBody(hreq);
        String cacheKey = "idem:" + key;

        String cached = redis.opsForValue().get(cacheKey);
        if (cached != null) {
            CachedResponse cr = json.readValue(cached, CachedResponse.class);
            if (!cr.fingerprint().equals(fp)) {
                respond(res, 422, "key reused with different body");
                return;
            }
            respond(res, cr.status(), cr.body());   // replay without re-processing
            return;
        }

        // Lock with a TTL so a crashed processor can't block retries forever.
        Boolean locked = redis.opsForValue()
                .setIfAbsent("idem:lock:" + key, "1", Duration.ofSeconds(60));
        if (!Boolean.TRUE.equals(locked)) {
            respond(res, 409, "request in flight");
            return;
        }
        try {
            ContentCachingResponseWrapper wrapper =
                    new ContentCachingResponseWrapper((HttpServletResponse) res);
            chain.doFilter(req, wrapper);
            String body = new String(wrapper.getContentAsByteArray(), StandardCharsets.UTF_8);
            CachedResponse cr = new CachedResponse(wrapper.getStatus(), body, fp);
            // Store BEFORE releasing the lock, so no retry sees "no cache, no lock".
            redis.opsForValue().set(cacheKey, json.writeValueAsString(cr), Duration.ofHours(24));
            wrapper.copyBodyToResponse();           // flush the buffered body to the client
        } finally {
            redis.delete("idem:lock:" + key);
        }
    }

    // CachedResponse is a record(int status, String body, String fingerprint);
    // hashBody and respond are small helpers, omitted here.
}
Key points
- Client generates ONE key per logical operation, reuses it on retry. NOT one per HTTP attempt.
- Server-side: lock on key (so concurrent retries don't both run), check cache, process, store response, release.
- Cache stores: response body + status code + relevant headers. Replay must be byte-identical.
- TTL: typically 24h. Long enough for client retries, short enough that storage doesn't grow forever.
- Detect key collisions: store a fingerprint (hash of request body) and compare. If different, return 422 instead of the cached response.
Follow-up questions
- What happens with concurrent requests using the same idempotency key?
- Why detect key collisions?
- Should the cache be local or shared?
- What about idempotency for fire-and-forget background jobs?
Gotchas
- Generating a new key per HTTP attempt: server sees each retry as a fresh request
- Caching only the response body, not the status code: replay returns the wrong status
- No collision detection: same key for different operations replays the wrong response
- No TTL on cached responses: storage grows without bound
- Lock without TTL: crashed processor blocks subsequent retries forever