Idempotency

What it is

An operation is idempotent if calling it many times has the same effect as calling it once. set x = 5 is idempotent. x++ is not. Deleting user 123 is idempotent (after the first call, the user is gone, and any number of further deletes leave them gone). Creating a new charge is not (each call charges the card again).
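The difference between the two kinds of operation can be checked mechanically: apply the operation once, apply it several times, and compare the resulting state. The sketch below does that for the set and increment examples above; the helper names (set_x, increment_x, apply_n_times) are made up for illustration.

```python
def set_x(state: dict) -> None:
    """Idempotent: the result does not depend on how often it runs."""
    state["x"] = 5


def increment_x(state: dict) -> None:
    """Not idempotent: every call changes the state again."""
    state["x"] += 1


def apply_n_times(op, state: dict, n: int) -> dict:
    """Apply op to a copy of state n times and return the copy."""
    result = dict(state)
    for _ in range(n):
        op(result)
    return result


start = {"x": 0}

set_once = apply_n_times(set_x, start, 1)
set_many = apply_n_times(set_x, start, 3)
inc_once = apply_n_times(increment_x, start, 1)
inc_many = apply_n_times(increment_x, start, 3)

print(set_once == set_many)  # True: one call and three calls agree
print(inc_once == inc_many)  # False: x is 1 after one call, 3 after three
```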

Idempotency matters because in any networked system there is no reliable way to tell apart two failure modes: "my request never reached the server" and "my request reached the server but the response got lost on the way back". From the client's side they look identical. The only safe reaction is to retry. And if the operation on the other end is not idempotent, that retry quietly does the work a second time: two charges, two emails sent, two new user rows.

Why retries happen at all

A networked call passes through many hops before it returns, and any of them can fail:

The client's connection to the server times out.
The load balancer or proxy in between gives up.
The server processes the request fine but its response packet is lost.
A worker queue redelivers a message because it never saw the worker's "done" acknowledgement.

In every one of these cases, the system's only options are "give up" or "try again". Almost every real system is built to try again. So the operation on the other end has to be ready to be called more than once for the same logical request, and behave the same as if it had been called once.

A picture of the bug

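Without any deduplication, a network glitch on the response leg turns one customer click into two charges. The client sends the charge request, the server charges the card, and the response is lost on the way back. The client times out and retries, and the server charges the card a second time.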

The server has no way to know the second request is the same logical operation as the first. From its side it looks like two separate clicks.
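The failure can be reproduced in a few lines. This is a toy simulation, not a real payment API: NaiveChargeServer, ResponseLost, and the ch_N charge IDs are all invented for the example, and the "network" is a function that raises after the server has already done the work.

```python
class ResponseLost(Exception):
    """The server did the work, but the client never saw the reply."""


class NaiveChargeServer:
    """A server with no deduplication: every call creates a charge."""

    def __init__(self) -> None:
        self.charges = []

    def charge(self, customer: str, amount_cents: int) -> str:
        charge_id = f"ch_{len(self.charges) + 1}"
        self.charges.append((charge_id, customer, amount_cents))
        return charge_id


def call_over_network(server, customer, amount_cents, lose_response):
    # The server processes the request before the response is lost.
    charge_id = server.charge(customer, amount_cents)
    if lose_response:
        raise ResponseLost("timed out waiting for the response")
    return charge_id


def charge_with_retry(server, customer, amount_cents):
    try:
        # First attempt: the response leg fails.
        return call_over_network(server, customer, amount_cents, lose_response=True)
    except ResponseLost:
        # The client cannot tell whether the charge happened, so it retries.
        return call_over_network(server, customer, amount_cents, lose_response=False)


server = NaiveChargeServer()
charge_id = charge_with_retry(server, "cust_123", 5000)

print(charge_id)            # ch_2
print(len(server.charges))  # 2: one click, two charges
```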

The fix

The client generates a unique ID for each logical operation, called an idempotency key, and sends it with every attempt of that operation. On a retry, the same key goes again. The server keeps a small table of key → response pairs for some retention window. When a request arrives, it looks up the key first.

Two important properties:

The key is generated once per logical operation, not per HTTP attempt. Every retry of the same charge sends the same key. Otherwise the server has nothing to deduplicate on.
The server stores the response, not just the fact that the key was seen. On retry, the client gets back the same answer it would have gotten the first time. If the server only remembered the key, it could refuse to repeat the work, but it would have nothing to send back, and the client would never learn the charge ID.
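Both properties fit in a small sketch. It assumes an in-memory dictionary as the key store and a simulated lost response on the first attempt; ChargeServer, charge_with_retry, and the response shape are hypothetical, chosen only to make the example runnable.

```python
import uuid


class ChargeServer:
    """Deduplicates on the idempotency key and replays the saved response."""

    def __init__(self) -> None:
        self.charges = []
        self.responses = {}  # idempotency key -> saved response

    def charge(self, key: str, customer: str, amount_cents: int) -> dict:
        if key in self.responses:
            # Seen before: return the stored answer, do no work.
            return self.responses[key]
        charge_id = f"ch_{len(self.charges) + 1}"
        self.charges.append((charge_id, customer, amount_cents))
        response = {"charge_id": charge_id, "amount_cents": amount_cents}
        self.responses[key] = response
        return response


def charge_with_retry(server, customer, amount_cents, attempts=3):
    # One key per logical operation, generated before the first attempt.
    key = str(uuid.uuid4())
    for attempt in range(attempts):
        response = server.charge(key, customer, amount_cents)
        if attempt == 0:
            # Simulate a lost response: the client never sees this reply
            # and retries with the SAME key.
            continue
        return response
    raise RuntimeError("all attempts failed")


server = ChargeServer()
response = charge_with_retry(server, "cust_123", 5000)

print(len(server.charges))    # 1: the retry did not charge again
print(response["charge_id"])  # ch_1: the client still learns the charge ID
```

Moving the uuid4 call inside the loop would give every attempt a fresh key, and the server would charge twice again.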

Operations that are idempotent for free

Some operations are safe to retry without any key, because doing the work twice has the same effect as doing it once:

Reading something (the HTTP GET verb). Reading the same resource N times leaves no trace.
Replacing a value (the HTTP PUT verb). "Set the user's email to alice@x.com" is the same after one call or ten.
Deleting a resource (the HTTP DELETE verb). After the first delete, further deletes find nothing to remove.
Setting a value to a constant, in any context. x = 5 gives the same result whether it runs once or a hundred times in a row.

The HTTP spec marks GET, PUT, and DELETE as idempotent for exactly this reason. Most well-designed APIs follow it.

Operations that need keys

These are the ones where each call has its own real-world side effect:

Creating something new (HTTP POST to a collection). Each call adds a new row.
Charging a card, sending an email, publishing an event, calling someone's phone. Each call is a separate event in the world.
Database inserts, counter increments, log appends, queue pushes. Each call adds.

For these, you need the idempotency key approach. Without it, the first network blip in production produces a duplicate-charge incident.

What the server has to do

The server-side recipe is short. On every incoming request that carries a key:

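1. Look up the key. If it is already in the store, return the saved response and stop. No work happens.
2. Acquire a short lock on the key so two retries that arrive at the same time do not both pass step 1 and both do the work. Once the lock is held, look up the key again: if another request saved a response while this one was waiting, return that response instead of running the operation.
3. Run the actual operation.
4. Save the response under the key, with a TTL.
5. Release the lock and return the response.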

Step 2 is the part most people miss. Without the lock, two retries can race past the "have I seen this key?" check before either of them has saved a response, and both end up doing the work. The lock serialises them so only one runs and the other waits and reads the cached answer.
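A minimal in-process sketch of the recipe, assuming a single server process with threads. The class and method names are invented, the TTL on saved responses is left out to keep the focus on the lock, and a real deployment would need a store and a lock shared across processes. The lookup is repeated after the lock is acquired, which is how the waiting request ends up reading the cached answer rather than running the operation itself.

```python
import threading
import time


class IdempotentHandler:
    def __init__(self, operation) -> None:
        self._operation = operation
        self._responses = {}            # key -> saved response
        self._locks = {}                # key -> per-key lock
        self._guard = threading.Lock()  # protects the _locks dict

    def _lock_for(self, key: str) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def handle(self, key: str, payload: dict) -> dict:
        # Step 1: fast path, key already seen.
        if key in self._responses:
            return self._responses[key]
        # Step 2: serialise concurrent requests for the same key.
        with self._lock_for(key):
            # Look again: another request may have finished while we waited.
            if key in self._responses:
                return self._responses[key]
            response = self._operation(payload)  # Step 3: do the work.
            self._responses[key] = response      # Step 4: save (TTL omitted).
            return response                      # Step 5: lock released on exit.


calls = []


def slow_charge(payload: dict) -> dict:
    time.sleep(0.05)  # long enough for both requests to pass step 1
    calls.append(payload)
    return {"charge_id": f"ch_{len(calls)}"}


handler = IdempotentHandler(slow_charge)
results = []


def worker() -> None:
    results.append(handler.handle("key-1", {"amount_cents": 5000}))


threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(calls))  # 1: the operation ran once
print(results)     # both requests received the same response
```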

Stripe's API documents this pattern publicly; most production payment systems implement something close to it.

How long to keep the keys

Keys are stored with a time-to-live. Pick it to cover the longest realistic retry window:

A day or so for synchronous APIs, where clients retry within seconds or minutes.
A week or more for async or queue-driven workflows, where a job might be redelivered hours or days later.
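A time-to-live can be sketched as an expiry timestamp stored next to each response. The class below is illustrative only: TTLResponseStore is a made-up name, entries are expired lazily on read, and the clock is injectable so the example can move time forward without sleeping. Many key-value stores offer expiry as a built-in feature, which would replace this code entirely.

```python
import time


class TTLResponseStore:
    """Keeps key -> response pairs until their time-to-live runs out."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, response)

    def put(self, key: str, response: dict) -> None:
        self._entries[key] = (self._clock() + self._ttl, response)

    def get(self, key: str):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            # Expired: forget the key, so a later request counts as new.
            del self._entries[key]
            return None
        return response


ONE_DAY = 24 * 60 * 60

now = [0.0]  # fake clock, advanced by hand
store = TTLResponseStore(ttl_seconds=ONE_DAY, clock=lambda: now[0])

store.put("key-1", {"charge_id": "ch_1"})
within_window = store.get("key-1")

now[0] += ONE_DAY + 1
after_window = store.get("key-1")

print(within_window)  # {'charge_id': 'ch_1'}
print(after_window)   # None
```

A retry that arrives after the window has closed is treated as a brand-new request, which is why the TTL has to cover the longest retry delay the system can produce.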

The storage cost is small. A key plus its cached response is typically on the order of a few hundred bytes, so the key store is rarely the bottleneck.

The short version

Use an idempotency key on every state-changing call that might be retried: charges, message sends, resource creation. Generate one key per logical operation and reuse it on every retry, never a fresh one per HTTP attempt. Lean on HTTP semantics where you can: PUT and DELETE are usually idempotent without any extra work, so the key infrastructure is only needed for POST and for operations that behave like a POST.

Get this right and retries are boring. Skip it and the first lost response in production becomes an incident.

