System Design: E-Commerce Flash Sales (10M Users, Coupon System, One-Per-User Enforcement)
Goal: Design a flash sale platform that handles 10 million concurrent users competing for 100K discounted items and 500K limited coupons, with zero overselling, one-coupon-per-user enforcement, and graceful degradation under extreme load.
Scale: 10M concurrent users, 100K inventory items, 500K coupon pool, 50K orders/min at peak, sub-200ms checkout latency.
Mental model -- four ideas that make everything else click:
- Inventory = atomic counter. Decrement succeeds or fails in a single Valkey Lua eval. No read-then-write race.
- Coupon pool = pre-loaded list. LPOP is atomic. No two users get the same code.
- Queue = admission controller. 10M users enter, 1000 per batch reach backend.
- Checkout = saga. Reserve, apply, charge, confirm. Any failure triggers reverse.
TL;DR: Valkey (Redis-compatible) provides the atomic primitives for inventory decrement, coupon pool claims, and user deduplication. PostgreSQL serves as the durable ledger. Kafka decouples order processing. A CDN-served virtual queue shapes traffic before it reaches backend services. Temporal orchestrates the checkout saga with compensating transactions for coupon rollback on payment failure.
System invariant: The queue shapes traffic. Valkey provides atomics. PostgreSQL provides durability. Kafka decouples processing. Temporal orchestrates compensation. Each can fail independently without cascading into total outage.
The Three Problems
Every design decision in this system comes from just three constraints.
Inventory overselling. 50,000 users click "Buy Now" on the same 500-unit item within 200 milliseconds of T-0. A naive SELECT quantity ... UPDATE quantity = quantity - 1 will oversell: two threads read quantity=1, both decrement, and the system ships inventory it does not have.
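To make the race concrete, here is a minimal Python model of the atomic check-and-decrement that the Valkey Lua eval performs in one indivisible step. The key name follows the inv:{sale_id}:{sku_id} pattern used later in the data model; the function is a sketch of the semantics, not the production script.

```python
def atomic_decrement(store: dict, key: str, qty: int) -> bool:
    """Check and decrement as one indivisible step -- no read-then-write gap."""
    remaining = store.get(key, 0)
    if remaining < qty:
        return False              # sold out: no partial decrement
    store[key] = remaining - qty
    return True

inventory = {"inv:sale_abc123:SKU-LAPTOP-001": 1}
first = atomic_decrement(inventory, "inv:sale_abc123:SKU-LAPTOP-001", 1)
second = atomic_decrement(inventory, "inv:sale_abc123:SKU-LAPTOP-001", 1)
# first is True, second is False -- the naive read-then-write lets both pass
```

Because the check and the write happen inside one atomic operation, the "two threads both read quantity=1" interleaving cannot occur.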
Coupon double-claiming. 500,000 coupons, one per user. A motivated user opens five tabs or scripts a bot. Without atomic uniqueness enforcement, a single user claims dozens of coupons before the system processes the first claim.
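The fix is an atomic first-write-wins guard plus an atomic pool pop. A small Python model of those semantics (the dict membership check stands in for SET NX rejecting repeat claimers, the pop for LPOP handing out each code exactly once; codes and user ids are illustrative):

```python
def claim_coupon(pool: list, claims: dict, user_id: str):
    """Model of the claim path: SET NX dedup guard, then LPOP from the pool."""
    if user_id in claims:        # SET NX failed: this user already claimed
        return None
    if not pool:                 # LPOP returned nil: pool exhausted
        return None
    code = pool.pop(0)           # LPOP is atomic: no two users get the same code
    claims[user_id] = code
    return code

pool = ["SUMMER-A7K2M", "SUMMER-B9X4Q"]
claims = {}
first = claim_coupon(pool, claims, "user_1")
second_tab = claim_coupon(pool, claims, "user_1")   # extra tab or bot: rejected
# first == "SUMMER-A7K2M", second_tab is None
```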
Traffic spike at sale start. 10 million users hit the origin at T-0. Database connection pools exhaust. Valkey pipelines back up. The checkout service returns 503s. Users rage-refresh, amplifying the problem. Without traffic shaping, the system DDoSes itself.
Scale Numbers
| Metric | Value |
|---|---|
| Concurrent users at T-0 | 10,000,000 |
| Total inventory items (across all SKUs) | 100,000 |
| Coupon pool size | 500,000 |
| Peak orders per minute | 50,000 |
| Peak checkout requests per second | ~833 |
| Sale duration | 2 hours |
| Expected total orders | ~200,000 |
| Unique SKUs on sale | 500 |
| Average units per SKU | 200 |
| Coupon types | 4 (percent-off, fixed, BOGO, free shipping) |
Requirements
Functional Requirements
| # | Requirement | Description |
|---|---|---|
| FR-1 | Flash Sale Creation | Admin creates a sale with start time, end time, list of SKUs, and inventory per SKU |
| FR-2 | Inventory Management | System tracks available inventory per SKU with atomic decrement on purchase |
| FR-3 | Sale Countdown | Users see a real-time countdown to sale start; page activates at T-0 |
| FR-4 | Virtual Queue | Users entering at T-0 are placed in a virtual queue with estimated wait time |
| FR-5 | Add to Cart | User selects a sale item and adds it to their flash-sale cart (temporary hold) |
| FR-6 | Coupon Creation | Admin creates coupon campaigns with type (percent-off, fixed, BOGO, free shipping), pool size, and rules |
| FR-7 | Coupon Claiming | User claims a coupon from an available pool; claimed coupons are deducted atomically |
| FR-8 | One-Per-User Enforcement | Each user can claim at most one coupon per campaign, enforced across all devices and sessions |
| FR-9 | Coupon Application | User applies a claimed coupon at checkout; system validates eligibility and calculates discount |
| FR-10 | Coupon Stacking Rules | System enforces which coupon types can combine, priority order, and maximum discount caps |
| FR-11 | Checkout | User completes purchase: inventory reserved, coupon applied, payment processed |
| FR-12 | Payment Processing | Async payment via saga pattern; inventory and coupon released on failure |
| FR-13 | Order Tracking | User sees order status (pending, confirmed, shipped, delivered) |
| FR-14 | Sale Analytics | Real-time dashboard showing inventory levels, orders/sec, coupon redemption rates |
| FR-15 | Notification | User receives confirmation via email/push when order is confirmed or coupon is about to expire |
| FR-16 | Idempotent Checkout | Duplicate checkout requests (retries, double-clicks, network retries) must not create duplicate orders |
| FR-17 | Reservation Timeout | If payment is not completed within 10 minutes of inventory reservation, the reservation is released automatically |
Non-Functional Requirements
| Requirement | SLO / Target | Rationale |
|---|---|---|
| Throughput | 50,000 orders/min at peak | Derived from 10M users, ~2% conversion, concentrated in first 15 minutes |
| Checkout Latency (p99) | < 200ms | Users abandon after 3 seconds; checkout must be fast to prevent retries |
| Inventory Accuracy | Zero overselling | Legal and financial liability; refunds cost 5x the item margin |
| Coupon Uniqueness | Zero double-claims per user | One coupon per user per campaign, no exceptions |
| Availability | 99.95% during sale window | 2-hour sale; even 0.05% downtime = 3.6 seconds of lost orders |
| Queue Fairness | FIFO within 1-second cohorts | Users arriving at the same second get randomized within that cohort |
| Data Durability | Zero order loss | Every confirmed order must be durable within 100ms of confirmation |
| Coupon Rollback Latency | < 5 seconds | Failed payment must return coupon to pool quickly so others can claim |
| CDN Cache Hit Rate | > 99% for sale page | Origin must not serve the static sale page to 10M users |
| Horizontal Scalability | Linear up to 20M users | Architecture must scale by adding nodes, not by vertical scaling |
| Idempotency | All write operations idempotent under retries | Network failures and user retries are guaranteed during flash sale chaos |
| Graceful Degradation | System sheds load when downstream services degrade | Payment gateway slowdown must not cascade into inventory or coupon failures |
| Consistency Model | Strong consistency for inventory and coupons; eventual consistency for analytics | Overselling and double-claiming are correctness violations; reporting lag is acceptable |
Scale Estimation
These requirements define what the system must do. The scale numbers below reveal how hard each requirement becomes at 10M concurrent users.
Traffic Estimates
| Metric | Calculation | Result |
|---|---|---|
| Users at T-0 | Given | 10,000,000 |
| Page views in first minute | 10M users x 3 refreshes | 30,000,000 |
| CDN requests/sec (first minute) | 30M / 60 | 500,000 req/sec |
| Origin requests/sec (after queue) | 500 admitted/sec x ~10 requests each | 5,000 req/sec |
| Checkout requests/sec (peak) | 50K orders/min / 60 | ~833 req/sec |
| Valkey ops/sec (inventory) | 833 checkouts x 3 ops each | ~2,500 ops/sec |
| Valkey ops/sec (coupons) | 833 checkouts x 4 ops each | ~3,300 ops/sec |
| Valkey ops/sec (queue) | 5,000 admits + 10,000 status polls | ~15,000 ops/sec |
| Total Valkey ops/sec | Inventory + coupons + queue + cache | ~25,000 ops/sec |
Storage Estimates
| Data | Calculation | Size |
|---|---|---|
| Inventory records (Valkey) | 500 SKUs x 100 bytes | 50 KB |
| Coupon pool (Valkey list) | 500K coupons x 64 bytes | 32 MB |
| User claim records (Valkey set) | 500K claims x 80 bytes | 40 MB |
| Queue tokens (Valkey sorted set) | 10M tokens x 100 bytes | 1 GB |
| Orders (PostgreSQL) | 200K orders x 2 KB | 400 MB |
| Coupon claims (PostgreSQL) | 500K claims x 200 bytes | 100 MB |
| Kafka events | 200K orders x 5 events x 1 KB | 1 GB |
Valkey Cluster Sizing
| Requirement | Sizing |
|---|---|
| Total memory needed | ~1.5 GB (with overhead) |
| Ops/sec needed | ~25,000 |
| Single Valkey node capacity | ~200,000 ops/sec, 64 GB RAM |
| Minimum nodes for HA | 3 primaries + 3 replicas |
| Recommended | 3 primaries + 3 replicas (massive headroom) |
The bottleneck is not Valkey throughput or memory. It is hot-key contention on popular SKUs: a single SKU key is hit by every client and lives on a single shard. The inventory sharding strategy in the Atomic Inventory Control section addresses this.
PostgreSQL Sizing
| Metric | Value |
|---|---|
| Peak write rate | ~833 orders/sec + ~833 coupon claims/sec |
| Connection pool size | 200 connections (pgbouncer) |
| Write latency (p99) | < 10ms |
| Required IOPS | ~5,000 |
| Instance type | db.r6g.2xlarge (8 vCPU, 64 GB) |
Why Naive Approaches Fail
The scale estimates above expose why textbook solutions collapse under flash sale load. Several natural approaches fail under these constraints.
Database row locks for inventory. Even at ~833 checkouts/sec, funneling every purchase of a hot SKU through a PostgreSQL row lock causes lock contention, connection pool exhaustion, and cascading timeouts. A single row lock becomes a serialization bottleneck -- every checkout queues behind the previous one.
Application-level coupon uniqueness checks. In-memory sets without atomic operations will race. Two requests checking if user not in claimed_set simultaneously will both pass. By the time the second request writes to the set, the first has already claimed a code. The user ends up with two coupons.
Serving the sale page from origin. 10M users hitting Next.js SSR at T-0 will overwhelm compute. The sale landing page must be CDN-cached and static. Even the waiting room page must come from CDN -- if the queue page itself requires origin, the problem has already cascaded.
Eventual consistency for inventory counts. "Reconcile later" means overselling now. Inventory decrement must be atomic and strongly consistent at the moment of reservation, not eventually consistent after a replication lag.
Skipping the virtual queue. Letting all 10M users through to the checkout service simultaneously turns a controlled sale into a DDoS on internal infrastructure. Without admission control, the system fails under its own traffic.
Synchronous payment processing. Payment gateways have variable latency (200ms to 5s). Blocking on payment while holding inventory locks wastes capacity and creates cascading timeouts across the entire checkout path.
Ignoring coupon rollback. If payment fails after a coupon is claimed, that coupon must be returned to the pool. Otherwise coupons leak -- users see "sold out" for coupons that were never actually used.
Treating all SKUs equally. A doorbuster item with 50 units will have 100x the contention of a standard sale item with 5,000 units. Hot-key mitigation is essential for the most popular SKUs.
Allowing unconstrained coupon stacking. Without server-side stacking rules, users will combine a 50% off coupon with a BOGO deal and a free shipping code, buying a $200 item for $0. Stacking rules must be enforced atomically at checkout.
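A sketch of server-side stacking enforcement, assuming each coupon carries the type, discount value, max_discount_cap, and stacking_priority fields from the campaign schema. The rule engine here is illustrative, not the full implementation:

```python
def apply_coupons(subtotal: float, coupons: list) -> float:
    """Apply coupons in stacking-priority order; each respects its own cap,
    and the total discount can never exceed the subtotal."""
    discount = 0.0
    for c in sorted(coupons, key=lambda c: c["stacking_priority"]):
        if c["type"] == "percent_off":
            d = subtotal * c["value"] / 100
        elif c["type"] == "fixed_amount":
            d = c["value"]
        else:                      # bogo / free_shipping priced elsewhere
            d = 0.0
        cap = c.get("max_discount_cap")
        if cap is not None:
            d = min(d, cap)        # per-coupon maximum discount cap
        discount += d
    return round(min(discount, subtotal), 2)

# 15% off $649.99, capped at $100 -> $97.50 (matches the checkout example)
apply_coupons(649.99, [{"type": "percent_off", "value": 15,
                        "max_discount_cap": 100.0, "stacking_priority": 0}])
```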
Architecture Overview
These failure modes shape every design decision. The architecture must absorb 500K CDN requests/sec at T-0, funnel 10M users through the virtual queue, and sustain 833 checkout requests/sec while maintaining exactly-once inventory semantics.
The architecture separates the hot path (Valkey atomics during the sale) from the warm path (async persistence to PostgreSQL via Kafka) and the cold path (post-sale reconciliation).
Write Path (Checkout Flow)
The write path has two critical nodes. The Virtual Queue is the admission controller -- it prevents 10M users from hitting backend services simultaneously, releasing 1,000 users every 2 seconds. The Temporal Workflow is the saga orchestrator -- it executes reserve-inventory, apply-coupon, charge-payment, confirm-order in sequence, and runs compensating transactions in reverse on failure.
Checkout Saga Path
Each step has a compensating action. If payment fails at step 3, the coupon is returned to the pool and inventory is released. No resource is permanently consumed by a failed checkout.
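The compensation ordering can be sketched in a few lines of Python. This is a toy stand-in for the Temporal workflow, not its API: actions run forward, and on failure the compensations of completed steps run in reverse.

```python
def run_checkout_saga(steps):
    """Run steps forward; on failure, run compensations of completed steps in reverse."""
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            for _, comp in reversed(completed):
                comp()                 # undo in reverse order of completion
            return "cancelled"
        completed.append((name, compensate))
    return "confirmed"

def fail_payment():
    raise RuntimeError("payment declined")

log = []
steps = [
    ("reserve_inventory", lambda: log.append("reserved"),
     lambda: log.append("inventory_released")),
    ("apply_coupon", lambda: log.append("coupon_applied"),
     lambda: log.append("coupon_returned")),
    ("charge_payment", fail_payment, lambda: None),
    ("confirm_order", lambda: log.append("confirmed"), lambda: None),
]
result = run_checkout_saga(steps)
# result == "cancelled"; the coupon is returned before inventory is released
```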
Read Path (Status and Browsing)
The read path is entirely separate from the write path. Sale item browsing reads from Valkey cache. Order status reads from PostgreSQL. Queue position reads from the Valkey sorted set. No read operation touches the checkout saga or its state.
The system has three independent paths. The write path executes checkouts. The read path serves browsing and status queries. The control path (virtual queue) regulates how many users reach the write path at any given moment. These paths share storage but never share execution.
If only one thing stays in memory from this article: the queue controls how many users enter the system. Valkey handles all critical atomic operations. PostgreSQL is the durable system of record. Kafka decouples writes from persistence. Temporal ensures failures do not leak state. Everything else builds on these five guarantees.
API Design
With the high-level architecture established, the API contract defines every interaction between clients and the system.
Versioning Strategy
All endpoints use URL-path versioning under /api/v1/. This makes the version explicit in every request and simplifies routing at the API Gateway layer. The deprecation policy enforces a 6-month sunset window: when /api/v2/ of an endpoint ships, /api/v1/ continues to function for 6 months with a Sunset response header indicating the retirement date. For experimental endpoints (beta features like VIP queue tiers), the Accept-Version request header selects the experimental variant without polluting the URL namespace.
Sale Endpoints
List Active Sales
GET /api/v1/sales?status=active
Response 200:
{
"sales": [
{
"id": "sale_abc123",
"name": "Summer Flash Sale 2026",
"start_time": "2026-07-01T12:00:00Z",
"end_time": "2026-07-01T14:00:00Z",
"status": "active",
"items_count": 500,
"queue_enabled": true
}
]
}
Get Sale Items
GET /api/v1/sales/{sale_id}/items?cursor=eyJza3UiOiJTS1UtMDIwIn0&limit=20
Response 200:
{
"items": [
{
"sku_id": "SKU-LAPTOP-001",
"product_name": "UltraBook Pro 14",
"original_price": 1299.99,
"sale_price": 649.99,
"available": true,
"remaining_pct": "low"
}
],
"pagination": {
"cursor": "eyJza3UiOiJTS1UtMDQwIn0",
"limit": 20,
"has_more": true
}
}
The system returns remaining_pct as a bucket ("high", "medium", "low") rather than an exact count. Showing exact counts creates herding behavior where users rush the items with the lowest remaining inventory.
Pagination uses cursor-based traversal. Offset-based pagination breaks under concurrent inserts and deletions during a live sale -- items shift between pages. The cursor encodes the last seen sku_id, making each page request stable regardless of concurrent modifications.
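One way to build such a cursor, assuming it is simply the URL-safe base64 of a small JSON object holding the last sku (an assumption, but consistent with the example cursors above):

```python
import base64
import json

def encode_cursor(last_sku: str) -> str:
    """Encode the last-seen SKU as an opaque, URL-safe cursor."""
    raw = json.dumps({"sku": last_sku}, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).decode().rstrip("=")

def decode_cursor(cursor: str) -> str:
    """Recover the last-seen SKU; restore the stripped base64 padding first."""
    padded = cursor + "=" * (-len(cursor) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))["sku"]

encode_cursor("SKU-020")   # -> "eyJza3UiOiJTS1UtMDIwIn0", the cursor seen above
```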
Cart Endpoints
Add to Cart
POST /api/v1/cart
Authorization: Bearer {token}
Idempotency-Key: {uuid}
{
"sale_id": "sale_abc123",
"sku_id": "SKU-LAPTOP-001",
"quantity": 1
}
Response 201:
{
"cart_id": "cart_xyz789",
"sku_id": "SKU-LAPTOP-001",
"quantity": 1,
"sale_price": 649.99,
"hold_expires_at": "2026-07-01T12:10:00Z",
"message": "Item held for 10 minutes"
}
Response 409:
{
"error": "SOLD_OUT",
"message": "This item is no longer available"
}
Coupon Endpoints
Claim Coupon
POST /api/v1/coupons/claim
Authorization: Bearer {token}
Idempotency-Key: {uuid}
{
"campaign_id": "camp_summer50"
}
Response 200:
{
"coupon_code": "SUMMER-A7K2M",
"campaign_id": "camp_summer50",
"type": "percent_off",
"discount_value": 15,
"max_discount_cap": 100.00,
"min_cart_value": 50.00,
"valid_until": "2026-07-01T14:00:00Z"
}
Response 409:
{
"error": "ALREADY_CLAIMED",
"message": "You have already claimed a coupon from this campaign"
}
Response 410:
{
"error": "POOL_EXHAUSTED",
"message": "All coupons from this campaign have been claimed"
}
Apply Coupon to Cart
POST /api/v1/cart/apply-coupon
Authorization: Bearer {token}
Idempotency-Key: {uuid}
{
"cart_id": "cart_xyz789",
"coupon_code": "SUMMER-A7K2M"
}
Response 200:
{
"cart_id": "cart_xyz789",
"subtotal": 649.99,
"coupon_code": "SUMMER-A7K2M",
"coupon_type": "percent_off",
"discount_amount": 97.50,
"shipping_cost": 0.00,
"total": 552.49,
"applied_coupons": [
{
"code": "SUMMER-A7K2M",
"type": "percent_off",
"discount": 97.50,
"description": "15% off (max $100)"
}
]
}
Response 400:
{
"error": "MIN_CART_NOT_MET",
"message": "Cart subtotal must be at least $50.00 to use this coupon"
}
Checkout Endpoints
Initiate Checkout
POST /api/v1/checkout
Authorization: Bearer {token}
Idempotency-Key: {uuid}
{
"cart_id": "cart_xyz789",
"payment_method": "credit_card",
"payment_token": "tok_visa_4242"
}
Response 202:
{
"order_id": "ord_def456",
"status": "pending",
"saga_workflow_id": "wf_checkout_abc",
"message": "Order is being processed"
}
Get Order Status
GET /api/v1/orders/{order_id}
Authorization: Bearer {token}
Response 200:
{
"order_id": "ord_def456",
"status": "confirmed",
"sku_id": "SKU-LAPTOP-001",
"quantity": 1,
"subtotal": 649.99,
"coupon_discount": 97.50,
"total": 552.49,
"payment_status": "captured",
"created_at": "2026-07-01T12:05:30Z",
"confirmed_at": "2026-07-01T12:05:32Z"
}
Queue Endpoints
Join Queue
POST /api/v1/queue/join
Authorization: Bearer {token}
Idempotency-Key: {uuid}
{
"sale_id": "sale_abc123"
}
Response 200:
{
"queue_token": "qt_abc123xyz",
"position": 45231,
"estimated_wait_seconds": 90,
"websocket_url": "wss://ws.example.com/queue/qt_abc123xyz"
}
Check Queue Status
GET /api/v1/queue/status?token=qt_abc123xyz
Response 200:
{
"queue_token": "qt_abc123xyz",
"status": "waiting",
"position": 12045,
"estimated_wait_seconds": 24,
"admitted_so_far": 33186
}
// When admitted:
{
"queue_token": "qt_abc123xyz",
"status": "admitted",
"access_token": "at_secure_xyz",
"expires_at": "2026-07-01T12:15:00Z",
"message": "You may now browse and purchase"
}
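The estimated wait falls out of the admission rate. With the default queue_batch_size of 1,000 every 2,000 ms (500 users/sec), the numbers in the responses above are reproducible:

```python
ADMISSION_RATE = 1000 / 2.0        # queue_batch_size / queue_interval -> 500 users/sec

def estimated_wait_seconds(position: int) -> int:
    """Seconds until admission at the configured batch rate."""
    return int(position / ADMISSION_RATE)

estimated_wait_seconds(45231)            # -> 90, as in the join response
estimated_wait_seconds(45231 - 33186)    # -> 24, once 33,186 users are admitted
```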
Error Contract
All error responses follow a structured JSON format with machine-readable codes. Client applications switch on error_code, not on message strings.
{
"error_code": "INVENTORY_EXHAUSTED",
"message": "SKU-LAPTOP-001 is sold out",
"details": {
"sku_id": "SKU-LAPTOP-001",
"sale_id": "sale_abc123"
},
"request_id": "req_abc123",
"timestamp": "2026-07-01T12:05:30.123Z"
}
| Error Code | HTTP Status | Trigger |
|---|---|---|
| INVENTORY_EXHAUSTED | 409 | Inventory decrement returns 0 available |
| COUPON_ALREADY_CLAIMED | 409 | SET NX returns nil (user already claimed) |
| SALE_NOT_ACTIVE | 403 | Sale status is not 'active' at request time |
| QUEUE_POSITION_EXPIRED | 410 | Admitted user's access token has expired |
| PAYMENT_DECLINED | 402 | Payment gateway returns a decline code |
Rate Limiting
Per-endpoint rate limits protect backend services from abuse and amplification attacks. Limits are enforced at the API Gateway and communicated via standard headers.
| Endpoint | Limit | Window |
|---|---|---|
| POST /queue/join | 3 requests | 60 seconds per user |
| GET /queue/status | 30 requests | 60 seconds per user |
| POST /coupons/claim | 5 requests | 60 seconds per user |
| POST /checkout | 5 requests | 60 seconds per user |
| GET /sales/{id}/items | 60 requests | 60 seconds per user |
Every response includes rate limit headers:
X-RateLimit-Limit: 5
X-RateLimit-Remaining: 3
X-RateLimit-Reset: 1719835200
When a limit is exceeded, the system returns HTTP 429 with a Retry-After header indicating the number of seconds until the limit resets.
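A reference model of the fixed-window counter the gateway enforces. In Valkey this is an INCR plus EXPIRE per user per endpoint; the class below simulates those semantics deterministically (the explicit now parameter replaces the clock so the behavior is testable):

```python
class FixedWindowLimiter:
    """Deterministic model of the per-user counter (INCR + EXPIRE in Valkey)."""

    def __init__(self, limit: int, window_s: int):
        self.limit = limit
        self.window_s = window_s
        self.counts = {}               # key -> (window_start, count)

    def allow(self, key: str, now: float) -> bool:
        start, count = self.counts.get(key, (now, 0))
        if now - start >= self.window_s:
            start, count = now, 0      # window expired: counter resets
        if count >= self.limit:
            return False               # caller gets HTTP 429 + Retry-After
        self.counts[key] = (start, count + 1)
        return True
```

With limit=5 and window_s=60, a sixth checkout attempt within the window is denied, and attempts are allowed again once the window rolls over.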
Authorization
JWT tokens carry scoped permissions. Each endpoint requires a specific scope, enforced at the API Gateway before the request reaches the backend service.
| Scope | Grants Access To |
|---|---|
| sale:browse | List sales, get sale items, get queue status |
| cart:manage | Add to cart, remove from cart, view cart |
| coupon:claim | Claim coupon, apply coupon to cart |
| checkout:submit | Initiate checkout, view order status |
| admin:sale-manage | Create/update sales, manage coupon campaigns, view analytics |
Tokens issued at queue admission carry sale:browse + cart:manage + coupon:claim + checkout:submit. Admin tokens are issued via a separate OAuth flow with MFA.
Data Model
The API contract reveals what data the system must persist and query. The schema below supports every endpoint above.
PostgreSQL Schemas
Sales Table
CREATE TABLE flash_sales (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR(255) NOT NULL,
description TEXT,
start_time TIMESTAMPTZ NOT NULL,
end_time TIMESTAMPTZ NOT NULL,
status VARCHAR(20) DEFAULT 'scheduled'
CHECK (status IN ('scheduled', 'active', 'ended', 'cancelled')),
max_orders_per_user INT DEFAULT 1,
queue_enabled BOOLEAN DEFAULT true,
queue_batch_size INT DEFAULT 1000,
queue_interval_ms INT DEFAULT 2000,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()
);
-- Filter by status for active sale lookups (most common query)
CREATE INDEX idx_flash_sales_status ON flash_sales(status);
-- Range scan for upcoming sales sorted by start time
CREATE INDEX idx_flash_sales_start_time ON flash_sales(start_time);
Inventory Table
CREATE TABLE sale_inventory (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
sale_id UUID NOT NULL REFERENCES flash_sales(id),
sku_id VARCHAR(50) NOT NULL,
product_name VARCHAR(255) NOT NULL,
original_price DECIMAL(10,2) NOT NULL,
sale_price DECIMAL(10,2) NOT NULL,
total_quantity INT NOT NULL,
sold_quantity INT DEFAULT 0,
reserved_quantity INT DEFAULT 0,
status VARCHAR(20) DEFAULT 'available'
CHECK (status IN ('available', 'sold_out', 'disabled')),
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now(),
UNIQUE(sale_id, sku_id)
);
-- Partition pruning: all inventory queries filter by sale_id
CREATE INDEX idx_sale_inventory_sale_id ON sale_inventory(sale_id);
-- Lookup items by SKU and availability status for catalog browsing
CREATE INDEX idx_sale_inventory_sku_status ON sale_inventory(sku_id, status);
Coupon Campaigns Table
CREATE TABLE coupon_campaigns (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
sale_id UUID REFERENCES flash_sales(id),
name VARCHAR(255) NOT NULL,
coupon_type VARCHAR(20) NOT NULL
CHECK (coupon_type IN ('percent_off', 'fixed_amount', 'bogo', 'free_shipping')),
discount_value DECIMAL(10,2),
max_discount_cap DECIMAL(10,2),
min_cart_value DECIMAL(10,2) DEFAULT 0,
pool_size INT NOT NULL,
claimed_count INT DEFAULT 0,
one_per_user BOOLEAN DEFAULT true,
stackable BOOLEAN DEFAULT false,
stacking_priority INT DEFAULT 0,
applies_to VARCHAR(20) DEFAULT 'all'
CHECK (applies_to IN ('all', 'specific_skus', 'category')),
applicable_skus TEXT[],
valid_from TIMESTAMPTZ NOT NULL,
valid_until TIMESTAMPTZ NOT NULL,
status VARCHAR(20) DEFAULT 'active'
CHECK (status IN ('active', 'exhausted', 'expired', 'disabled')),
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()
);
-- Campaign lookup by sale for sale-specific coupon listings
CREATE INDEX idx_coupon_campaigns_sale ON coupon_campaigns(sale_id);
-- Filter active/exhausted campaigns for claim eligibility checks
CREATE INDEX idx_coupon_campaigns_status ON coupon_campaigns(status);
-- Filter by type for stacking rule evaluation
CREATE INDEX idx_coupon_campaigns_type ON coupon_campaigns(coupon_type);
Coupon Codes Table
CREATE TABLE coupon_codes (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
campaign_id UUID NOT NULL REFERENCES coupon_campaigns(id),
code VARCHAR(20) NOT NULL UNIQUE,
status VARCHAR(20) DEFAULT 'available'
CHECK (status IN ('available', 'claimed', 'redeemed', 'expired', 'returned')),
claimed_by UUID REFERENCES users(id),
claimed_at TIMESTAMPTZ,
redeemed_at TIMESTAMPTZ,
order_id UUID,
created_at TIMESTAMPTZ DEFAULT now()
);
-- Retrieve available codes per campaign for pool management
CREATE INDEX idx_coupon_codes_campaign ON coupon_codes(campaign_id, status);
-- Fast lookup by code string during claim validation
CREATE INDEX idx_coupon_codes_code ON coupon_codes(code);
-- List all codes claimed by a user for account page
CREATE INDEX idx_coupon_codes_claimed_by ON coupon_codes(claimed_by);
Coupon Claims Table (Dedup Ledger)
CREATE TABLE coupon_claims (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES users(id),
campaign_id UUID NOT NULL REFERENCES coupon_campaigns(id),
coupon_code_id UUID NOT NULL REFERENCES coupon_codes(id),
code VARCHAR(20) NOT NULL,
status VARCHAR(20) DEFAULT 'claimed'
CHECK (status IN ('claimed', 'applied', 'redeemed', 'rolled_back')),
claimed_at TIMESTAMPTZ DEFAULT now(),
applied_at TIMESTAMPTZ,
rolled_back_at TIMESTAMPTZ,
UNIQUE(user_id, campaign_id) -- THE critical constraint
);
-- Lookup all claims by a user for dedup and account page
CREATE INDEX idx_coupon_claims_user ON coupon_claims(user_id);
-- Aggregate claims per campaign for pool size monitoring
CREATE INDEX idx_coupon_claims_campaign ON coupon_claims(campaign_id);
-- Filter by status for rollback processing and reconciliation
CREATE INDEX idx_coupon_claims_status ON coupon_claims(status);
Orders Table
CREATE TABLE orders (
id UUID NOT NULL DEFAULT gen_random_uuid(),
sale_id UUID NOT NULL REFERENCES flash_sales(id),
user_id UUID NOT NULL REFERENCES users(id),
sku_id VARCHAR(50) NOT NULL,
quantity INT NOT NULL DEFAULT 1,
unit_price DECIMAL(10,2) NOT NULL,
subtotal DECIMAL(10,2) NOT NULL,
coupon_code VARCHAR(20),
coupon_discount DECIMAL(10,2) DEFAULT 0,
shipping_cost DECIMAL(10,2) DEFAULT 0,
total_amount DECIMAL(10,2) NOT NULL,
status VARCHAR(20) DEFAULT 'pending'
CHECK (status IN (
'pending', 'inventory_reserved', 'coupon_applied',
'payment_processing', 'payment_failed',
'confirmed', 'shipped', 'delivered', 'cancelled', 'refunded'
)),
payment_id VARCHAR(100),
payment_method VARCHAR(50),
saga_workflow_id VARCHAR(255),
failure_reason TEXT,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now(),
-- PostgreSQL requires the partition key in the primary key of a partitioned table
PRIMARY KEY (sale_id, id)
) PARTITION BY LIST (sale_id);
-- Partition pruning: all order queries filter by sale_id
CREATE INDEX idx_orders_sale ON orders(sale_id);
-- User order history lookup
CREATE INDEX idx_orders_user ON orders(user_id);
-- Filter by status for reservation timeout scanner and dashboards
CREATE INDEX idx_orders_status ON orders(status);
-- Time-range queries for analytics and reconciliation
CREATE INDEX idx_orders_created ON orders(created_at);
Valkey Key Patterns
| Key Pattern | Type | Purpose | TTL |
|---|---|---|---|
| inv:{sale_id}:{sku_id} | String (integer) | Available inventory count | Sale duration + 1 hour |
| inv:reserved:{sale_id}:{sku_id} | String (integer) | Reserved (pending payment) count | Sale duration + 1 hour |
| coupon:pool:{campaign_id} | List | Pre-generated coupon codes (LPOP to claim) | Campaign validity + 1 hour |
| coupon:claimed:{campaign_id} | Set | Set of user_ids who claimed from this campaign | Campaign validity + 1 day |
| coupon:user:{user_id}:{campaign_id} | String | SET NX guard; value = coupon_code | Campaign validity + 1 day |
| coupon:code:{code} | Hash | Coupon code details (campaign_id, type, value, status) | Campaign validity + 1 day |
| queue:{sale_id} | Sorted Set | Queue tokens scored by join timestamp | Sale duration + 1 hour |
| queue:admitted:{sale_id} | Set | Set of user_ids currently admitted | Sale duration + 1 hour |
| queue:position:{sale_id} | String (integer) | Current admission cursor position | Sale duration + 1 hour |
| cart:{user_id}:{sale_id} | Hash | Temporary cart: sku_id, quantity, hold_expiry | 10 minutes |
| sale:meta:{sale_id} | Hash | Cached sale metadata (name, start, end, status) | Sale duration + 1 hour |
| rate:{user_id}:checkout | String (counter) | Rate limit: max 5 checkout attempts per minute | 60 seconds |
Storage Tiering
Flash sale data has three distinct access patterns with different latency and durability requirements.
| Tier | Technology | Data | Access Pattern |
|---|---|---|---|
| Hot (during sale) | Valkey | inventory counts, coupon pool, queue positions, session state | Every request, sub-ms |
| Warm | PostgreSQL | orders, coupon claims, sale config, inventory ledger | Checkout writes, post-sale queries |
| Cold | S3 + Parquet | completed sale archives, audit logs, analytics exports | Post-sale analysis, compliance |
Hot tier data lives in Valkey for the sale duration plus a 1-hour buffer. After the reconciliation job confirms Valkey and PostgreSQL are consistent, hot tier keys are left to expire via TTL. Warm tier data remains in PostgreSQL for 2 years (financial compliance). Cold tier data is exported via a nightly batch job that converts completed sale partitions into Parquet files on S3, where Athena queries support post-hoc analysis at $5/TB scanned.
Schema Evolution
Flash sale tables must evolve without downtime. The migration strategy varies by table criticality.
General approach: Add nullable columns first, deploy code that writes to both old and new columns, backfill existing rows, then add NOT NULL constraints once all rows have values. This three-phase approach avoids ALTER TABLE ... ADD COLUMN NOT NULL which acquires an ACCESS EXCLUSIVE lock on the entire table.
coupon_claims table changes: This table receives the highest write contention during sales. Schema changes use a dual-write period: the old schema and new schema coexist, with the application writing to both. After one full sale cycle confirms compatibility, the old columns are dropped.
sale_inventory uses soft deletes: The status column supports 'disabled' as a value rather than using DELETE statements. This avoids ALTER TABLE or heavy DELETE operations on a hot table during an active sale. Disabled rows are cleaned up post-sale by the archival job.
Online DDL for large tables: pg_repack handles table restructuring without holding long locks. For coupon_codes (potentially millions of rows), pg_repack repacks the table and indexes online with minimal locking.
Partitioning Strategy
The data model defines what is stored. The partitioning strategy determines how data is distributed across nodes to prevent hot spots and enable horizontal scaling.
Valkey Inventory Sharding
Inventory keys use hash-slot sharding: hash(sku_id) mod 16 maps each SKU to one of 16 virtual slots. This isolates hot keys and spreads load across Valkey cluster shards. The Lua scripts for inventory decrement reference the shard keys directly (detailed in the Atomic Inventory Control deep dive).
Risk: A doorbuster SKU concentrates all of its traffic on a single key, and therefore a single slot. For these hot SKUs the count is fanned out across 16 sub-keys -- each shard holds a fraction of the units, and the atomic decrement script scans shards sequentially until one succeeds. For standard SKUs with low contention, a single key (no fan-out) is used, avoiding unnecessary multi-key coordination.
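A sketch of the doorbuster fan-out, with hypothetical sub-key names. In production each shard's check-and-decrement is a single Lua eval; here one Python decrement stands in for one atomic shard operation:

```python
def shard_keys(sale_id: str, sku_id: str, fanout: int = 16) -> list:
    """Hypothetical sub-key naming for a fanned-out hot SKU."""
    return [f"inv:{sale_id}:{sku_id}:{i}" for i in range(fanout)]

def sharded_decrement(store: dict, keys: list, qty: int = 1) -> bool:
    """Scan shards in order until one holds enough inventory; each
    per-shard check-and-decrement models one atomic Valkey operation."""
    for key in keys:
        if store.get(key, 0) >= qty:
            store[key] -= qty
            return True
    return False

# 50 doorbuster units spread across 16 shards: exactly 50 decrements succeed
keys = shard_keys("sale_abc123", "SKU-DOORBUSTER")
store = {k: 3 for k in keys}
store[keys[0]] = 5
```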
PostgreSQL Order Partitioning
The orders table is partitioned by sale_id using PARTITION BY LIST. Each flash sale is an independent event, and queries never cross sales. Partition pruning eliminates full-table scans -- a query for WHERE sale_id = 'sale_abc123' touches only the partition for that sale. After a sale concludes and reconciliation passes, the partition can be detached and archived to cold storage in a sub-second operation that holds an exclusive lock only briefly.
Kafka Partition Strategy
Order events are partitioned by order_id. All events for a single order (created, paid, confirmed) land on the same partition and are processed in order. Cross-order ordering is not required.
Partition count: 24. Math: peak throughput is 833 orders/sec x 3 events/order = ~2,500 events/sec. Designing for a 5x burst gives 12,500 msg/sec. At a comfortable consumer throughput of ~600 msg/sec per partition, 12,500 / 600 = 20.8 partitions, rounded up to 24 for headroom.
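The arithmetic, spelled out:

```python
import math

peak_events_per_sec = 2_500        # ~833 orders/sec x 3 events per order
burst = peak_events_per_sec * 5    # design for a 5x burst over sustained peak
per_partition = 600                # comfortable per-partition consumer throughput
needed = math.ceil(burst / per_partition)
# needed == 21; the deployment rounds up to 24 for extra headroom
```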
Queue Sorted Set Partitioning
The queue uses a single Valkey key per sale: queue:{sale_id}. At 10M members, ZRANGEBYSCORE is O(log N + M) where M is the batch size (1,000 per admission cycle) -- acceptable for batch admission, since the admission controller reads only 1,000 members per cycle, keeping M small relative to N.
If queue size exceeds 50M members (a 5x scale scenario), the queue is sharded by hash(user_id) mod 4 into four queue keys: queue:{sale_id}:0 through queue:{sale_id}:3. The admission controller round-robins across shards, admitting 250 users per shard per cycle to preserve the 1,000-per-cycle batch and FIFO fairness across shards.
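The round-robin admission across shards, sketched. Plain lists stand in for the four sorted sets (already ordered by join timestamp); 250 per shard preserves the sale's 1,000-per-cycle batch:

```python
def admit_cycle(shards: list, per_shard: int = 250) -> list:
    """One admission cycle: pop the oldest per_shard tokens from each shard."""
    admitted = []
    for shard in shards:
        admitted.extend(shard[:per_shard])   # oldest tokens first (FIFO per shard)
        del shard[:per_shard]
    return admitted

# Four shards of waiting tokens -> one cycle admits 1,000 users total
shards = [[f"u{s}_{i}" for i in range(300)] for s in range(4)]
batch = admit_cycle(shards)
```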
Rebalancing
Valkey cluster rebalancing is online -- slot migration moves data between nodes without downtime. PostgreSQL partition attach/detach operations hold a sub-second exclusive lock. Kafka partition reassignment uses cruise-control for automated, throttled rebalancing that avoids consumer disruption during an active sale.
Caching Strategy
The partitioning strategy distributes data. The caching strategy determines which data is served from which layer and at what freshness, keeping the hot path off the database and the origin servers.
Cache Topology
| Layer | Technology | Data | TTL | Invalidation |
|---|---|---|---|---|
| CDN | CloudFront | sale landing page, product images, JS/CSS | 60s | Invalidation API at T-0 for fresh sale page |
| L1 (in-process) | Node.js Map | sale config, category list | 30s | Process restart or TTL |
| L2 (distributed) | Valkey | inventory counts (approximate), queue positions, sale metadata | sale duration + 1h | Event-driven on inventory change |
| Source of truth | PostgreSQL | all durable state | permanent | N/A |
Invalidation Model
Inventory counts in Valkey ARE the source of truth during the sale, not a cache. The Lua decrement script mutates Valkey directly; PostgreSQL follows asynchronously via Kafka. Queue positions update every 5s via WebSocket push. Sale metadata is cached with TTL = sale duration + 1 hour; changes to sale status (e.g., early termination) trigger explicit key deletion.
Thundering Herd at T-0
10M users hit /api/v1/sales/{id} simultaneously at sale start. CDN absorbs 99% of reads -- the static sale page is pre-deployed with a 60s TTL. Dynamic inventory counts are served from Valkey (sub-ms reads). The queue join endpoint (POST /api/v1/queue/join) is not cacheable -- the Valkey sorted set handles the write load directly at 200K+ ops/sec, and the 10M joins arrive spread over tens of seconds of clock skew and client-side jitter rather than as a single instant.
Cache Warming (T-24h)
Pre-load sale metadata, product catalog, and coupon pools into Valkey 24 hours before the sale. Warm CDN by requesting all sale pages via synthetic traffic from multiple edge locations. Verify Valkey memory usage is below 70% threshold to leave headroom for queue tokens (1 GB at 10M users). If memory exceeds 70%, scale the Valkey cluster before the sale starts.
Eviction Policy
Valkey uses maxmemory-policy volatile-lru -- only keys with a TTL set are eligible for eviction. Inventory keys have no TTL during the sale because they are the source of truth, not a cache. Queue keys expire 1 hour after the sale ends. Session and rate-limit keys have short TTLs and are evicted first under memory pressure.
Cache-DB Consistency
During the sale, Valkey leads and PostgreSQL follows (async via Kafka). After the sale, the reconciliation job compares Valkey inventory counts with the PostgreSQL ledger. Any mismatch triggers an alert and manual review. This is the inverse of typical cache patterns -- Valkey is NOT a cache for inventory during the sale; it IS the hot-path store. PostgreSQL is the cold-path store that provides durability and post-sale querying.
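The post-sale reconciliation can be sketched as a pure comparison between the two stores. The key names and mismatch shape below are illustrative, not the production schema:

```typescript
// Compares Valkey's final remaining counts against remaining counts
// derived from the PostgreSQL order ledger. Any divergence is flagged
// for alerting and manual review.
interface Mismatch {
  skuId: string;
  valkeyRemaining: number;
  ledgerRemaining: number;
}

function reconcile(
  valkeyCounts: Map<string, number>,  // remaining units per SKU in Valkey
  ledgerCounts: Map<string, number>   // remaining units per SKU from PG
): Mismatch[] {
  const mismatches: Mismatch[] = [];
  for (const [skuId, valkeyRemaining] of valkeyCounts) {
    const ledgerRemaining = ledgerCounts.get(skuId) ?? 0;
    if (valkeyRemaining !== ledgerRemaining) {
      mismatches.push({ skuId, valkeyRemaining, ledgerRemaining });
    }
  }
  return mismatches;
}
```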
Consistency Model
The caching strategy inverts the typical cache-DB relationship for inventory. This section makes the consistency guarantees explicit for every operation in the system.
Per-Operation Guarantee
| Operation | Guarantee | Mechanism |
|---|---|---|
| Inventory decrement | Linearizable | Valkey Lua script (single-threaded, atomic) |
| Coupon claim | Linearizable | Valkey LPOP + SET NX (atomic per shard) |
| Queue join | Linearizable | Valkey ZADD (atomic sorted set insert) |
| Order creation | Serializable | PostgreSQL transaction with UNIQUE constraints |
| Payment processing | At-most-once | Idempotency key prevents double-charge |
| Inventory sync to PG | Eventually consistent | Kafka consumer, ~5s lag |
| Analytics events | At-least-once | Kafka acks=all, idempotent consumer |
| Queue position read | Eventually consistent | WebSocket push every 5s |
CAP/PACELC Analysis
The system is CP for inventory operations -- it sacrifices availability rather than risk overselling. If a Valkey shard becomes unreachable, the admission controller pauses queue advancement for affected SKUs rather than falling back to a less-consistent store. For read operations (queue position, sale catalog), the system is AP -- it serves stale data from CDN/cache rather than returning errors.
Under the PACELC "else" clause (when there is no partition): the system chooses consistency over latency for writes. The Lua script adds ~1ms overhead versus a raw SET, but that 1ms buys atomic check-and-decrement. For reads, the system chooses latency over consistency -- CDN serves 60s stale sale pages, and queue positions are pushed every 5s rather than fetched on every render.
Saga Consistency
The checkout saga provides eventual consistency across services. Each step is independently atomic: inventory decrement is linearizable in Valkey, payment capture is at-most-once via idempotency key, and order creation is serializable in PostgreSQL. Compensating transactions restore invariants on failure. Temporal durably records workflow history, so if a worker crashes mid-saga the workflow resumes from the last completed step; individual activities may execute at-least-once, which is why each step must be idempotent -- a guarantee the idempotency keys and atomic scripts above already provide.
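The compensation pattern can be illustrated with a plain TypeScript saga runner. This is a simplified stand-in for what Temporal provides -- Temporal adds durable history and resume-on-crash on top of the same shape:

```typescript
interface SagaStep<T> {
  name: string;
  execute: () => Promise<T>;
  compensate: (result: T) => Promise<void>;
}

// Runs steps in order. On failure, runs the compensations of all
// completed steps in reverse order, then rethrows the original error.
async function runSaga(steps: SagaStep<unknown>[]): Promise<void> {
  const completed: { step: SagaStep<unknown>; result: unknown }[] = [];
  for (const step of steps) {
    try {
      const result = await step.execute();
      completed.push({ step, result });
    } catch (err) {
      for (const { step: done, result } of completed.reverse()) {
        await done.compensate(result);
      }
      throw err;
    }
  }
}
```

For the checkout saga, the steps would be reserve-inventory, claim-coupon, charge-payment, confirm-order, each paired with its release/return/refund compensation.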
Technology Selection
The consistency and caching requirements above constrain technology choices. Each component must meet specific guarantees under extreme concurrency.
Inventory Management Approach Comparison
| Approach | How It Works | Throughput | Overselling Risk | Complexity |
|---|---|---|---|---|
| PostgreSQL SELECT FOR UPDATE | Row lock on inventory row, decrement, commit | ~2,000 ops/sec per row | Low (if done correctly) | Low |
| PostgreSQL Advisory Lock | Application-level lock per SKU, then update | ~5,000 ops/sec | Low | Medium |
| Valkey Atomic DECR | DECR sku:inventory:123, check if >= 0 | ~200,000 ops/sec | Zero (atomic) | Low |
| Valkey Lua Script | Atomic check-and-decrement in single eval | ~150,000 ops/sec | Zero (atomic + conditional) | Medium |
| Queue-Based (Kafka) | Serialize all decrement requests through a partition | ~50,000 ops/sec per partition | Zero (serialized) | High |
Decision: Valkey Lua Script. The Lua script approach gives atomic check-and-decrement with conditional logic (check quantity > 0 before decrementing) at 150K+ ops/sec. PostgreSQL serves as the durable ledger; Valkey is the fast path.
Technology Stack
| Component | Technology | Why |
|---|---|---|
| Inventory Cache & Atomics | Valkey 8.x (Redis-compatible) | Atomic Lua scripts, 200K+ ops/sec, cluster mode |
| Coupon Pool | Valkey List + Set | LPOP for atomic claim, SET NX for user dedup |
| Primary Database | PostgreSQL 16 | ACID transactions, UNIQUE constraints, battle-tested |
| Event Streaming | Kafka | Decouple order processing, exactly-once semantics |
| Saga Orchestration | Temporal | Durable workflow execution, automatic retry, compensation |
| CDN | CloudFront / Cloudflare | Static sale page, waiting room, DDoS protection |
| Real-Time Updates | WebSocket (via Socket.io) | Queue position updates, inventory status |
| API Gateway | Kong | Sub-ms plugin overhead vs 10-30ms Lambda authorizer cold starts on AWS API Gateway. Native JWT validation, sliding-window rate limiting, and request transformation run as in-process Lua plugins -- no external hop. Handles 100K+ req/sec per node on commodity hardware. Deploys on Kubernetes alongside application pods, avoiding vendor lock-in and enabling per-route traffic policies that ALB cannot express. |
| Container Orchestration | Kubernetes (EKS) | Auto-scaling pods based on queue depth |
| Monitoring | Prometheus + Grafana | Real-time dashboards for sale metrics |
| Object Storage | S3 | Product images, sale banners |
| Search | Elasticsearch | Product search during sale (if needed) |
Queue and Stream Capacity Planning
The technology stack identifies Kafka as the event backbone. The capacity plan below ensures the Kafka cluster handles burst throughput without becoming a bottleneck during peak sale minutes.
Kafka Cluster Sizing
Write throughput: 833 orders/sec x 3 events/order (created, paid, confirmed) = 2,500 events/sec. Average event size is 0.5KB, yielding 1.25MB/sec sustained write throughput.
Peak (first 5 minutes of sale): Traffic concentrates in the opening burst. At 5x sustained rate: 12,500 events/sec = 6.25MB/sec.
Partition count: 24 partitions. Math: peak 12,500 msg/sec / 600 msg/sec/partition (comfortable consumer throughput) = 20.8, rounded to 24 for burst headroom.
Replication factor: 3 across 3 brokers. In-sync replica minimum: 2. This ensures no data loss if one broker fails.
Retention: 7 days. Storage: 1.25MB/sec x 86,400 sec/day x 7 days x 3 replicas = 2.3TB. The 7-day retention window allows replay for reconciliation and audit. Completed events older than 7 days are already persisted in PostgreSQL and archived to S3.
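The sizing arithmetic above can be checked with a few lines. The 15% headroom factor is an assumption chosen to reproduce the document's choice of 24 partitions:

```typescript
// Reproduces the cluster-sizing math from the capacity plan.
function partitionCount(
  peakMsgPerSec: number,
  perPartitionMsgPerSec: number,
  headroom = 1.15 // assumed headroom factor
): number {
  return Math.ceil((peakMsgPerSec / perPartitionMsgPerSec) * headroom);
}

// Retained log size in MB: sustained rate x seconds x days x replicas.
function retentionMB(sustainedMBPerSec: number, days: number, replicas: number): number {
  return sustainedMBPerSec * 86_400 * days * replicas;
}

const partitions = partitionCount(12_500, 600); // 20.8 raw -> 24 with headroom
const storageTB = retentionMB(1.25, 7, 3) / 1_000_000; // ~2.27 TB, rounded to 2.3 in the text
```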
Consumer Groups
Three consumer groups process events independently:
- inventory-sync -- writes inventory changes to PostgreSQL. Highest priority.
- analytics -- feeds real-time dashboards and post-sale reports.
- notifications -- triggers email/push confirmations to buyers.
Consumer Lag Thresholds
| Lag Level | Threshold | Action |
|---|---|---|
| Normal | < 5K messages | No action |
| Warning | 5K-50K messages (~2-20s lag) | Alert on-call, monitor trend |
| Critical | > 50K messages (~20s lag) | Auto-scale consumers. If inventory-sync lag exceeds 10s, pause analytics and notification consumers to free broker resources |
Dead Letter Queue
The orders.dlq topic receives messages that fail 3 consecutive processing retries. Common failure causes: invalid order state transitions (e.g., confirming an already-cancelled order) and payment callback schema mismatches. The DLQ is reviewed within 1 hour during an active sale and daily otherwise. A Grafana alert fires when DLQ depth exceeds 10 messages.
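The retry-then-park decision can be sketched as a small routing function, assuming the consumer tracks attempts per message (the topic name comes from the text; the function shape is illustrative):

```typescript
const MAX_RETRIES = 3;

interface RoutingDecision {
  action: 'retry' | 'dead_letter';
  topic?: string; // set only when dead-lettering
}

// After 3 consecutive processing failures, the message is parked on
// orders.dlq instead of being retried again.
function routeFailure(attempts: number): RoutingDecision {
  if (attempts < MAX_RETRIES) {
    return { action: 'retry' };
  }
  return { action: 'dead_letter', topic: 'orders.dlq' };
}
```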
Ordering Guarantees
Events are keyed by order_id. All events for a single order land on the same partition and are processed in order. Cross-order ordering is not required and would reduce parallelism.
Backpressure
Kafka absorbs burst traffic by design -- producers write faster than consumers read, and the log buffers the difference. If consumer lag exceeds 50K messages, the admission controller reduces queue advancement rate by 50%, admitting fewer users per batch. Fewer new checkouts means fewer new events, creating end-to-end backpressure from Kafka consumer lag through to user-facing queue wait time. This feedback loop prevents unbounded lag growth.
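The lag-to-admission feedback can be expressed as a small policy function. The thresholds come from the table above; the exact shape of the reduction (a flat 50% cut) is stated in the text:

```typescript
// Maps Kafka consumer lag to an admission-rate multiplier:
// below 50K messages, admit at full rate; above, halve admission.
function admissionMultiplier(consumerLag: number): number {
  return consumerLag > 50_000 ? 0.5 : 1.0;
}

function nextBatchSize(baseBatch: number, consumerLag: number): number {
  return Math.floor(baseBatch * admissionMultiplier(consumerLag));
}
```

Fewer admissions mean fewer checkouts and fewer produced events, so lag drains and the multiplier returns to 1.0 on the next cycle.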
System Flows
The capacity plan ensures the event backbone handles peak load. The sequence diagrams below show how components interact end-to-end under normal operation and failure.
Queue Join and Admission
Checkout Happy Path
Checkout Failure with Compensation
Cache Hit/Miss Read Path
These flows show the system end-to-end under normal operation and failure. The deep dives that follow explain how each component achieves its guarantees.
Virtual Queue and Traffic Shaping
The schema and API define the system's surface area. Each deep dive below explains how one component enforces its invariants, starting with how 10M users enter the system without overwhelming it.
The first thing that happens at T-0 is 10M users arriving simultaneously. The queue is not about fairness. It is about protecting the backend. Without it, every other component fails under load -- inventory atomics, coupon claims, and payment processing all collapse under uncontrolled traffic.
Problem: 10M users hit the checkout endpoint simultaneously. Even with Valkey handling 200K ops/sec, the backend services (payment gateway, PostgreSQL writes, Kafka publishing) cannot absorb 10M concurrent requests.
Simple example: A nightclub with capacity for 500 people. 10,000 show up at opening. A bouncer at the door lets in groups of 50 every 2 minutes. People outside wait in a line behind an "estimated wait: 20 minutes" sign.
Mental model: The queue is a flow control valve between the internet and the backend. It converts a traffic spike into a smooth, sustained load that the system can handle.
Queue Architecture
Queue Join Flow
When a user arrives at the sale page:
// queue-service.ts
import { Redis } from 'ioredis';
import { v4 as uuidv4 } from 'uuid';
const valkey = new Redis({ host: 'valkey-cluster', port: 6379 });
interface QueueJoinResult {
queueToken: string;
position: number;
estimatedWaitSeconds: number;
websocketUrl: string;
}
async function joinQueue(saleId: string, userId: string): Promise<QueueJoinResult> {
const queueKey = `queue:${saleId}`;
const now = Date.now();
// Reuse the user's existing token on rejoin so duplicate tabs do not
// create duplicate queue entries (SET NX is atomic)
const tokenKey = `queue:token:${saleId}:${userId}`;
const freshToken = `qt_${uuidv4().replace(/-/g, '').substring(0, 16)}`;
await valkey.set(tokenKey, freshToken, 'EX', 3600, 'NX');
const token = (await valkey.get(tokenKey)) as string;
// Add to sorted set with timestamp as score
// ZADD NX preserves the original score (join time) if the member exists,
// so rejoining is idempotent -- a plain ZADD would reset the position
const member = `${userId}:${token}`;
await valkey.zadd(queueKey, 'NX', now, member);
// Get position (0-indexed rank)
const rank = await valkey.zrank(queueKey, member);
const position = (rank ?? 0) + 1;
// Get current admission cursor
const cursor = parseInt(await valkey.get(`queue:position:${saleId}`) || '0');
const usersAhead = Math.max(0, position - cursor);
// Estimate wait time based on admission rate
const batchSize = 1000;
const batchIntervalSec = 2;
const estimatedWaitSeconds = Math.ceil(usersAhead / batchSize) * batchIntervalSec;
return {
queueToken: token,
position,
estimatedWaitSeconds,
websocketUrl: `wss://ws.flashsale.example.com/queue/${token}`
};
}
Admission Controller
A background process runs every 2 seconds, admitting the next batch of users:
// admission-controller.ts
interface AdmissionConfig {
batchSize: number;
intervalMs: number;
maxConcurrentAdmitted: number;
}
async function runAdmissionLoop(saleId: string, config: AdmissionConfig): Promise<void> {
const queueKey = `queue:${saleId}`;
const admittedKey = `queue:admitted:${saleId}`;
const cursorKey = `queue:position:${saleId}`;
while (true) {
// Check if sale is still active
const saleStatus = await valkey.hget(`sale:meta:${saleId}`, 'status');
if (saleStatus !== 'active') break;
// Check current admitted count (users actively shopping)
const admittedCount = await valkey.scard(admittedKey);
// Only admit more if below the concurrent cap
if (admittedCount >= config.maxConcurrentAdmitted) {
await sleep(config.intervalMs);
continue;
}
const spotsAvailable = Math.min(
config.batchSize,
config.maxConcurrentAdmitted - admittedCount
);
// Get next batch from sorted set (by join time, FIFO)
const cursor = parseInt(await valkey.get(cursorKey) || '0');
const members = await valkey.zrange(queueKey, cursor, cursor + spotsAvailable - 1);
if (members.length === 0) {
await sleep(config.intervalMs);
continue;
}
// Admit users
const pipeline = valkey.pipeline();
for (const member of members) {
const userId = member.split(':')[0];
pipeline.sadd(admittedKey, userId);
}
pipeline.set(cursorKey, (cursor + members.length).toString());
await pipeline.exec();
// Notify admitted users via WebSocket
for (const member of members) {
const userId = member.split(':')[0];
const token = member.split(':')[1];
await notifyAdmission(userId, token, saleId);
}
console.log(`Admitted ${members.length} users. Total admitted: ${admittedCount + members.length}`);
await sleep(config.intervalMs);
}
}
function sleep(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms));
}
Queue Position Updates via WebSocket
Users need to see their position and estimated wait time:
// queue-websocket.ts
import { WebSocket, WebSocketServer } from 'ws';
// Note: ws matches the `path` option against the exact pathname, so
// `path: '/queue'` would reject /queue/{token} connections; accept all
// paths and parse the token from the URL instead
const wss = new WebSocketServer({ port: 8080 });
const userSockets = new Map<string, WebSocket>();
wss.on('connection', (ws, req) => {
const token = req.url?.split('/queue/')[1];
if (token) {
userSockets.set(token, ws);
}
ws.on('close', () => {
if (token) userSockets.delete(token);
});
});
interface QueueUpdate {
type: 'position_update' | 'admitted' | 'sale_ended';
position?: number;
estimatedWaitSeconds?: number;
accessToken?: string;
expiresAt?: string;
}
async function notifyAdmission(userId: string, token: string, saleId: string): Promise<void> {
const ws = userSockets.get(token);
if (!ws || ws.readyState !== WebSocket.OPEN) return;
// Generate a short-lived access token for the admitted user
const accessToken = generateAccessToken(userId, saleId);
const update: QueueUpdate = {
type: 'admitted',
accessToken,
expiresAt: new Date(Date.now() + 15 * 60 * 1000).toISOString()
};
ws.send(JSON.stringify(update));
}
// Periodic position broadcast (every 5 seconds)
async function broadcastPositions(saleId: string): Promise<void> {
const cursorKey = `queue:position:${saleId}`;
const cursor = parseInt(await valkey.get(cursorKey) || '0');
const batchSize = 1000;
const batchIntervalSec = 2;
for (const [token, ws] of userSockets) {
if (ws.readyState !== WebSocket.OPEN) continue;
const userId = await getUserIdFromToken(token);
const rank = await valkey.zrank(`queue:${saleId}`, `${userId}:${token}`);
if (rank === null) continue;
const position = rank + 1;
const usersAhead = Math.max(0, position - cursor);
const estimatedWait = Math.ceil(usersAhead / batchSize) * batchIntervalSec;
const update: QueueUpdate = {
type: 'position_update',
position,
estimatedWaitSeconds: estimatedWait
};
ws.send(JSON.stringify(update));
}
}
CDN-Based Waiting Room
The waiting room page itself must be served from CDN, not origin. If 10M users hit origin to load the queue page, the system has already failed.
Client --> CDN (cached waiting-room.html)
|
+--> Inline JS polls /api/v1/queue/status (origin)
| (rate limited to 1 request per 2 seconds per client)
|
+--> WebSocket to ws.flashsale.example.com
(for push-based updates)
Key design decision: The waiting room HTML/JS/CSS is deployed to CDN 24 hours before the sale. The countdown timer runs client-side. At T-0, the JS initiates the queue join request. This means 10M users load static assets from CDN, and only the queue join API call hits origin -- spread across a few seconds of clock skew.
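The spread the design relies on can be made deliberate with client-side jitter in the waiting-room JS, rather than depending on clock skew alone. The 5-second window is an assumption:

```typescript
// Client-side: delay the queue-join call by a random offset so 10M
// joins arrive spread over a window instead of at a single instant.
function joinDelayMs(windowMs = 5_000): number {
  return Math.floor(Math.random() * windowMs);
}

// Schedules the actual join call and returns the chosen delay.
function scheduleQueueJoin(join: () => void, windowMs = 5_000): number {
  const delay = joinDelayMs(windowMs);
  setTimeout(join, delay);
  return delay;
}
```

Because queue position is assigned by join timestamp, the jitter trades a few seconds of position randomness for an order-of-magnitude reduction in instantaneous write rate.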
Adaptive Admission Rate
The admission controller adjusts batch size based on backend health:
async function getAdaptiveBatchSize(saleId: string, baseBatchSize: number): Promise<number> {
// Check checkout service error rate
const errorRate = await getMetric('checkout_error_rate_5m');
// Check Valkey latency
const valkeyP99 = await getMetric('valkey_latency_p99_ms');
// Check PostgreSQL connection pool usage
const pgPoolUsage = await getMetric('pg_pool_usage_percent');
let multiplier = 1.0;
if (errorRate > 0.05) multiplier *= 0.5; // 5%+ errors: halve admission
if (valkeyP99 > 10) multiplier *= 0.7; // Valkey slow: reduce 30%
if (pgPoolUsage > 0.8) multiplier *= 0.6; // DB stressed: reduce 40%
const adaptiveBatch = Math.max(100, Math.floor(baseBatchSize * multiplier));
console.log(`Adaptive batch size: ${adaptiveBatch} (base: ${baseBatchSize}, multiplier: ${multiplier.toFixed(2)})`);
return adaptiveBatch;
}
Atomic Inventory Control
With the queue controlling admission, the next problem is ensuring that two admitted users never purchase the same last unit. This is the core correctness requirement.
Problem: Multiple concurrent requests try to decrement the same inventory counter. An operation is needed that atomically checks "is quantity > 0?" and decrements in a single step, with no window for a race condition.
Simple example: Two cashiers at a store, one item left. Customer A asks "is it available?" -- yes. Customer B asks "is it available?" -- yes. Both ring it up. Two sales, one item. The fix: a single cashier with a locked register.
Mental model: Valkey executes Lua scripts atomically. No other command can interleave during script execution. This gives a compare-and-set primitive without distributed locks -- the equivalent of a single-threaded cashier.
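The cashier analogy can be made concrete with an in-memory model of what the Lua script does: because the check and the decrement happen in one uninterruptible step, the last unit is sold exactly once. This is a simulation of the semantics, not the production path:

```typescript
// In-memory stand-in for the atomic Lua script: check-and-decrement
// in one synchronous step, so no other request can interleave between
// the availability check and the decrement.
const inventory = new Map<string, number>();

function atomicReserve(sku: string, qty: number): 'success' | 'sold_out' | 'not_found' {
  const available = inventory.get(sku);
  if (available === undefined) return 'not_found'; // key never loaded
  if (available < qty) return 'sold_out';          // insufficient stock
  inventory.set(sku, available - qty);             // same step as the check
  return 'success';
}
```

With one unit left, two requests for it resolve deterministically: the first succeeds, the second gets sold_out. There is no window in which both see quantity 1.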
Valkey Lua Atomic Decrement
-- inventory_decrement.lua
-- KEYS[1] = inv:{sale_id}:{sku_id}
-- KEYS[2] = inv:reserved:{sale_id}:{sku_id}
-- ARGV[1] = quantity requested
-- Returns: 1 if success, 0 if insufficient, -1 if key missing
local available = redis.call('GET', KEYS[1])
if available == false then
return -1 -- key does not exist, sale item not loaded
end
local avail = tonumber(available)
local requested = tonumber(ARGV[1])
if avail < requested then
return 0 -- insufficient inventory
end
-- Atomic decrement available, increment reserved
redis.call('DECRBY', KEYS[1], requested)
redis.call('INCRBY', KEYS[2], requested)
return 1 -- success
Calling the Lua Script from the Inventory Service
// inventory-service.ts
import { Redis } from 'ioredis';
const valkey = new Redis({ host: 'valkey-cluster', port: 6379 });
// Load script once, use SHA for subsequent calls
const DECREMENT_SCRIPT = `
local available = redis.call('GET', KEYS[1])
if available == false then return -1 end
local avail = tonumber(available)
local requested = tonumber(ARGV[1])
if avail < requested then return 0 end
redis.call('DECRBY', KEYS[1], requested)
redis.call('INCRBY', KEYS[2], requested)
return 1
`;
let scriptSha: string;
async function loadScript(): Promise<void> {
scriptSha = await valkey.script('LOAD', DECREMENT_SCRIPT) as string;
}
async function reserveInventory(
saleId: string,
skuId: string,
quantity: number
): Promise<'success' | 'sold_out' | 'not_found'> {
const result = await valkey.evalsha(
scriptSha,
2,
`inv:${saleId}:${skuId}`,
`inv:reserved:${saleId}:${skuId}`,
quantity.toString()
);
switch (result) {
case 1: return 'success';
case 0: return 'sold_out';
case -1: return 'not_found';
default: return 'not_found';
}
}
Inventory Release (Compensation)
When payment fails, reserved inventory must be returned:
-- inventory_release.lua
-- KEYS[1] = inv:{sale_id}:{sku_id}
-- KEYS[2] = inv:reserved:{sale_id}:{sku_id}
-- ARGV[1] = quantity to release
local reserved = tonumber(redis.call('GET', KEYS[2]) or '0')
local to_release = tonumber(ARGV[1])
if to_release > reserved then
to_release = reserved -- safety: never release more than reserved
end
redis.call('INCRBY', KEYS[1], to_release)
redis.call('DECRBY', KEYS[2], to_release)
return to_release
Approach Comparison
| Aspect | Pessimistic (SELECT FOR UPDATE) | Optimistic (Version Column) | Valkey Lua (chosen) |
|---|---|---|---|
| Lock type | Row-level exclusive lock | No lock; retry on version mismatch | No lock; atomic script |
| Contention behavior | Requests queue behind lock | Requests retry (exponential backoff) | Requests complete immediately |
| Throughput at 1K concurrent | ~500 ops/sec | ~2,000 ops/sec (with retries) | ~150,000 ops/sec |
| Deadlock risk | Yes (multi-row operations) | No | No |
| Starvation risk | Yes (long-held locks) | Yes (repeated retries) | No |
| Durability | Immediate (committed to DB) | Immediate (committed to DB) | Eventual (async sync to DB) |
| Complexity | Low | Medium (retry logic) | Medium (Lua scripting) |
Tradeoff: Valkey Lua gives 75x the throughput of PostgreSQL row locks, but inventory state is in memory, not durable. The design accepts this: Valkey is the source of truth during the sale, and Kafka carries inventory events to PostgreSQL asynchronously. A post-sale reconciliation job catches any discrepancies.
Hot-Key Mitigation for Popular SKUs
A doorbuster item (e.g., $99 laptop) will have all traffic targeting a single Valkey key. Even though Valkey is single-threaded and fast, a single key on a single shard becomes a bottleneck in cluster mode.
Strategy: Inventory Sharding. Split a single SKU's inventory across multiple virtual slots:
-- inventory_decrement_sharded.lua
-- KEYS[1..N] = inv:{sale_id}:{sku_id}:shard:{0..N-1}
-- ARGV[1] = quantity requested
local requested = tonumber(ARGV[1])
local total_available = 0
-- First pass: sum available across shards
for i = 1, #KEYS do
local avail = tonumber(redis.call('GET', KEYS[i]) or '0')
total_available = total_available + avail
end
if total_available < requested then
return 0
end
-- Second pass: decrement from first shard with availability
local remaining = requested
for i = 1, #KEYS do
if remaining <= 0 then break end
local avail = tonumber(redis.call('GET', KEYS[i]) or '0')
if avail > 0 then
local take = math.min(avail, remaining)
redis.call('DECRBY', KEYS[i], take)
remaining = remaining - take
end
end
return 1
The client hashes to a random shard on each attempt, distributing load. One caveat: a multi-key Lua script requires every key it touches to hash to the same cluster slot, so the shard keys must either share a {hash-tag} (which co-locates the hot SKU on one node but still isolates its contention to dedicated keys) or the client falls back to single-shard decrements, retrying on a different shard when one comes back empty:
function getInventoryShardKeys(saleId: string, skuId: string, shardCount: number): string[] {
return Array.from({ length: shardCount }, (_, i) =>
`inv:${saleId}:${skuId}:shard:${i}`
);
}
// For doorbuster items: 16 shards
// For standard items: 1 shard (no sharding needed)
const shardCount = isDoorBuster(skuId) ? 16 : 1;
const keys = getInventoryShardKeys(saleId, skuId, shardCount);
Reservation Timeout (FR-17)
The saga compensates on payment failure, but what if the user closes their browser after inventory is reserved? The saga never completes, and inventory stays locked. A background job runs every 60 seconds, scanning for reservations older than 10 minutes:
async function reclaimExpiredReservations(saleId: string): Promise<number> {
const expiredOrders = await db.query(
`SELECT id, sku_id, quantity FROM orders
WHERE sale_id = $1 AND status = 'inventory_reserved'
AND created_at < now() - interval '10 minutes'`,
[saleId]
);
let reclaimed = 0;
for (const order of expiredOrders.rows) {
await inventory.releaseInventory(saleId, order.sku_id, order.quantity);
await db.query(
`UPDATE orders SET status = 'expired', failure_reason = 'reservation_timeout'
WHERE id = $1`,
[order.id]
);
reclaimed += order.quantity;
}
return reclaimed;
}
Without this, inventory leaks on every sale. At 200K orders with a 5% abandonment rate, 10K units would be permanently locked.
Async Sync to PostgreSQL
Valkey is the source of truth during the sale for inventory counts. PostgreSQL provides durability and post-sale reconciliation.
Coupon System: Pool Management and One-Per-User Enforcement
With inventory atomics solved, the coupon system introduces a different constraint: not just atomic decrement, but also per-user uniqueness. A user must never claim more than one coupon per campaign, even across multiple tabs, devices, and sessions.
Problem: 500,000 unique coupon codes must be distributed to users, one per user, at high speed. Two sub-problems: (1) no two users should receive the same code, and (2) no single user should receive two codes.
Simple example: A stack of 500,000 gift cards on a table. A single clerk hands them out. Each person must show their ID. The clerk checks a list before giving a card -- if the name is already on the list, the person is turned away.
Mental model: The coupon pool is a Valkey list (LPOP for atomic claim). The user dedup is a Valkey SET NX (atomic set-if-not-exists). These two atomic operations run in a single Lua script, making the entire claim operation atomic.
Pre-Loading the Coupon Pool
Before the sale starts, coupon codes are generated and loaded into a Valkey list:
// coupon-loader.ts
import { v4 as uuidv4 } from 'uuid';
import { Redis } from 'ioredis';
const valkey = new Redis({ host: 'valkey-cluster', port: 6379 });
interface CouponCampaign {
id: string;
poolSize: number;
prefix: string;
validUntil: Date;
}
async function loadCouponPool(campaign: CouponCampaign): Promise<void> {
const codes: string[] = [];
// Generate unique codes
for (let i = 0; i < campaign.poolSize; i++) {
const suffix = uuidv4().replace(/-/g, '').substring(0, 8).toUpperCase();
const code = `${campaign.prefix}-${suffix}`;
codes.push(code);
}
// Load into Valkey list in batches
const BATCH_SIZE = 10000;
const pipeline = valkey.pipeline();
for (let i = 0; i < codes.length; i += BATCH_SIZE) {
const batch = codes.slice(i, i + BATCH_SIZE);
pipeline.rpush(`coupon:pool:${campaign.id}`, ...batch);
}
// Set TTL
const ttlSeconds = Math.ceil((campaign.validUntil.getTime() - Date.now()) / 1000) + 86400;
pipeline.expire(`coupon:pool:${campaign.id}`, ttlSeconds);
await pipeline.exec();
// Also insert into PostgreSQL for durability
// ... batch INSERT into coupon_codes table
console.log(`Loaded ${codes.length} codes for campaign ${campaign.id}`);
}
Atomic Coupon Claim with LPOP + SET NX
The Valkey LPOP command is atomic. Only one client can pop a given element. Combined with SET NX for user dedup, the entire claim is a single atomic Lua script:
-- coupon_claim.lua
-- KEYS[1] = coupon:pool:{campaign_id} (list of available codes)
-- KEYS[2] = coupon:claimed:{campaign_id} (set of user_ids who claimed)
-- KEYS[3] = coupon:user:{user_id}:{campaign_id} (SET NX guard)
-- ARGV[1] = user_id
-- ARGV[2] = TTL in seconds
-- Returns: coupon_code string, or "ALREADY_CLAIMED", or "POOL_EXHAUSTED"
-- Step 1: Check if user already claimed (fast path)
local existing = redis.call('GET', KEYS[3])
if existing ~= false then
return 'ALREADY_CLAIMED'
end
-- Step 2: Check if user is in claimed set (belt and suspenders)
local isMember = redis.call('SISMEMBER', KEYS[2], ARGV[1])
if isMember == 1 then
return 'ALREADY_CLAIMED'
end
-- Step 3: Pop a code from the pool
local code = redis.call('LPOP', KEYS[1])
if code == false then
return 'POOL_EXHAUSTED'
end
-- Step 4: Mark user as claimed
redis.call('SET', KEYS[3], code, 'EX', tonumber(ARGV[2]))
redis.call('SADD', KEYS[2], ARGV[1])
return code
Coupon Claim Flow
Coupon Service with Pool Exhaustion Handling
// coupon-service.ts
async function claimCoupon(userId: string, campaignId: string): Promise<ClaimResult> {
const result = await valkey.evalsha(
couponClaimSha,
3,
`coupon:pool:${campaignId}`,
`coupon:claimed:${campaignId}`,
`coupon:user:${userId}:${campaignId}`,
userId,
'86400' // 24-hour TTL
);
if (result === 'ALREADY_CLAIMED') {
// Retrieve their existing code
const existingCode = await valkey.get(`coupon:user:${userId}:${campaignId}`);
return {
status: 'already_claimed',
code: existingCode || undefined,
message: 'You have already claimed a coupon from this campaign'
};
}
if (result === 'POOL_EXHAUSTED') {
// Update campaign status in DB
await db.query(
`UPDATE coupon_campaigns SET status = 'exhausted' WHERE id = $1 AND status = 'active'`,
[campaignId]
);
// Publish event for real-time UI update
await kafka.send({
topic: 'coupon-events',
messages: [{ key: campaignId, value: JSON.stringify({ event: 'pool_exhausted', campaignId }) }]
});
return { status: 'pool_exhausted', message: 'All coupons have been claimed' };
}
// Success: persist to PostgreSQL
const code = result as string;
try {
await db.query(
`INSERT INTO coupon_claims (user_id, campaign_id, coupon_code_id, code, status)
VALUES ($1, $2, (SELECT id FROM coupon_codes WHERE code = $3), $3, 'claimed')`,
[userId, campaignId, code]
);
} catch (err: any) {
if (err.code === '23505') {
// UNIQUE violation: user already claimed at DB level
// Return the coupon to the pool
await valkey.rpush(`coupon:pool:${campaignId}`, code);
await valkey.del(`coupon:user:${userId}:${campaignId}`);
return { status: 'already_claimed', message: 'Duplicate claim detected' };
}
throw err;
}
return {
status: 'success',
code,
campaignId,
message: 'Coupon claimed successfully'
};
}
Coupon Return (On Payment Failure or Expiry)
-- coupon_return.lua
-- KEYS[1] = coupon:pool:{campaign_id}
-- KEYS[2] = coupon:claimed:{campaign_id}
-- KEYS[3] = coupon:user:{user_id}:{campaign_id}
-- ARGV[1] = user_id
-- ARGV[2] = coupon_code
-- Remove user from claimed set
redis.call('SREM', KEYS[2], ARGV[1])
-- Delete user's claim key
redis.call('DEL', KEYS[3])
-- Return code to pool (push to right end)
redis.call('RPUSH', KEYS[1], ARGV[2])
return 1
Two-Layer Defense: Valkey + PostgreSQL
This is the most critical correctness requirement. A single user must never claim more than one coupon per campaign, even across multiple browser tabs, bots, or devices.
Layer 1: Valkey SET NX (Fast Path). The SET key value NX command sets the key only if it does not already exist. This is atomic. The Lua script checks coupon:user:{user_id}:{campaign_id} before doing anything else.
Layer 2: PostgreSQL UNIQUE Constraint (Durable Backstop). The coupon_claims table has UNIQUE(user_id, campaign_id). Even if Valkey fails, restarts, or loses data, the database rejects a duplicate insert with error code 23505.
Race Condition Analysis
What If Valkey Restarts Between Claim and DB Write?
This is the dangerous window. If Valkey restarts after the Lua script succeeds but before the PostgreSQL insert completes:
- Valkey loses the coupon:user:U1:C1 key
- A retry sends another claim request
- Valkey SET NX succeeds (key was lost)
- A second code is popped from the pool
- The INSERT into coupon_claims is attempted
- PostgreSQL UNIQUE constraint rejects the insert with error 23505
- The second coupon code is returned to the pool
This is why the two-layer defense is essential. Valkey handles the hot path; PostgreSQL is the safety net.
// Race condition recovery handler
async function handleDuplicateClaimAtDB(
userId: string,
campaignId: string,
codeToReturn: string
): Promise<void> {
// Return the code to the pool since this user already has one
await valkey.rpush(`coupon:pool:${campaignId}`, codeToReturn);
// Re-set the Valkey guard key (it was lost during restart)
const existingClaim = await db.query(
`SELECT code FROM coupon_claims WHERE user_id = $1 AND campaign_id = $2`,
[userId, campaignId]
);
if (existingClaim.rows.length > 0) {
await valkey.set(
`coupon:user:${userId}:${campaignId}`,
existingClaim.rows[0].code,
'EX',
86400
);
}
}
Distributed Uniqueness Across Multiple Services
In a microservices architecture, the coupon claim might be called from the Coupon Service directly (user clicks "Claim"), the Checkout Service (auto-claim during checkout), or a batch job (promotional distribution). All paths must go through the same Valkey + PostgreSQL enforcement:
Rule: No service bypasses the Coupon Service. Even internal batch jobs call the same claimCoupon() function. This single-writer pattern prevents enforcement gaps.
Additional Safeguards
| Safeguard | Purpose |
|---|---|
| User ID from JWT (server-side) | Prevent client-side user ID spoofing |
| Rate limiting (5 claims/min per user) | Slow down automated attempts |
| Device fingerprint logging | Detect multi-account abuse post-hoc |
| IP-based throttling | 10 claims/min per IP address |
| Claim audit log (Kafka) | Full audit trail for fraud investigation |
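The per-user rate limit in the table above is a fixed-window counter. In production it would be a Valkey INCR followed by EXPIRE on first increment; the sketch below substitutes an in-memory map for Valkey so the logic is self-contained, and the key name is illustrative:

```typescript
// Fixed-window rate limiter: 5 claims per user per minute.
// An in-memory map stands in for Valkey INCR + EXPIRE.
type Window = { count: number; resetAt: number };
const windows = new Map<string, Window>();

function allowClaim(
  userId: string,
  limit = 5,
  windowMs = 60_000,
  now = Date.now()
): boolean {
  const key = `ratelimit:claim:${userId}`;
  const w = windows.get(key);
  if (!w || now >= w.resetAt) {
    // New window: equivalent to INCR returning 1, then EXPIRE
    windows.set(key, { count: 1, resetAt: now + windowMs });
    return true;
  }
  w.count += 1;
  return w.count <= limit;
}
```

A fixed window admits up to 2x the limit across a window boundary; a sliding-window or token-bucket variant closes that gap at the cost of more state per user.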
Coupon Enforcement Approach Comparison
| Approach | One-Per-User Guarantee | Throughput | Failure Mode |
|---|---|---|---|
| PostgreSQL UNIQUE constraint only | Strong (after commit) | ~3,000 ops/sec | Slow under contention |
| Application-level HashMap | None (race conditions) | High | Double-claims |
| Valkey SET NX (coupon:user:{user}:{campaign}) | Strong (atomic) | ~200,000 ops/sec | Lost on Valkey restart |
| Valkey SET NX + PostgreSQL UNIQUE | Strong (two layers) | ~100,000 ops/sec | Valkey loss caught by DB backstop |
| Bloom Filter pre-check + DB | Probabilistic pre-check | Very high | False positives (safe direction) |
Coupon Types and Stacking Rules
With claiming solved, the next challenge is applying coupons correctly at checkout. A real coupon system supports multiple discount types, each with its own calculation logic and validation rules.
Coupon Type Definitions
| Type | Code | How It Works | Example |
|---|---|---|---|
| Percent Off | percent_off | Deducts a percentage of the subtotal, up to a max cap | 15% off, max $100 discount |
| Fixed Amount | fixed_amount | Deducts a fixed dollar amount from the subtotal | $25 off |
| Buy One Get One | bogo | Adds a free item (cheapest item free, or specific SKU) | Buy 1 get 1 free |
| Free Shipping | free_shipping | Waives the shipping fee | Free standard shipping |
Discount Calculation Logic
// coupon-calculator.ts
interface CartItem {
skuId: string;
name: string;
price: number;
quantity: number;
category: string;
}
interface AppliedCoupon {
code: string;
type: 'percent_off' | 'fixed_amount' | 'bogo' | 'free_shipping';
discountValue: number;
maxDiscountCap: number | null;
minCartValue: number;
applicableSkus: string[] | null;
stackable: boolean; // whether this coupon may combine with others
stackingPriority: number; // lower = applied first (used by validateStacking)
}
interface DiscountResult {
couponCode: string;
type: string;
discountAmount: number;
description: string;
}
function calculateDiscount(
items: CartItem[],
coupon: AppliedCoupon,
shippingCost: number
): DiscountResult {
const subtotal = items.reduce((sum, item) => sum + item.price * item.quantity, 0);
// Validate minimum cart value
if (subtotal < coupon.minCartValue) {
return {
couponCode: coupon.code,
type: coupon.type,
discountAmount: 0,
description: `Cart minimum $${coupon.minCartValue} not met`
};
}
// Filter applicable items
const applicableItems = coupon.applicableSkus
? items.filter(item => coupon.applicableSkus!.includes(item.skuId))
: items;
const applicableSubtotal = applicableItems.reduce(
(sum, item) => sum + item.price * item.quantity, 0
);
switch (coupon.type) {
case 'percent_off': {
let discount = applicableSubtotal * (coupon.discountValue / 100);
if (coupon.maxDiscountCap !== null) {
discount = Math.min(discount, coupon.maxDiscountCap);
}
return {
couponCode: coupon.code,
type: 'percent_off',
discountAmount: Math.round(discount * 100) / 100,
description: `${coupon.discountValue}% off${coupon.maxDiscountCap ? ` (max $${coupon.maxDiscountCap})` : ''}`
};
}
case 'fixed_amount': {
const discount = Math.min(coupon.discountValue, applicableSubtotal);
return {
couponCode: coupon.code,
type: 'fixed_amount',
discountAmount: Math.round(discount * 100) / 100,
description: `$${coupon.discountValue} off`
};
}
case 'bogo': {
// Cheapest item free
if (applicableItems.length < 2) {
return {
couponCode: coupon.code,
type: 'bogo',
discountAmount: 0,
description: 'Add at least 2 eligible items for BOGO'
};
}
const cheapest = [...applicableItems].sort((a, b) => a.price - b.price)[0];
return {
couponCode: coupon.code,
type: 'bogo',
discountAmount: cheapest.price,
description: `Free: ${cheapest.name} ($${cheapest.price})`
};
}
case 'free_shipping': {
return {
couponCode: coupon.code,
type: 'free_shipping',
discountAmount: shippingCost,
description: 'Free shipping'
};
}
}
}
Stacking Rules
Most flash sales allow limited stacking. The rules engine determines which combinations are valid.
| Rule | Description |
|---|---|
| Max stackable coupons | 2 (one discount coupon + one free shipping) |
| Percent + Fixed | NOT stackable (only one discount type) |
| Percent + Free Shipping | Stackable |
| Fixed + Free Shipping | Stackable |
| BOGO + anything | NOT stackable |
| Stacking priority | Free shipping applied last (after discount coupons) |
| Maximum total discount | Cannot exceed 60% of subtotal |
| Minimum final price | Order total must be >= $1.00 after all discounts |
Stacking Validation Engine
// stacking-rules.ts
interface StackingValidation {
valid: boolean;
reason?: string;
appliedCoupons: AppliedCoupon[];
totalDiscount: number;
finalTotal: number;
}
const DISCOUNT_TYPES = new Set(['percent_off', 'fixed_amount', 'bogo']);
const MAX_DISCOUNT_PERCENT = 0.60; // 60% max
const MIN_FINAL_PRICE = 1.00;
const MAX_STACKED_COUPONS = 2;
function validateStacking(
coupons: AppliedCoupon[],
subtotal: number,
shippingCost: number
): StackingValidation {
// Rule 1: Max stackable count
if (coupons.length > MAX_STACKED_COUPONS) {
return {
valid: false,
reason: `Maximum ${MAX_STACKED_COUPONS} coupons can be combined`,
appliedCoupons: [],
totalDiscount: 0,
finalTotal: subtotal + shippingCost
};
}
// Rule 2: Only one discount-type coupon allowed
const discountCoupons = coupons.filter(c => DISCOUNT_TYPES.has(c.type));
if (discountCoupons.length > 1) {
return {
valid: false,
reason: 'Only one discount coupon can be applied per order',
appliedCoupons: [],
totalDiscount: 0,
finalTotal: subtotal + shippingCost
};
}
// Rule 3: BOGO cannot stack with anything
const hasBogo = coupons.some(c => c.type === 'bogo');
if (hasBogo && coupons.length > 1) {
return {
valid: false,
reason: 'Buy One Get One coupons cannot be combined with other offers',
appliedCoupons: [],
totalDiscount: 0,
finalTotal: subtotal + shippingCost
};
}
// Rule 4: Check each coupon is individually stackable (except if only one)
if (coupons.length > 1) {
for (const coupon of coupons) {
if (!coupon.stackable) {
return {
valid: false,
reason: `Coupon ${coupon.code} cannot be combined with other offers`,
appliedCoupons: [],
totalDiscount: 0,
finalTotal: subtotal + shippingCost
};
}
}
}
// Sort by stacking priority (lower number = applied first)
const sorted = [...coupons].sort((a, b) => a.stackingPriority - b.stackingPriority);
// Calculate total discount, applying coupons in priority order
let totalDiscount = 0;
let remainingSubtotal = subtotal;
let remainingShipping = shippingCost;
for (const coupon of sorted) {
if (coupon.type === 'free_shipping') {
totalDiscount += remainingShipping;
remainingShipping = 0;
} else {
// calculateSingleDiscount (defined alongside calculateDiscount) applies
// one coupon's discount logic to the remaining subtotal, returning { amount }
const result = calculateSingleDiscount(coupon, remainingSubtotal);
totalDiscount += result.amount;
remainingSubtotal -= result.amount;
}
}
// Rule 5: Max discount percentage
const discountPercent = totalDiscount / (subtotal + shippingCost);
if (discountPercent > MAX_DISCOUNT_PERCENT) {
const cappedDiscount = Math.floor((subtotal + shippingCost) * MAX_DISCOUNT_PERCENT * 100) / 100;
return {
valid: true,
reason: `Discount capped at ${MAX_DISCOUNT_PERCENT * 100}% of order total`,
appliedCoupons: sorted,
totalDiscount: cappedDiscount,
finalTotal: (subtotal + shippingCost) - cappedDiscount
};
}
// Rule 6: Minimum final price
const finalTotal = (subtotal + shippingCost) - totalDiscount;
if (finalTotal < MIN_FINAL_PRICE) {
const adjustedDiscount = (subtotal + shippingCost) - MIN_FINAL_PRICE;
return {
valid: true,
reason: `Minimum order total is $${MIN_FINAL_PRICE}`,
appliedCoupons: sorted,
totalDiscount: adjustedDiscount,
finalTotal: MIN_FINAL_PRICE
};
}
return {
valid: true,
appliedCoupons: sorted,
totalDiscount,
finalTotal
};
}
Calculation Examples
Example 1: 15% off coupon on a $649.99 laptop (max cap $100)
| Step | Value |
|---|---|
| Subtotal | $649.99 |
| Discount (15% of $649.99) | $97.50 |
| Cap check ($97.50 < $100) | Under cap |
| Shipping | $9.99 |
| Total | $649.99 - $97.50 + $9.99 = $562.48 |
Example 2: 15% off coupon + Free shipping stacked
| Step | Value |
|---|---|
| Subtotal | $649.99 |
| Percent discount (applied first) | $97.50 |
| Free shipping discount (applied second) | $9.99 |
| Total discount | $107.49 |
| Discount % of order (107.49/659.98) | 16.3% (under 60% cap) |
| Final total | $649.99 - $97.50 + $0.00 = $552.49 |
Example 3: $25 fixed coupon on $30 subtotal
| Step | Value |
|---|---|
| Subtotal | $30.00 |
| Fixed discount | $25.00 |
| Shipping | $9.99 |
| Final total | $30.00 - $25.00 + $9.99 = $14.99 |
| Min price check ($14.99 >= $1.00) | Pass |
| Final total | $14.99 |
Example 4: Attempted BOGO + percent off (rejected)
| Step | Result |
|---|---|
| Coupons | BOGO + 15% off |
| Stacking check | BOGO present + count > 1 |
| Result | Rejected: BOGO cannot stack |
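The arithmetic in Examples 1 and 3 can be checked with a trimmed re-statement of the percent_off and fixed_amount branches shown earlier; these helpers are standalone for illustration, not part of the service code:

```typescript
// Trimmed versions of the percent_off and fixed_amount branches,
// just enough to reproduce the worked examples.
function percentOff(subtotal: number, pct: number, cap: number | null): number {
  let d = subtotal * (pct / 100);
  if (cap !== null) d = Math.min(d, cap);
  return Math.round(d * 100) / 100; // round to cents
}

function fixedOff(subtotal: number, amount: number): number {
  return Math.round(Math.min(amount, subtotal) * 100) / 100;
}

// Example 1: 15% off a $649.99 laptop with a $100 cap -> $97.50 discount
const d1 = percentOff(649.99, 15, 100);
const total1 = Math.round((649.99 - d1 + 9.99) * 100) / 100; // rounds to 562.48

// Example 3: $25 fixed coupon on a $30 subtotal -> full $25.00 applied
const d3 = fixedOff(30.0, 25);
const total3 = Math.round((30.0 - d3 + 9.99) * 100) / 100; // rounds to 14.99
```

Note the explicit cents rounding at each step: accumulating raw floating-point discounts across stacked coupons would otherwise drift by fractions of a cent.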
Coupon Type Decision Flowchart
Checkout Saga with Coupon Rollback
A single database transaction cannot span Valkey, a payment gateway, and PostgreSQL. These are separate systems with separate failure modes. This means partial failure is inevitable: inventory can be reserved in Valkey while the payment gateway is down. The system must explicitly undo completed steps when a later step fails. This is the saga pattern.
Problem: A user checks out with a reserved item and an applied coupon. The payment is declined. The inventory and coupon must be returned immediately -- otherwise they are permanently consumed by a failed order.
Simple example: A restaurant reservation with a deposit. If the customer cancels, the table is released and the deposit is refunded. If only the table is released but not the deposit, money is lost. If only the deposit is refunded but not the table, capacity is wasted.
Mental model: Each saga step is a domino. Push them forward for the happy path. If one falls sideways (fails), pick up the fallen dominoes in reverse order.
Saga Steps
| Step | Action | Compensating Action |
|---|---|---|
| 1 | Reserve inventory (Valkey DECRBY) | Release inventory (Valkey INCRBY) |
| 2 | Apply coupon (mark as 'applied') | Return coupon (mark as 'claimed', or return to pool) |
| 3 | Process payment (charge card) | Void/refund payment |
| 4 | Confirm order (PostgreSQL insert) | Cancel order (mark as 'cancelled') |
| 5 | Emit events (Kafka) | Emit compensation events |
Temporal Workflow
// checkout-workflow.ts (Temporal)
import { proxyActivities, sleep, ApplicationFailure } from '@temporalio/workflow';
interface CheckoutInput {
orderId: string;
userId: string;
saleId: string;
skuId: string;
quantity: number;
couponCode: string | null;
paymentToken: string;
paymentMethod: string;
}
interface CheckoutResult {
orderId: string;
status: 'confirmed' | 'failed';
failureReason?: string;
}
const inventory = proxyActivities<typeof import('./activities/inventory')>({
startToCloseTimeout: '10s',
retry: { maximumAttempts: 3 }
});
const coupons = proxyActivities<typeof import('./activities/coupons')>({
startToCloseTimeout: '10s',
retry: { maximumAttempts: 3 }
});
const payments = proxyActivities<typeof import('./activities/payments')>({
startToCloseTimeout: '30s', // payment gateways can be slow
retry: { maximumAttempts: 2 }
});
const orders = proxyActivities<typeof import('./activities/orders')>({
startToCloseTimeout: '10s',
retry: { maximumAttempts: 3 }
});
export async function checkoutWorkflow(input: CheckoutInput): Promise<CheckoutResult> {
let inventoryReserved = false;
let couponApplied = false;
let paymentId: string | null = null;
try {
// Step 1: Reserve Inventory
const reserveResult = await inventory.reserveInventory(
input.saleId, input.skuId, input.quantity
);
if (reserveResult.status !== 'success') {
return {
orderId: input.orderId,
status: 'failed',
failureReason: `Inventory unavailable: ${reserveResult.status}`
};
}
inventoryReserved = true;
// Step 2: Apply Coupon (if present)
if (input.couponCode) {
const couponResult = await coupons.applyCoupon(
input.userId, input.couponCode, input.orderId
);
if (couponResult.status !== 'success') {
throw ApplicationFailure.nonRetryable(
`Coupon application failed: ${couponResult.reason}`
);
}
couponApplied = true;
}
// Step 3: Process Payment
const paymentResult = await payments.processPayment({
orderId: input.orderId,
userId: input.userId,
amount: reserveResult.totalAmount,
couponDiscount: couponApplied ? reserveResult.couponDiscount : 0,
paymentToken: input.paymentToken,
paymentMethod: input.paymentMethod
});
if (paymentResult.status !== 'captured') {
throw ApplicationFailure.nonRetryable(
`Payment failed: ${paymentResult.reason}`
);
}
paymentId = paymentResult.paymentId;
// Step 4: Confirm Order
await orders.confirmOrder(input.orderId, paymentId);
// Step 5: Emit Success Events
await orders.emitOrderConfirmed(input.orderId);
return { orderId: input.orderId, status: 'confirmed' };
} catch (error: any) {
// COMPENSATING TRANSACTIONS (reverse order)
// Compensate Step 3: Void payment if charged
if (paymentId) {
try {
await payments.voidPayment(paymentId);
} catch (voidErr) {
// Log for manual intervention; do not throw
await orders.flagForManualReview(input.orderId, 'payment_void_failed');
}
}
// Compensate Step 2: Return coupon
if (couponApplied && input.couponCode) {
try {
await coupons.returnCoupon(input.userId, input.couponCode);
} catch (couponErr) {
await orders.flagForManualReview(input.orderId, 'coupon_return_failed');
}
}
// Compensate Step 1: Release inventory
if (inventoryReserved) {
try {
await inventory.releaseInventory(input.saleId, input.skuId, input.quantity);
} catch (invErr) {
await orders.flagForManualReview(input.orderId, 'inventory_release_failed');
}
}
// Mark order as failed
await orders.failOrder(input.orderId, error.message);
// Emit failure event
await orders.emitOrderFailed(input.orderId, error.message);
return {
orderId: input.orderId,
status: 'failed',
failureReason: error.message
};
}
}
Coupon Rollback Activity
// activities/coupons.ts
export async function returnCoupon(userId: string, couponCode: string): Promise<void> {
// Step 1: Get campaign info from the coupon code
const couponInfo = await db.query(
`SELECT cc.campaign_id, cc.id as code_id
FROM coupon_codes cc
WHERE cc.code = $1 AND cc.claimed_by = $2`,
[couponCode, userId]
);
if (couponInfo.rows.length === 0) {
throw new Error(`Coupon ${couponCode} not found for user ${userId}`);
}
const { campaign_id: campaignId, code_id: codeId } = couponInfo.rows[0];
// Step 2: Return code to Valkey pool
await valkey.evalsha(couponReturnSha, 3,
`coupon:pool:${campaignId}`,
`coupon:claimed:${campaignId}`,
`coupon:user:${userId}:${campaignId}`,
userId,
couponCode
);
// Step 3: Update PostgreSQL
await db.query('BEGIN');
try {
// Mark the claim as rolled back
await db.query(
`UPDATE coupon_claims
SET status = 'rolled_back', rolled_back_at = now()
WHERE user_id = $1 AND campaign_id = $2`,
[userId, campaignId]
);
// Mark the code as available again
await db.query(
`UPDATE coupon_codes
SET status = 'returned', claimed_by = NULL, claimed_at = NULL
WHERE id = $1`,
[codeId]
);
// Decrement campaign claimed count
await db.query(
`UPDATE coupon_campaigns
SET claimed_count = claimed_count - 1,
status = CASE WHEN status = 'exhausted' THEN 'active' ELSE status END
WHERE id = $1`,
[campaignId]
);
await db.query('COMMIT');
} catch (err) {
await db.query('ROLLBACK');
throw err;
}
console.log(`Coupon ${couponCode} returned to pool for campaign ${campaignId}`);
}
Why Temporal Over Manual Saga
| Aspect | Manual Saga (Event-Driven) | Temporal Workflow |
|---|---|---|
| Compensation logic | Scattered across event handlers | Co-located in try/catch |
| Retry policy | Custom per-service | Declarative per-activity |
| Visibility | Grep logs across services | Temporal UI shows workflow state |
| Stuck workflows | Manual detection | Automatic timeout + alerting |
| Replay/debug | Near impossible | Temporal replay from event history |
| Idempotency | Must implement manually | Built-in deduplication |
Bottlenecks and Mitigations
Bottleneck 1: Valkey Hot Key for Popular SKUs
Problem: A doorbuster item (e.g., $99 laptop, 50 units) will concentrate all reads and writes on a single Valkey key. In cluster mode, that key lives on one shard, creating asymmetric load.
Mitigation:
- Shard inventory across 16 virtual slots per hot SKU (see Atomic Inventory Control section)
- Use Valkey client-side caching for read-heavy inventory checks
- Pre-sort users by SKU interest during queue admission to spread load
Impact if unmitigated: Single shard saturates at 200K ops/sec. With 10M users all checking the same doorbuster, the system needs 50K ops/sec on one key. Manageable with a single shard, but combined with other operations, sharding provides safety margin.
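The slot-selection side of the sharding mitigation can be sketched briefly. Stock is pre-split across 16 keys, and each request decrements the slot chosen by hashing the user ID, so load spreads across cluster shards; the hash function and key format here are illustrative:

```typescript
// Pick one of 16 virtual inventory slots for a hot SKU.
// Total stock is pre-split across inv:{saleId}:{skuId}:{0..15}.
const SLOT_COUNT = 16;

function slotFor(userId: string): number {
  // Simple stable string hash; any well-distributed hash works
  let h = 0;
  for (const ch of userId) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h % SLOT_COUNT;
}

function slotKey(saleId: string, skuId: string, userId: string): string {
  return `inv:${saleId}:${skuId}:${slotFor(userId)}`;
}
```

If a request's slot is empty while others still have stock, the client retries against a neighboring slot before reporting sold-out; that rebalancing logic lives in the Atomic Inventory Control section.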
Bottleneck 2: Payment Gateway Latency
Problem: Payment gateways (Stripe, Adyen) have p99 latencies of 2-5 seconds. During a flash sale with 833 checkouts/sec, long-running payment calls consume thread pool capacity.
Mitigation:
- Async payment processing via Temporal (non-blocking)
- Payment timeout of 30 seconds with automatic retry (once)
- Pre-authorize cards before sale starts (for registered users)
- Circuit breaker on payment gateway calls (trip at 20% error rate)
Impact if unmitigated: 833 concurrent payment calls x 3 seconds average = 2,500 in-flight requests. Without async processing, this exhausts connection pools and cascades failures to inventory and coupon services.
Bottleneck 3: Coupon Pool Exhaustion Race
Problem: When the coupon pool is nearly empty (last 100 codes), thousands of users race to claim simultaneously. The LPOP is atomic, but the losers still consume a round-trip to Valkey.
Mitigation:
- Track pool size with LLEN and expose a "low inventory" indicator in the UI
- When the pool drops below 1% remaining, add a client-side "lottery" step: only 1 in 10 requests actually calls the claim endpoint
- Pre-announce "coupons limited" messaging to set expectations
- Return a waitlist option when pool is exhausted
Impact if unmitigated: 100K users hitting the claim endpoint for the last 50 codes generates 100K unnecessary Valkey round-trips, adding latency to other operations on the same shard.
Component-Level Failure Modeling
Each bottleneck above addresses performance limits. Failure modeling addresses what happens when components stop working entirely. Every component in this system can fail, and each failure has a different blast radius.
Scenario 1: Payment Timeout with Inventory Reserved
Situation: A payment call takes 45 seconds (exceeding the 30-second timeout). Inventory is reserved but not confirmed. The Temporal workflow times out the payment activity.
Handling:
- Temporal retries the payment check using the idempotency key
- If payment was actually captured, the workflow proceeds to confirm
- If payment was not captured, full compensation runs
- Inventory hold expires after 10 minutes regardless (safety net)
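The payment-status recheck in step 1 can be sketched as follows. The retry never re-charges; it looks up the original attempt by its idempotency key and branches on the result. The `gateway` client here is a hypothetical in-memory stub; real gateways (Stripe-style) expose an equivalent lookup:

```typescript
// Post-timeout resolution: query by idempotency key, never re-charge.
type PaymentStatus = 'captured' | 'failed' | 'not_found';

// Hypothetical gateway client, stubbed in-memory for illustration
const gateway = {
  records: new Map<string, PaymentStatus>(),
  lookup(idempotencyKey: string): PaymentStatus {
    return this.records.get(idempotencyKey) ?? 'not_found';
  }
};

function resolveTimedOutPayment(idempotencyKey: string): 'confirm' | 'compensate' {
  // Captured despite the timeout: proceed to order confirmation.
  // Failed or never received: run the full compensation path.
  return gateway.lookup(idempotencyKey) === 'captured' ? 'confirm' : 'compensate';
}
```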
Scenario 2: Valkey Shard Failure During Sale
Situation: The Valkey shard holding inventory keys for 30% of SKUs goes down. The replica promotes, but there is a 5-second gap.
Handling:
| Time | Event | System Response |
|---|---|---|
| T+0s | Primary shard fails | Valkey Sentinel detects failure |
| T+1s | Sentinel starts election | Checkout requests for affected SKUs get connection errors |
| T+2s | Replica promoted | Write availability restored |
| T+3s | Clients reconnect | Inventory ops resume |
| T+0 to T+3s | Checkout requests fail | Checkout service returns 503 with "Retry-After: 5" header |
| T+3s+ | Queue admission paused | Admission controller detects error spike, pauses admission |
| T+5s | Health check passes | Admission resumes at 50% rate, ramps back to 100% |
Data consistency check:
- Valkey replication is asynchronous. The promoted replica might be 1-2 ops behind.
- Post-failover reconciliation job compares Valkey inventory counts with PostgreSQL ledger
- Any mismatch triggers an alert and manual review before resuming sales
Scenario 3: Coupon Service Down Mid-Checkout
Situation: The Coupon Service crashes after inventory is reserved but before the coupon is applied. Temporal retries the coupon application activity.
Handling:
Key point: The user does not lose their coupon. If the coupon was already applied in Valkey before the crash, the retry will see "already applied" and proceed. If the coupon was never applied, the retry will apply it fresh. The combination of idempotent operations and Temporal's retry logic handles this gracefully.
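The safety of that retry rests on the claimed-to-applied transition being idempotent per order. A minimal sketch, modeling Valkey state with an in-memory map (names illustrative):

```typescript
// Idempotent coupon application: retrying for the same order is a no-op.
type ClaimRecord = { state: 'claimed' | 'applied'; orderId: string | null };
const claims = new Map<string, ClaimRecord>();

function applyCoupon(
  userId: string,
  code: string,
  orderId: string
): 'applied' | 'already_applied' | 'not_claimed' | 'conflict' {
  const claim = claims.get(`${userId}:${code}`);
  if (!claim) return 'not_claimed';
  if (claim.state === 'applied') {
    // A crash-retry lands here: re-applying for the same order succeeds
    // as a no-op; a different order asking for the coupon is a conflict
    return claim.orderId === orderId ? 'already_applied' : 'conflict';
  }
  claim.state = 'applied';
  claim.orderId = orderId;
  return 'applied';
}
```

Temporal treats both 'applied' and 'already_applied' as activity success, so the workflow proceeds identically whether or not the first attempt reached Valkey before the crash.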
Scenario 4: PostgreSQL Primary Failure
Situation: The PostgreSQL primary fails during an active sale. Failover to a read replica takes approximately 30 seconds.
Impact: Checkout writes fail for 30 seconds. Order creation and coupon claim persistence cannot proceed. However, Valkey continues serving inventory atomics and coupon claims at the Valkey layer -- the fast path remains operational.
Fallback: Orders queue in Kafka. The Temporal workflows pause at the "Confirm Order" step, holding inventory and coupon reservations. Once the replica is promoted and writes resume, Temporal retries the order confirmation activity.
Recovery: Promote the read replica to primary. Replay any Kafka events that were produced during the outage window but not yet consumed by the inventory-sync consumer.
Tradeoff: Up to 30 seconds of order creation failures. Users see "Order processing" status rather than immediate confirmation. No data is lost because Temporal persists workflow state and Kafka retains events.
Scenario 5: Kafka Broker Failure
Situation: One of three Kafka brokers goes down during the sale.
Impact: Partition leadership transfers in ~5 seconds. During the transfer, events targeting partitions led by the failed broker experience a latency spike. No data is lost because the replication factor is 3 and the ISR minimum is 2.
Recovery: Automatic leader election completes in ~5 seconds. Consumer groups rebalance, briefly pausing consumption. The inventory-sync consumer resumes from its committed offset.
Tradeoff: Brief consumer rebalance causes a ~5 second latency spike for inventory sync and analytics. Order processing via Temporal is unaffected because Temporal has its own persistence layer.
Scenario 6: CDN Failure
Situation: CloudFront experiences a regional outage. The origin receives the full 10M user load.
Impact: The origin servers are overloaded within seconds. The API Gateway, Valkey, and PostgreSQL are overwhelmed by traffic that should have been served from CDN.
Fallback: DNS failover to a secondary CDN provider (Fastly) triggers within 60 seconds based on health check failure.
Recovery: CDN self-heals or manual cache purge restores service. The secondary CDN serves from its own edge caches, which were warm because of synthetic traffic pre-warming.
Tradeoff: A 60-second DNS failover window where the sale is effectively down. Mitigation: use dual-CDN configuration with active-active health checks and shorter DNS TTLs (30s) during sale windows.
Scenario 7: API Gateway Failure
Situation: An API Gateway pod crashes during the sale.
Impact: Kubernetes restarts the pod in ~10 seconds. In-flight requests on that pod receive connection resets. Other gateway pods continue serving traffic via the load balancer.
Recovery: Automatic. The load balancer detects the failed health check and routes traffic to healthy pods. The replacement pod starts serving within 10 seconds.
Tradeoff: Brief connection reset for in-flight requests routed to the failed pod. Clients should retry with exponential backoff.
Scenario 8: Temporal Failure
Situation: The Temporal server goes down during an active sale with in-flight checkout workflows.
Impact: Saga workflows pause. Inventory and coupons remain reserved but unconfirmed. No new checkouts can start.
Recovery: Temporal has persistent storage (PostgreSQL or Cassandra). On restart, all in-flight workflows resume from their last checkpoint. No workflow step is repeated because Temporal's event history tracks which activities completed.
Tradeoff: Checkout latency spike during the outage. Users see "Order processing" for an extended period. No data loss or double-processing occurs.
Scenario 9: Payment Gateway Timeout
Situation: The payment gateway starts timing out consistently.
Impact: Circuit breaker opens after 5 consecutive timeouts. The checkout service stops sending new payment requests.
Fallback: Return "payment pending" status to users. Enqueue payment retries via a background job that probes the gateway every 30 seconds.
Recovery: Close the circuit breaker after 3 successful probe responses. Resume normal payment processing.
Tradeoff: Delayed order confirmation. Users experience anxiety during the "payment pending" state. Inventory remains reserved (10-minute timeout protects against permanent lock).
Deployment and Operations
Pre-Sale Preparation (T-24 Hours)
| Task | Details |
|---|---|
| Load test | Simulate 10M users with k6/Gatling hitting queue join, coupon claim, and checkout |
| CDN cache warming | Deploy sale page, waiting room, and all static assets to CDN edge nodes |
| Valkey pre-loading | Load inventory counters and coupon pools into Valkey |
| Database vacuuming | Run VACUUM ANALYZE on all tables involved in the sale |
| Connection pool warming | Pre-establish DB and Valkey connection pools |
| Runbook review | Ensure on-call team has step-by-step for all failure scenarios |
Deployment Architecture
Auto-Scaling Configuration
# HPA for checkout service
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: checkout-service-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: checkout-service
minReplicas: 10 # Pre-scaled for sale
maxReplicas: 50
metrics:
- type: Pods
pods:
metric:
name: http_requests_per_second
target:
type: AverageValue
averageValue: "100" # Scale up when > 100 req/sec per pod
- type: Pods
pods:
metric:
name: temporal_workflow_queue_depth
target:
type: AverageValue
averageValue: "50" # Scale up when workflow queue backs up
behavior:
scaleUp:
stabilizationWindowSeconds: 30
policies:
- type: Pods
value: 10
periodSeconds: 60
scaleDown:
stabilizationWindowSeconds: 300 # Wait 5 min before scaling down
Rollback Plan
| Trigger | Action |
|---|---|
| Checkout error rate > 5% for 2 min | Pause queue admission, investigate |
| Checkout error rate > 20% for 1 min | Pause sale, show maintenance page |
| Valkey cluster unhealthy | Failover to replica, pause admission during promotion |
| Payment gateway down | Queue orders for retry, show "order processing" status |
| Database CPU > 90% | Enable read replicas for all read queries, reject new checkouts |
Feature Flags
Feature flags allow runtime configuration changes without redeployment. During an active sale, toggling a flag takes effect within seconds via the configuration service.
| Flag | Description | Default |
|---|---|---|
| sale_queue_enabled | Disable the virtual queue for small sales with < 10K expected users | true |
| coupon_stacking_enabled | Toggle stacking rules; when disabled, only one coupon per order | true |
| sharded_inventory | Enable/disable 16-slot inventory sharding for doorbuster SKUs | true |
| payment_circuit_breaker_threshold | Number of consecutive timeouts before the circuit breaker opens | 5 |
Database Migrations
Pre-sale (T-48h): All schema migrations run 48 hours before the sale. Migrations are tested against a production-snapshot database to verify execution time and lock duration.
Zero-downtime approach: Nullable columns are added first, code is deployed to write to both old and new columns, backfill runs as a background job, then the NOT NULL constraint is added after all rows have values.
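The four phases can be sketched in SQL; table and column names here are illustrative, not the actual schema:

```sql
-- Phase 1 (T-48h): add the column nullable -- metadata-only, no table rewrite
ALTER TABLE orders ADD COLUMN fulfillment_region text;

-- Phase 2: deploy application code that writes both old and new columns
-- (no SQL; this is a code release)

-- Phase 3: backfill in small batches from a background job,
-- keeping each transaction short to avoid long row locks
UPDATE orders SET fulfillment_region = 'default'
WHERE id IN (
  SELECT id FROM orders WHERE fulfillment_region IS NULL LIMIT 10000
);

-- Phase 4: enforce the constraint only after every row has a value
ALTER TABLE orders ALTER COLUMN fulfillment_region SET NOT NULL;
```

The ordering matters: adding NOT NULL before the backfill completes would block on a full-table validation scan.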
coupon_claims table: Uses online DDL via pg_repack for structural changes. The UNIQUE(user_id, campaign_id) constraint is never modified during a sale.
Post-sale: Completed sale partitions in the orders table are detached from the partitioned table (sub-second exclusive lock) and moved to cold storage. This keeps the hot partition set small for the next sale.
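The detach itself is a single statement; the partition name is illustrative. On PostgreSQL 14+, the CONCURRENTLY variant avoids even the brief exclusive lock:

```sql
-- Detach a completed sale partition before archiving it to cold storage
ALTER TABLE orders DETACH PARTITION orders_sale_2026_07 CONCURRENTLY;
```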
Rollback
Code rollback: kubectl rollout undo deployment/checkout-service completes in under 30 seconds. Kubernetes performs a rolling update, draining connections from old pods before terminating them.
During active sale: Rollback requires a sale pause. The admission controller stops new checkouts, in-flight Temporal sagas complete (or compensate), then the rollback proceeds. This prevents half-old/half-new code from processing the same orders.
Valkey state is not rolled back. Inventory counts, coupon pool state, and queue positions are forward-only. If a code change corrupted Valkey state, the fix is forward -- a reconciliation job compares Valkey with PostgreSQL and adjusts counts. Rolling back Valkey to a previous point-in-time risks overselling or double-claiming.
Observability
The deployment strategy gets code into production. Observability determines whether the running system is behaving correctly under sale load.
Sale-Specific Dashboards
| Dashboard | Key Metrics |
|---|---|
| Sale Overview | Orders/min, revenue/min, unique buyers, conversion rate |
| Inventory Tracker | Units remaining per SKU, sold-out SKUs, reserve-to-confirm ratio, depletion rate per SKU (%/min) |
| Coupon Dashboard | Claims/min, pool remaining per campaign, claim success %, redemption rate, rollback count |
| Queue Monitor | Queue depth, admission rate, queue drain rate, avg wait time, dropout rate |
| Payment Health | Success rate, avg latency, timeout count, gateway error breakdown |
| Infrastructure | Valkey ops/sec, DB connections, Kafka consumer lag, pod count, GC pause duration (p99) |
Critical Alerts
# Prometheus alerting rules
groups:
  - name: flash-sale-alerts
    rules:
      - alert: InventoryOversold
        expr: |
          sale_inventory_sold_total > sale_inventory_total_quantity
        for: 0s  # Immediate alert
        labels:
          severity: critical
        annotations:
          summary: "CRITICAL: Inventory oversold for SKU {{ $labels.sku_id }}"
      - alert: CouponDoubleClaimDetected
        expr: |
          rate(coupon_claim_duplicate_total[1m]) > 0
        for: 0s
        labels:
          severity: critical
        annotations:
          summary: "Duplicate coupon claim detected for campaign {{ $labels.campaign_id }}"
      - alert: CheckoutErrorRateHigh
        expr: |
          rate(checkout_errors_total[2m]) / rate(checkout_requests_total[2m]) > 0.05
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Checkout error rate above 5%"
      - alert: ValkeyLatencyHigh
        expr: |
          histogram_quantile(0.99, rate(valkey_command_duration_seconds_bucket[1m])) > 0.01
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Valkey p99 latency above 10ms"
      - alert: CouponPoolNearlyExhausted
        expr: |
          coupon_pool_remaining / coupon_pool_total < 0.05
        for: 0s
        labels:
          severity: info
        annotations:
          summary: "Coupon pool {{ $labels.campaign_id }} below 5% remaining"
      - alert: QueueWaitTimeExcessive
        expr: |
          queue_estimated_wait_seconds_p99 > 300
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Queue wait time p99 exceeds 5 minutes"
      - alert: PaymentSagaStuck
        expr: |
          temporal_workflow_running_duration_seconds > 120
        for: 0s
        labels:
          severity: critical
        annotations:
          summary: "Checkout workflow running for over 2 minutes"
      - alert: KafkaConsumerLagHigh
        expr: |
          kafka_consumer_group_lag > 10000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Kafka consumer lag above 10K for group {{ $labels.consumer_group }}"
Structured Logging
Every service emits structured JSON logs with correlation IDs:
{
"timestamp": "2026-07-01T12:05:30.123Z",
"level": "INFO",
"service": "checkout-service",
"trace_id": "abc123def456",
"user_id": "user_789",
"order_id": "ord_def456",
"sale_id": "sale_abc123",
"sku_id": "SKU-LAPTOP-001",
"action": "inventory_reserved",
"duration_ms": 3,
"valkey_ops": 1,
"inventory_remaining": 42
}
Post-Sale Reconciliation
After the sale ends, a reconciliation job runs:
// reconciliation.ts
async function reconcileInventory(saleId: string): Promise<ReconciliationReport> {
const skus = await db.query(
`SELECT sku_id, total_quantity, sold_quantity, reserved_quantity
FROM sale_inventory WHERE sale_id = $1`,
[saleId]
);
const report: ReconciliationReport = { mismatches: [], total: 0 };
for (const sku of skus.rows) {
const valkeyAvail = parseInt(
await valkey.get(`inv:${saleId}:${sku.sku_id}`) || '0'
);
const valkeyReserved = parseInt(
await valkey.get(`inv:reserved:${saleId}:${sku.sku_id}`) || '0'
);
const expectedAvail = sku.total_quantity - sku.sold_quantity - sku.reserved_quantity;
if (valkeyAvail !== expectedAvail) {
report.mismatches.push({
skuId: sku.sku_id,
valkeyAvailable: valkeyAvail,
dbExpectedAvailable: expectedAvail,
valkeyReserved,
dbReserved: sku.reserved_quantity,
delta: valkeyAvail - expectedAvail
});
}
report.total++;
}
if (report.mismatches.length > 0) {
await alertOncall('INVENTORY_MISMATCH', report);
}
return report;
}
Distributed Tracing
OpenTelemetry traces span the entire checkout path: queue-join, admission, checkout, inventory decrement, coupon claim, payment, and confirmation. Each trace captures the full saga lifecycle.
Key spans and latency budgets:
| Span | p99 Target | Notes |
|---|---|---|
| lua_inventory_decrement | < 2ms | Valkey Lua eval; any spike indicates shard contention |
| coupon_claim | < 5ms | Valkey Lua eval + SET NX; higher than inventory due to two-step check |
| payment_call | < 3s | Payment gateway round-trip; high variance expected |
| saga_total | < 5s | End-to-end Temporal workflow; dominated by payment latency |
Trace sampling: 100% during the sale window. Flash sales are short-duration, high-value events -- every request matters for debugging and post-sale analysis. Post-sale, sampling drops to 1% for background reconciliation and analytics traffic.
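The window-based sampling policy can be expressed as a pure function. A minimal sketch: the 100%/1% rates come from the text above, while the hash-based per-trace decision and all names (`traceSampleRate`, `shouldSample`) are illustrative stand-ins for what an OpenTelemetry custom sampler would do, not the production implementation.

```typescript
// sampling.ts -- illustrative sketch, not the production sampler
interface SaleWindow {
  startMs: number;
  endMs: number;
}

// 100% inside the sale window, 1% for post-sale background traffic
function traceSampleRate(nowMs: number, sale: SaleWindow): number {
  const inSale = nowMs >= sale.startMs && nowMs <= sale.endMs;
  return inSale ? 1.0 : 0.01;
}

// Deterministic per-trace decision so every service in the request
// path makes the same choice: hash the trace ID into [0, 1) and
// compare against the rate.
function shouldSample(traceId: string, rate: number): boolean {
  if (rate >= 1) return true;
  if (rate <= 0) return false;
  let hash = 0;
  for (const ch of traceId) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return hash / 0xffffffff < rate;
}
```

Keying the decision on the trace ID rather than a per-span coin flip is what keeps traces whole: either every span of a checkout is recorded or none is.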
SLI/SLO
| SLI | SLO | Measurement |
|---|---|---|
| Checkout p99 latency | < 5s | Temporal workflow duration histogram |
| Inventory accuracy | 0 oversells | Valkey vs PostgreSQL reconciliation delta |
| Coupon uniqueness | 0 double-claims | PostgreSQL UNIQUE constraint violation count |
| Queue fairness | FIFO within 1% deviation | Queue position audit: compare actual admission order with join timestamp order |
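The queue-fairness measurement can be sketched as a small audit over (join timestamp, admission order) pairs. This is a hypothetical illustration -- the field names (`joinTs`, `admissionSeq`) and the adjacent-pair deviation metric are assumptions, not the production audit:

```typescript
// fairness-audit.ts -- illustrative FIFO audit sketch
interface Admission {
  userId: string;
  joinTs: number;       // when the user joined the queue
  admissionSeq: number; // order in which the user was actually admitted
}

// Fraction of adjacent admissions that violate FIFO: sort by actual
// admission order, then count pairs where a later-joining user was
// admitted before an earlier one. The SLO allows at most 1% deviation.
function fifoDeviation(admissions: Admission[]): number {
  const byAdmission = [...admissions].sort(
    (a, b) => a.admissionSeq - b.admissionSeq
  );
  let inversions = 0;
  for (let i = 1; i < byAdmission.length; i++) {
    if (byAdmission[i].joinTs < byAdmission[i - 1].joinTs) inversions++;
  }
  return byAdmission.length > 1 ? inversions / (byAdmission.length - 1) : 0;
}
```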
Runbooks
inventory_count_mismatch: Pause admission. Run the reconciliation job manually. Compare the Valkey DECR log (from structured logging) with the PostgreSQL inventory ledger. Identify the divergence point. Manually adjust Valkey counts to match PostgreSQL if PostgreSQL is correct (it has the durable record). Resume admission at 50% rate, monitoring for further divergence.
coupon_pool_exhausted: Verify the pool size matches the expected campaign configuration. Check for leaked codes -- codes that were claimed in Valkey (SET NX succeeded) but never written to PostgreSQL (the INSERT failed or was dropped). Leaked codes reduce the effective pool size. Alert the product team to decide whether to generate additional codes.
payment_gateway_circuit_open: Check the payment gateway's status page. Verify the circuit breaker configuration (threshold, probe interval). If the gateway reports healthy, the issue may be network-related -- check DNS resolution and TLS handshake latency. If sustained for more than 5 minutes, escalate to the payment team and consider enabling a backup payment processor.
Security
Bot Prevention
Flash sales attract bots. Scalpers use automated tools to buy inventory before humans can react. A multi-layered defense is required.
| Layer | Technique | Implementation |
|---|---|---|
| CDN Edge | Rate limiting | 10 requests/sec per IP; burst of 20 |
| CDN Edge | Geo-blocking | Block traffic from non-serviceable regions |
| WAF | Known bot signatures | Block Selenium, Puppeteer, PhantomJS UA strings |
| Queue Join | CAPTCHA challenge | hCaptcha or Cloudflare Turnstile before queue entry |
| Queue Join | Proof of Work | Client-side computation challenge (100ms solve time) |
| Checkout | Device fingerprint | FingerprintJS to identify same device across sessions |
| Checkout | Behavioral analysis | Mouse movement, scroll patterns, typing cadence |
| Post-Purchase | Velocity checks | Flag users who checkout in < 3 seconds after admission |
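The proof-of-work row can be sketched server-side. A minimal illustration, assuming a SHA-256 leading-zero-bits scheme -- the challenge format, difficulty, and function names are assumptions chosen so that a modest difficulty costs the client roughly the ~100ms solve time mentioned above:

```typescript
// pow-verify.ts -- illustrative proof-of-work check (assumed scheme)
import { createHash } from 'node:crypto';

// The client must find a nonce such that SHA-256(challenge + nonce)
// starts with `difficultyBits` zero bits.
function verifyPow(challenge: string, nonce: string, difficultyBits: number): boolean {
  const digest = createHash('sha256').update(challenge + nonce).digest();
  let remaining = difficultyBits;
  for (const byte of digest) {
    if (remaining >= 8) {
      if (byte !== 0) return false; // full byte must be zero
      remaining -= 8;
    } else if (remaining > 0) {
      // only the top `remaining` bits of this byte must be zero
      if (byte >> (8 - remaining) !== 0) return false;
      remaining = 0;
    } else break;
  }
  return true;
}

// What the browser would run: brute-force nonces until one verifies.
// Expected work doubles with each added difficulty bit.
function solvePow(challenge: string, difficultyBits: number): string {
  for (let nonce = 0; ; nonce++) {
    if (verifyPow(challenge, String(nonce), difficultyBits)) return String(nonce);
  }
}
```

The asymmetry is the point: verification is one hash for the server, while solving costs the client hundreds to thousands of hashes, which throttles scripted mass joins without harming a single human user.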
CAPTCHA Integration
// queue-join-handler.ts
async function handleQueueJoin(req: Request): Promise<Response> {
const { saleId, captchaToken } = req.body;
const userId = req.user.id;
// Step 1: Verify CAPTCHA
const captchaValid = await verifyCaptcha(captchaToken);
if (!captchaValid) {
return new Response(JSON.stringify({
error: 'CAPTCHA_FAILED',
message: 'Please complete the CAPTCHA challenge'
}), { status: 403 });
}
  // Step 2: Claim device fingerprint atomically.
  // SET NX closes the race where two tabs both pass a GET check
  // before either one writes the marker.
  const fingerprint = req.headers.get('x-device-fingerprint');
  const claimed = await valkey.set(
    `queue:device:${fingerprint}:${saleId}`, userId, 'EX', 7200, 'NX'
  );
  if (!claimed) {
    return new Response(JSON.stringify({
      error: 'DUPLICATE_DEVICE',
      message: 'This device is already in the queue'
    }), { status: 409 });
  }
  // Step 3: Rate limit by IP (fixed 60-second window)
  const ip = req.headers.get('x-forwarded-for');
  const ipCount = await valkey.incr(`queue:ip:${ip}:${saleId}`);
  if (ipCount === 1) {
    // Set the TTL only on the first hit; resetting it on every request
    // would keep a busy IP's window open indefinitely
    await valkey.expire(`queue:ip:${ip}:${saleId}`, 60);
  }
  if (ipCount > 5) {
    return new Response(JSON.stringify({
      error: 'RATE_LIMITED',
      message: 'Too many queue join attempts from this IP'
    }), { status: 429 });
  }
  // Step 4: Join queue (device marker was set atomically in Step 2)
  const result = await joinQueue(saleId, userId);
return new Response(JSON.stringify(result), { status: 200 });
}
Coupon Abuse Detection
| Pattern | Detection | Response |
|---|---|---|
| Same user, multiple accounts | Device fingerprint + IP correlation | Flag accounts, require manual verification |
| Coupon code sharing | Track claim-to-redemption time (< 5s = suspicious) | Invalidate shared codes |
| Automated claiming | Request timing analysis (consistent sub-100ms intervals) | Temporary ban + CAPTCHA |
| Coupon farming | Multiple claims across campaigns from same device | Limit to 3 campaigns per device |
| Resale detection | Same shipping address across multiple user accounts | Flag for review |
Abuse Scoring Engine
// abuse-scorer.ts
interface AbuseSignals {
captchaScore: number; // 0.0 to 1.0 (Turnstile risk score)
deviceAge: number; // seconds since first seen
accountAge: number; // seconds since registration
requestInterval: number; // ms between consecutive requests
ipSharedAccounts: number; // number of accounts from same IP
claimVelocity: number; // coupons claimed per minute
fingerprintSharedAccounts: number;
}
function calculateAbuseScore(signals: AbuseSignals): number {
let score = 0;
if (signals.captchaScore < 0.3) score += 30;
if (signals.deviceAge < 300) score += 15; // device seen < 5 min ago
if (signals.accountAge < 86400) score += 20; // account < 1 day old
if (signals.requestInterval < 100) score += 25; // sub-100ms requests
if (signals.ipSharedAccounts > 3) score += 20; // many accounts from same IP
if (signals.claimVelocity > 2) score += 25; // > 2 coupons/min
if (signals.fingerprintSharedAccounts > 2) score += 30;
return Math.min(100, score);
}
// Score thresholds
// 0-30: Normal user, proceed
// 31-60: Suspicious, require additional CAPTCHA
// 61-80: High risk, delay queue admission by 30 seconds
// 81-100: Likely bot, block and log for review
Security Headers and API Protection
| Protection | Implementation |
|---|---|
| API authentication | JWT with short expiry (15 min during sale) |
| Request signing | HMAC signature on checkout requests to prevent tampering |
| CORS | Strict origin whitelist (sale domain only) |
| Content Security Policy | Prevent XSS that could steal coupons |
| Rate limiting | Per-user, per-IP, and per-endpoint limits |
| Input validation | Strict schema validation on all API inputs |
| Idempotency keys | Required on checkout and coupon claim endpoints |
// Idempotency middleware
async function idempotencyMiddleware(req: Request, res: Response, next: Function) {
  const idempotencyKey = req.headers['idempotency-key'] as string | undefined;
if (!idempotencyKey) {
return res.status(400).json({ error: 'Idempotency-Key header required' });
}
const cacheKey = `idempotency:${req.user.id}:${idempotencyKey}`;
const cached = await valkey.get(cacheKey);
  if (cached === 'processing') {
    // Another request with the same key is still in flight
    return res.status(409).json({ error: 'Request already in progress' });
  }
  if (cached) {
    // Replay the cached response (JSON.parse on the raw 'processing'
    // sentinel would throw, so that case is handled above)
    const cachedResponse = JSON.parse(cached);
    return res.status(cachedResponse.status).json(cachedResponse.body);
  }
// Store a lock while processing
const locked = await valkey.set(cacheKey, 'processing', 'NX', 'EX', 300);
if (!locked) {
// Another request with the same key is in progress
return res.status(409).json({ error: 'Request already in progress' });
}
// Override res.json to cache the response
const originalJson = res.json.bind(res);
res.json = (body: any) => {
valkey.set(cacheKey, JSON.stringify({ status: res.statusCode, body }), 'EX', 300);
return originalJson(body);
};
next();
}
Encryption
All client-to-server and inter-service traffic uses TLS 1.3. PostgreSQL data is encrypted at rest with AES-256-GCM. Valkey requires AUTH and TLS for all connections -- unencrypted Valkey traffic on the internal network is disabled to prevent sniffing of inventory state or coupon codes.
Payment data is encrypted with PCI-DSS-compliant key management via AWS KMS. Payment tokens are never stored in the application database -- only the payment gateway's tokenized reference (payment_id) is persisted. Coupon codes are hashed in logs using SHA-256 to prevent leakage through log aggregation systems.
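The "hash coupon codes in logs" rule can be sketched as a small redaction helper. A minimal illustration -- the function names and truncation length are assumptions, not the production logging layer:

```typescript
// log-redaction.ts -- illustrative sketch of coupon hashing in logs
import { createHash } from 'node:crypto';

// Replace the raw coupon code with a truncated SHA-256 digest so log
// lines stay correlatable (same code -> same hash) without leaking a
// redeemable value into the log aggregation pipeline.
function redactCouponCode(code: string): string {
  const digest = createHash('sha256').update(code).digest('hex');
  return `coupon_sha256:${digest.slice(0, 16)}`;
}

function logCouponClaim(userId: string, code: string): Record<string, string> {
  return {
    level: 'INFO',
    action: 'coupon_claimed',
    user_id: userId,
    coupon_hash: redactCouponCode(code), // never the raw code
  };
}
```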
PII Handling
User email and payment details are personally identifiable information (PII). These fields are stored only in PostgreSQL with column-level encryption. PII is never cached in Valkey (user identifiers in Valkey are UUIDs, not emails) and never included in log entries.
GDPR right-to-erasure: A cascade delete on user_id removes all associated records across the orders and coupon_claims tables. The deletion is logged for audit but the PII content is irrecoverable after deletion.
Data retention: Order and claim records are retained for 2 years to satisfy financial compliance requirements. After 2 years, a batch job purges records and associated PII. Anonymized aggregate data (sale totals, conversion rates) is retained indefinitely for analytics.
Testing and Validation
Security protects the system from external threats. Testing validates that the system behaves correctly under the specific conditions of a flash sale -- extreme concurrency, partial failures, and edge cases that only appear at scale.
Load Testing
k6 scripts simulate the full sale lifecycle: 10M virtual queue joins, 500K coupon claims, and 50K checkouts/min. Pass criteria: p99 checkout latency < 5s, 0 oversells, 0 double-claims. Load tests run T-48h before every sale using production-scale Valkey and PostgreSQL instances. The test environment mirrors production topology -- same number of Valkey shards, same PostgreSQL instance type, same Kafka partition count.
Load test scenarios:
| Scenario | Target | Pass Criteria |
|---|---|---|
| Queue flood | 10M joins in 60s | All joins acknowledged, queue depth = 10M |
| Sustained checkout | 50K orders/min for 30 min | p99 < 5s, 0 oversells |
| Coupon claim burst | 100K claims in 10s | 0 double-claims, pool count accurate |
| Doorbuster contention | 50K users targeting 1 SKU (50 units) | Exactly 50 sold, 0 oversells |
| Graceful degradation | Kill payment gateway mid-test | Queue pauses, in-flight sagas compensate |
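The doorbuster pass criterion ("exactly 50 sold") can be asserted in a unit-style harness. This is an in-memory stand-in for the Valkey Lua check-and-decrement -- trivially atomic on a single JS thread -- not the production script; the class and function names are illustrative:

```typescript
// doorbuster-sim.ts -- in-memory model of the atomic decrement,
// used to assert the "exactly N sold, 0 oversells" invariant
class AtomicInventory {
  constructor(private available: number) {}

  // Mirrors the Lua script's semantics: the availability check and
  // the decrement happen as one indivisible step.
  tryReserve(): boolean {
    if (this.available <= 0) return false;
    this.available -= 1;
    return true;
  }

  remaining(): number {
    return this.available;
  }
}

// All users attempt a reservation; only `units` can succeed.
async function simulateDoorbuster(users: number, units: number): Promise<number> {
  const inventory = new AtomicInventory(units);
  const attempts = Array.from({ length: users }, async () => inventory.tryReserve());
  const results = await Promise.all(attempts);
  return results.filter(Boolean).length; // count of successful reservations
}
```

The real load test exercises the same invariant across many processes against Valkey; this sketch only demonstrates the shape of the assertion.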
Chaos Engineering
LitmusChaos scenarios run in the staging environment T-72h before the sale:
Kill Valkey primary during sale: Verify replica promotion completes in < 5 seconds. Admission controller pauses during the failover window. Post-promotion reconciliation confirms 0 oversells. The test fails if any inventory count diverges between Valkey and PostgreSQL.
Kill Temporal worker: Verify in-flight saga workflows pause (not fail). On worker restart, workflows resume from the last completed activity. No inventory or coupon leaks.
Partition Kafka broker: Simulate a network partition isolating one of three brokers. Verify no event loss after the partition heals. Consumer groups rebalance and resume from committed offsets.
Inject 5s payment latency: All payment gateway calls take 5 seconds. Verify the circuit breaker activates at the configured threshold. Users receive "payment pending" status. After latency returns to normal, the circuit breaker closes and checkout resumes.
Integration Tests
End-to-end sale simulation runs on every deploy to staging: create sale, join queue, get admitted, add to cart, apply coupon, checkout, then verify: inventory decremented in both Valkey and PostgreSQL, coupon claimed and marked as redeemed, order created with correct amounts, payment record exists. The test covers the happy path and two failure paths (payment decline, coupon already claimed).
Contract Tests
API schemas are validated against an OpenAPI spec on every pull request. Breaking changes (removing fields, changing types) fail CI. Kafka event schemas are validated via an Avro schema registry with backward compatibility mode -- consumers written against schema v1 must be able to read events produced with schema v2. Temporal workflow schemas are validated via replay tests -- workflow code is replayed against historical event logs to verify determinism.
Data Validation
Post-sale reconciliation (detailed in Observability) validates five invariants:
- Sum of sold inventory + remaining inventory = original total for every SKU
- Every claimed coupon has a corresponding coupon_claims record
- Every confirmed order has a payment record
- No user has more than one coupon claim per campaign (UNIQUE constraint violation count = 0)
- Kafka event count matches PostgreSQL order count (within the consumer lag window)
Any mismatch triggers a P1 alert and blocks the next sale until resolved.
Cost and Capacity
Testing validates correctness. The cost model below ensures the architecture is economically viable -- flash sale infrastructure is burst-oriented, and most cost accrues during the 1-4 hour sale window, not 24/7.
Per-Sale Cost Breakdown
| Component | Per-Sale (4h) | Monthly (daily sales) | 10x Scale |
|---|---|---|---|
| Valkey (6+6 nodes) | $50 | $1,500 | $8,000 (30+30 nodes) |
| PostgreSQL (db.r6g.2xl) | $30 | $900 | $5,000 (read replicas) |
| Kafka (3 brokers) | $20 | $600 | $3,000 (15 brokers) |
| Compute (API + workers) | $100 | $3,000 | $18,000 |
| CDN (CloudFront) | $200 | $6,000 | $30,000 |
| Temporal (3 nodes) | $40 | $1,200 | $6,000 |
| Total | ~$440/sale | ~$13,200/mo | ~$70,000/mo |
Cost Cliff
CDN is the largest cost driver because 10M concurrent users generate massive egress serving sale pages, images, and static assets. At 10x scale (100M users), CDN egress reaches $30K/month. Mitigation: aggressive caching (60s TTL), WebP images (30-50% smaller than JPEG), and edge-side includes for dynamic inventory count badges so the full page is not re-fetched on inventory changes.
Optimization
Pre-provision Valkey and compute for the sale duration only. Kubernetes HPA with scheduled scaling rules: scale up T-1h before the sale, maintain high replica count during the sale, scale down T+2h after the sale ends. Analytics consumers run on spot instances (can tolerate interruption without data loss because Kafka retains events for 7 days).
Valkey nodes can use reserved instances for predictable daily-sale patterns. For infrequent sales (weekly or monthly), on-demand pricing is more cost-effective despite the higher per-hour rate.
Hidden Cost
Payment gateway fees (2.9% + $0.30 per transaction) dwarf infrastructure costs. At 50K orders per sale with an average order value of $50, the gross merchandise value is $2.5M. Payment fees: $2.5M x 2.9% + 50K x $0.30 = $72,500 + $15,000 = $87,500 per sale. This is 200x the infrastructure cost. Optimizing infrastructure spend is important, but negotiating payment gateway rates has a far larger impact on unit economics.
Multi-Region Considerations
The cost model assumes single-region deployment. This section addresses what changes for global flash sales and why single-region is the deliberate default.
Single-Region by Design
The current design is single-region. This is deliberate: flash sales are time-bounded events (1-4 hours) where inventory must be globally consistent. Multi-region replication introduces the risk of overselling during network partition events. For a 2-hour sale with 100K inventory, a 5-second replication lag across regions could result in hundreds of oversold items.
Global Flash Sale Options
Option A: Regional inventory partitioning. Allocate inventory by region: 200 units to US, 200 to EU, 100 to APAC. Each region operates independently with a local Valkey cluster. Tradeoff: unsold inventory in one region cannot be dynamically reallocated to another without cross-region coordination, which adds 100-200ms latency and reintroduces the consistency risk. A doorbuster that sells out in US while EU has remaining units creates a poor customer experience.
Option B: Single-region inventory with global CDN. Keep all inventory atomics in one region (US-East). CDN serves sale pages globally from edge locations. Queue join and checkout requests route to US-East regardless of user location. Tradeoff: EU and APAC users experience 100-200ms additional latency on checkout -- acceptable for a 5s p99 SLO that is dominated by payment gateway latency.
Recommended: Option B for most flash sales (simpler architecture, no split-brain risk). Option A only for mega-sales exceeding 1M inventory items and 100M concurrent users, where single-region write throughput becomes the bottleneck.
Data Distribution
| Data | Strategy | Rationale |
|---|---|---|
| Sale config, product catalog | Replicate globally via CDN | Read-only during sale; staleness is acceptable |
| Queue | Global (single sorted set in primary region) | FIFO ordering requires single authority |
| Inventory | Global (single Valkey cluster in primary region) | Consistency requires single authority |
| Orders | Written in primary region, replicated async to regional read replicas | Read-after-write consistency in primary region; eventual consistency elsewhere |
Regional Failover
Active-passive architecture. If the primary region fails during a sale, the sale is paused -- not failed over. Rationale: inventory consistency is more important than availability for a flash sale. Users see a "Sale paused, please wait" message rather than risk overselling. The sale resumes when the primary region recovers.
DNS failover TTL is 60 seconds, but during an active sale the queue WebSocket connection detects the disconnect immediately and displays the pause message client-side without waiting for DNS propagation.
Real-World Evolution: Three Things That Break at 10x
The multi-region analysis reveals that scale changes the problem, not just the numbers. Three specific break points emerge as the system grows beyond the initial design target.
1. Single Valkey Cluster Hits Throughput Ceiling (100M Users)
At 10x scale (100M concurrent users), the 16-slot inventory sharding strategy is insufficient: those users generate 250K ops/sec on inventory keys alone. A single Valkey cluster handles this on paper (200K+ ops/sec per node), but combined with coupon pool operations, queue management, and session state, the cluster is saturated.
The path forward: client-side caching with server-assisted invalidation, available in Valkey 7.2+. The server tracks which keys each client has cached and pushes invalidation messages when values change. For inventory counts, this means 99% of "is this item still available?" reads are served from the client process without a Valkey round-trip. Only the atomic decrement operations hit Valkey. Alternatively, a multi-cluster topology with regional inventory pools (Option A from Multi-Region) becomes necessary at this scale, accepting the complexity of cross-region reallocation.
2. Coupon Stacking Rule Evaluation Becomes a Bottleneck (50+ Coupon Types)
The current stacking validation iterates all coupon pairs to check compatibility -- O(n^2) where n is the number of applied coupons. At 2-3 coupons per order, this is trivial. But as the marketing team introduces new coupon types (referral codes, loyalty multipliers, category-specific discounts, seasonal stacks), the compatibility matrix grows quadratically. At 50+ coupon types, the stacking validation adds measurable latency to every checkout.
The path forward: a pre-computed compatibility matrix. At coupon campaign creation time, the system evaluates all pairwise compatibility rules and stores the results in a lookup table. At checkout, each pairwise check becomes an O(1) table lookup instead of a full rule evaluation. For more complex rules (e.g., "maximum 3 discounts where total does not exceed 60% and no two are from the same category"), a constraint solver replaces the brute-force iteration.
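The matrix approach can be sketched in a few lines. A hypothetical illustration -- the pair-key encoding and the `compatible` rule callback are assumptions standing in for the real rule engine:

```typescript
// stacking-matrix.ts -- illustrative pre-computed compatibility matrix
type CouponType = string;

// Built once at campaign creation: run the expensive pairwise rule
// evaluation for every type pair and store compatible pairs in a set
// keyed by a canonical (sorted) pair key.
function buildCompatibilityMatrix(
  types: CouponType[],
  compatible: (a: CouponType, b: CouponType) => boolean // expensive rule eval
): Set<string> {
  const matrix = new Set<string>();
  for (const a of types) {
    for (const b of types) {
      if (a < b && compatible(a, b)) matrix.add(`${a}|${b}`);
    }
  }
  return matrix;
}

// At checkout: each pair check is an O(1) set lookup, so validating a
// cart with a handful of coupons costs microseconds regardless of how
// many coupon types exist platform-wide.
function canStack(applied: CouponType[], matrix: Set<string>): boolean {
  for (let i = 0; i < applied.length; i++) {
    for (let j = i + 1; j < applied.length; j++) {
      const [a, b] = [applied[i], applied[j]].sort();
      if (!matrix.has(`${a}|${b}`)) return false;
    }
  }
  return true;
}
```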
3. Saga Orchestration Latency Grows with Payment Provider Diversity
Adding new payment methods (cryptocurrency, buy-now-pay-later, regional payment networks) adds new saga steps and new compensation paths. Each payment method has different timeout characteristics, different idempotency guarantees, and different failure modes. The Temporal workflow accumulates conditional branches, making it harder to test and reason about.
The path forward: workflow versioning and shadow-mode testing. New payment methods are added as versioned workflow variants. Shadow mode runs the new workflow in parallel with the existing one (using a test payment sandbox) on 1% of traffic. Compensation paths for each payment method are tested independently via Temporal replay tests against recorded event histories.
Secondary Evolution Items
Soft reservation TTL and background reclamation. Production systems add a TTL to reserved inventory (5-10 minutes). A background job scans for expired reservations and returns them to the available pool. Without this, inventory leaks on every sale at the rate of user abandonment.
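The reclamation loop can be sketched with an in-memory model. In production the reservations would live in a Valkey sorted set scored by expiry time; here a plain map stands in, and all names (`ReservationStore`, `reclaimExpired`) are illustrative assumptions:

```typescript
// reservation-reclaim.ts -- illustrative model of soft reservations
// with TTL and background reclamation
interface Reservation {
  orderId: string;
  skuId: string;
  expiresAtMs: number;
}

class ReservationStore {
  private reservations = new Map<string, Reservation>();

  constructor(public available: Map<string, number>) {}

  // Move one unit from the available pool into a timed reservation
  reserve(orderId: string, skuId: string, ttlMs: number, nowMs: number): boolean {
    const avail = this.available.get(skuId) ?? 0;
    if (avail <= 0) return false;
    this.available.set(skuId, avail - 1);
    this.reservations.set(orderId, { orderId, skuId, expiresAtMs: nowMs + ttlMs });
    return true;
  }

  // Payment confirmed: the unit is sold, not returned
  confirm(orderId: string): void {
    this.reservations.delete(orderId);
  }

  // Background job: return expired, unconfirmed reservations to the
  // pool. Without this sweep, abandoned checkouts leak inventory.
  reclaimExpired(nowMs: number): number {
    let reclaimed = 0;
    for (const [id, r] of this.reservations) {
      if (r.expiresAtMs <= nowMs) {
        this.available.set(r.skuId, (this.available.get(r.skuId) ?? 0) + 1);
        this.reservations.delete(id);
        reclaimed++;
      }
    }
    return reclaimed;
  }
}
```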
Token-bound access control. The access token is bound to userId + deviceFingerprint + expiry with a cryptographic signature, validated on every subsequent request. Without binding, tokens can be shared or replayed.
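The token binding can be sketched with an HMAC over the bound claims. A minimal illustration under assumptions -- the payload layout, base64url encoding, and function names are not from the original design:

```typescript
// bound-token.ts -- illustrative HMAC-bound access token sketch
import { createHmac, timingSafeEqual } from 'node:crypto';

// Token = base64url(payload) + '.' + HMAC(payload). Signing userId,
// device fingerprint, and expiry together means a copied token is
// useless from another device or after it expires.
function issueToken(
  userId: string, fingerprint: string, expiresAtMs: number, secret: string
): string {
  const payload = Buffer.from(
    JSON.stringify({ userId, fingerprint, expiresAtMs })
  ).toString('base64url');
  const sig = createHmac('sha256', secret).update(payload).digest('base64url');
  return `${payload}.${sig}`;
}

function verifyToken(
  token: string, fingerprint: string, nowMs: number, secret: string
): { userId: string } | null {
  const [payload, sig] = token.split('.');
  if (!payload || !sig) return null;
  const expected = createHmac('sha256', secret).update(payload).digest('base64url');
  const a = Buffer.from(sig);
  const b = Buffer.from(expected);
  // Constant-time comparison prevents timing-based signature guessing
  if (a.length !== b.length || !timingSafeEqual(a, b)) return null;
  const claims = JSON.parse(Buffer.from(payload, 'base64url').toString());
  if (claims.fingerprint !== fingerprint) return null; // wrong device
  if (claims.expiresAtMs <= nowMs) return null;        // expired
  return { userId: claims.userId };
}
```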
Pre-warming and dry runs. Production systems never start a flash sale cold. At T-1 hour: warm Valkey keys, load coupon pools, warm CDN edge caches, pre-establish database connection pools, and run synthetic traffic through the full checkout path.
Kill switches and circuit breakers. When the payment gateway returns errors at 20%+ rate, the system stops admitting new users to checkout, allows browsing only, and queues pending orders for retry.
Runtime memory pressure under burst traffic. Garbage-collected runtimes (JVM, Go, Node.js) face sharp allocation rate spikes during flash sales. Production systems tune GC parameters for burst workloads and monitor GC pause duration alongside p99 latency.
VIP and priority queues. Real flash sales sometimes offer early access to premium subscribers. The virtual queue becomes a priority queue with multiple tiers, adding business value at the cost of fairness complexity.
Explore the Technologies
Core Technologies
| Technology | Role in This Design | Deep Dive |
|---|---|---|
| Valkey/Redis | Inventory atomics, coupon pools, queue management | Redis |
| PostgreSQL | Durable ledger for orders, coupon claims, reconciliation | PostgreSQL |
| Kafka | Event streaming for order events, analytics, async processing | Kafka |
| Prometheus | Metrics collection for sale-specific dashboards and alerting | Prometheus |
| Grafana | Dashboard visualization for real-time sale monitoring | Grafana |
| Kong | API gateway for per-route rate limiting, JWT auth, request routing | Kong |
Infrastructure Patterns
| Pattern | Relevance to This Design | Deep Dive |
|---|---|---|
| Message Queues and Event Streaming | Kafka for order events and async inventory sync | Event Streaming |
| Caching Strategies | CDN + Valkey multi-layer with inverted consistency model | Caching |
| Rate Limiting and Throttling | Per-user, per-IP, per-endpoint rate limits | Rate Limiting |
| Circuit Breaker and Resilience | Payment gateway circuit breaker, admission controller | Circuit Breaker |
| CDN and Edge Computing | CloudFront for sale page, waiting room, DDoS absorption | CDN |
| API Gateway | Kong for routing, auth, per-route rate limiting | API Gateway |
| Database Sharding | Valkey 16-slot inventory sharding, PG partition by sale_id | Sharding |
| Auto-Scaling Patterns | Kubernetes HPA for queue-driven and CPU-driven scaling | Auto-Scaling |
| WebSocket and Real-Time | Queue position updates via WebSocket push | WebSocket |
| Deployment Strategies | Canary 5% to full rollout with rollback triggers | Deployment |
| Alerting and On-Call | Sale-specific alert rules and escalation | Alerting |
| Metrics and Monitoring | 6 sale dashboards, 40+ specific alerts | Metrics |
External References
- Temporal -- Saga orchestration for checkout workflows
- Socket.io -- WebSocket library for real-time queue updates
- FingerprintJS -- Device fingerprinting for bot detection
- hCaptcha -- CAPTCHA for queue entry bot prevention
- k6 -- Load testing for simulating 10M concurrent users
A flash sale system is not an inventory system. It is a system that controls contention. Every component exists for one reason: to decide who gets access, and who does not, under extreme pressure. The queue decides who enters. Valkey decides who succeeds. Temporal decides who gets compensated. The rest is implementation detail.