CrackingWalnuts

System Design · April 10, 2026 · 66 min read

System Design: Ad Exchange (Real-Time Bidding, Sub-100ms Auctions, DSP/SSP, Impression Serving)

Goal: An ad exchange that runs real-time bidding auctions for one million ad requests per second, picks the top 5 of 50+ registered DSPs per request, finishes a first-price auction inside 100ms, enforces publisher floor prices and supply-chain rules (ads.txt, sellers.json, schain), serves winning ad creatives through a CDN, tracks impressions and clicks on its own (rather than trusting DSP-reported numbers), and reconciles billions of dollars of ad spend each month between DSPs and publishers. At about $4B/month and 300K impressions/sec, that works out to roughly 25 billion impressions a day across three regional clusters in us-east, eu-west, and ap-south.

Reading guide: §1 walks through one ad serving end-to-end, with all the actors involved. §2–§4 cover the problem and requirements. §5 introduces the ecosystem and the components inside the exchange. §6 and §7 cover architecture and sizing. §8 and §9 cover the data model and APIs. §10 is the deep-dive section: auction flow, DSP selection, bid optimization, ad serving, spend tracking, fraud. §11–§15 cover bottlenecks, failures, deployment, observability, and security.

TL;DR: A user in Texas abandons a $130 pair of Nike running shoes in their cart on Saturday night. Sunday morning they open ESPN on their phone, and inside about a tenth of a second a Nike ad for those exact shoes shows up in the page. Pulling that off involves a CDP, an identity graph, a campaign manager, a DSP that ran an ML bidder, a header-bidding wrapper running in the browser, four SSPs, an exchange, a publisher ad server, a CDN, a verification vendor, and an attribution stack. The exchange is the marketplace that runs the auction. It owns no campaigns, no budgets, no creatives; those all live in the DSPs. Its job is taking supply from SSPs, picking the right DSPs to ask, running a fair first-price auction, returning markup, tracking impressions on its own books, and reconciling settlement at the end of the day. At a million requests per second the things that actually matter are: don't fan out to all 50 DSPs (pick the top 5), don't put a network hop in front of every enrichment lookup (cache in-process), don't log every losing bid to Kafka (sample), and never trust a DSP's own impression count for billing.

1. How One Ad Actually Gets Served

A worked example is easier than another diagram. Imagine someone in Austin who looked at running shoes on nike.com on Saturday night. They added a $130 Nike Air Zoom Pegasus 41 to their cart, got distracted, closed the tab, and didn't think about it again. On Sunday morning they're scrolling ESPN on their phone, tap into a Cowboys-Giants game recap, and a banner ad in the middle of the article shows the exact shoes from last night. They tap, the page goes back to nike.com, and they buy.

Fourteen different companies are involved in the roughly tenth of a second between the page starting to load and that banner becoming visible. The rest of this post designs one of them, the ad exchange. The walkthrough below exists so it's clear what the other thirteen are doing.

The companies involved

On the advertiser side, before the auction even happens: Nike is the advertiser, and Nike's media agency (WPP) actually runs the day-to-day buying. The agency uses Google Campaign Manager 360 to store the creatives, the budget, the flight dates, and the Floodlight tracking pixels for conversion attribution. Nike's website fires events into Segment, a CDP, which captures things like "added to cart" and builds an audience from them. Segment forwards those audiences to The Trade Desk, which is the DSP that will actually bid in the auction. LiveRamp sits alongside as the identity graph: it ties the user's hashed email from a previous nike.com login to a stable RampID that links their laptop cookie to their iPhone IDFA, which is how the cart-abandon signal from the laptop ends up matched to the iPhone session on ESPN.

In the runtime path on the publisher side: ESPN owns the page and the 300×250 ad slot. ESPN's page has Prebid.js (a header-bidding wrapper) configured against four SSPs (Magnite, PubMatic, Index Exchange, OpenX) and a Google Publisher Tag pointing at Google Ad Manager, which is ESPN's ad server. When the page loads, Prebid.js runs in the browser and asks all four SSPs in parallel. The SSPs each forward into one or more exchanges, including Google AdX, which is the system this post is designing. AdX picks the top 5 of its registered DSPs and runs the auction.

After the impression renders: CloudFront serves the banner image from an Austin edge POP. Integral Ad Science fires a viewability beacon from inside the creative to confirm the ad was actually visible for at least a second. After the click leads to a purchase, Floodlight (part of CM360) and Google Analytics 4 fire on the Nike order-confirmation page to attribute the purchase back to the Trade Desk click on ESPN.

That's the cast: Nike, WPP, CM360, Segment, LiveRamp, The Trade Desk, Magnite (plus PubMatic, Index, OpenX), Prebid.js, Google Ad Manager, Google AdX, ESPN, CloudFront, IAS, GA4 + Floodlight. Some of these are the same company in different roles (Google in particular shows up four times), but the products are distinct.

How the request actually flows

The night before

Segment's JavaScript on nike.com sees the cart-add and the eventual session timeout, and writes the user into the cart_abandoners_shoes_7d audience.
Every fifteen minutes Segment reverse-ETLs new audience members into The Trade Desk. By the time the user goes to bed, TTD already knows about them.
LiveRamp attaches a RampID to the user's hashed email so the laptop cookie and the iPhone IDFA resolve to the same identity the next morning.

Sunday morning, the page loads

The browser requests espn.com/nfl/cowboys-giants-recap. ESPN returns HTML and JavaScript.
Prebid.js initializes and identifies the 300×250 slot in the article body:

html

<article>Cowboys quarterback Dak Prescott threw for 301 yards...</article>
<div id="div-gpt-ad-midarticle"></div>
<article>In the second half, the Giants...</article>

<script>
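// Queue the call so it runs only after Prebid.js has finished loading.
var pbjs = pbjs || {};
pbjs.que = pbjs.que || [];
pbjs.que.push(function () {
  // Bidder codes and params here are illustrative placeholders.
  pbjs.addAdUnits([{
    code: 'div-gpt-ad-midarticle',
    mediaTypes: { banner: { sizes: [[300, 250]] } },
    bids: [
      { bidder: 'magnite',  params: { accountId: 'espn-001' } },
      { bidder: 'pubmatic', params: { publisherId: 'espn-002' } },
      { bidder: 'ix',       params: { siteId: 'espn-003' } },
      { bidder: 'openx',    params: { unit: 'espn-004' } }
    ]
  }]);
});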
</script>

Prebid fans out to all four SSPs in parallel.
Each SSP enforces ESPN's $3 sports-vertical floor, attaches an schain object, and forwards into one or more exchanges. Magnite forwards into AdX.

Inside AdX, the auction

Enrichment: an in-process LRU lookup returns consent state, a cookie-sync map ({ttd: ttd_abc, dv360: dv_xyz, amzn: amzn_123}), and the user's RampID. Cache hit, no network.
Pre-bid fraud filter: IP reputation, ASN check, user-agent sanity. All clean.
Smart DSP selection: 50 DSPs are registered but only five get asked. AdX scores each one on geo, format, vertical, win rate, and capacity, and picks TTD, DV360, Amazon DSP, Criteo, and Xandr.
Fan-out: bid requests go out in parallel over HTTP/2 with a 60ms timeout.

TTD's bid (the winner)

Audience match: user is in cart_abandoners_shoes_7d.
Campaign match: Pegasus 41 Retargeting is eligible.
Frequency cap: 0/3 today, allowed.
ML model: pCTR 8% (very high — fresh cart abandoner looking at the same product), pCVR 12%, expected value ~$1.25 per impression, maximum CPM well over $1,000.
Shading model: ESPN sports impressions usually clear around $8, so TTD bids $8.50.
Response carries the price plus ready-to-render HTML for the Pegasus banner with a Floodlight pixel embedded.
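The numbers on the ML-model line can be reproduced with a few lines of Go. This is a minimal sketch that assumes expected value is simply pCTR × pCVR × order value, which matches the figures above; the function names are hypothetical, and a real bidder would also account for margin, attribution discounts, and incrementality.

```go
package main

import "fmt"

// expectedValuePerImpression multiplies the click probability, the
// post-click conversion probability, and the order value.
// Assumption: no margin, attribution discount, or incrementality factor.
func expectedValuePerImpression(pCTR, pCVR, orderValue float64) float64 {
	return pCTR * pCVR * orderValue
}

// toCPM converts a per-impression value into a cost-per-thousand price.
func toCPM(perImpression float64) float64 {
	return perImpression * 1000
}

func main() {
	ev := expectedValuePerImpression(0.08, 0.12, 130.0)
	fmt.Printf("expected value per impression: $%.2f\n", ev)
	fmt.Printf("break-even CPM: $%.0f\n", toCPM(ev))
}
```

The gap between that break-even CPM and the $8.50 actually bid is the room the shading model works in.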

The other four DSPs

DV360: $5.00 (generic shoe retargeting).
Criteo: $6.50 (also tracks nike.com).
Amazon DSP: $4.00 (Nike sells on Amazon).
Xandr: 204 no-bid.

Resolution and render

AdX picks TTD at $8.50, substitutes the ${AUCTION_PRICE} macro with an encrypted price token, validates the creative against ESPN's blocked-categories list, and returns the markup to Magnite.
Prebid compares all four SSPs (PubMatic $7.00, Index $5.00, OpenX $3.50) and picks Magnite as the overall winner.
GAM checks direct deals: Ford has a homepage sponsorship that doesn't apply to NFL articles, Progressive's guaranteed deal is desktop-only. Prebid's $8.50 beats the line-item stack.
The browser fetches the banner from CloudFront's Austin POP (~12ms cache hit). IAS's IntersectionObserver beacon starts watching the slot.
Ad becomes visible roughly 110ms after the page started rendering.
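The macro step in the first bullet above can be sketched as a plain string replacement. `${AUCTION_PRICE}` and `${AUCTION_ID}` are standard OpenRTB substitution macros; the function name and the pre-computed `priceToken` argument are assumptions for illustration, and how the price is encrypted is deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// substituteMacros replaces OpenRTB substitution macros in the winning markup.
// priceToken is assumed to be an already-encrypted form of the clearing price;
// the encryption scheme itself is out of scope for this sketch.
func substituteMacros(adm, auctionID, priceToken string) string {
	r := strings.NewReplacer(
		"${AUCTION_ID}", auctionID,
		"${AUCTION_PRICE}", priceToken,
	)
	return r.Replace(adm)
}

func main() {
	// Hypothetical DSP markup with a win-notification pixel.
	adm := `<img src="https://dsp.example/win?id=${AUCTION_ID}&p=${AUCTION_PRICE}">`
	fmt.Println(substituteMacros(adm, "auc-123", "ENCRYPTED_TOKEN"))
}
```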

After the impression

Exchange impression pixel, Floodlight pixel, and IAS viewability beacon all fire. IAS confirms 60% of pixels in view for over a second, counts it as viewable per MRC.
The user taps the banner two seconds later. The click hits the exchange's /t/click endpoint, gets logged to Kafka, and 302-redirects to nike.com/pegasus-41?utm_source=ttd&utm_medium=retargeting.
The cart is still there. The user buys.
The order-confirmation page fires the Floodlight conversion pixel. CM360 attributes the $130 purchase back to the TTD click on ESPN.

Where the $8.50 actually goes

Of the $8.50 CPM the advertiser pays (the price per thousand impressions like this one), only about $5.30 makes it to ESPN. The rest is split among the intermediaries: roughly $1.00 to TTD as the DSP fee, $0.15 to LiveRamp for the identity match, $0.80 to AdX as the exchange take rate, $1.15 to Magnite as the SSP fee, and around $0.10 to IAS for the viewability measurement. CloudFront's bandwidth cost is rounded into ESPN's hosting bill and is essentially free per impression.

Actor	Cut
Nike (advertiser, gross)	$8.50
The Trade Desk (DSP)	$1.00
LiveRamp (identity match)	$0.15
Google AdX (exchange)	$0.80
Magnite (SSP)	$1.15
IAS (verification)	$0.10
ESPN (publisher net)	$5.30

Roughly 38% of the gross ($3.20 of the $8.50) goes to ad-tech middlemen. That's the number publishers point at when they argue for Supply Path Optimization (collapsing redundant SSP and exchange hops to recover more of the dollar). Not counted here: WPP's agency commission (separate, around 10–15% of media spend), the Segment SaaS subscription, the CM360 license, and any flat fees baked into the relationships. Those are negotiated outside the auction.
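The split in the table is easy to re-derive. The sketch below works in integer cents to avoid floating-point drift; the function name is hypothetical and the fee list is copied from the table above.

```go
package main

import "fmt"

// splitGross subtracts intermediary fees (in cents) from the gross CPM and
// returns the publisher's net plus the intermediaries' share of the gross.
func splitGross(grossCents int, feeCents []int) (netCents int, intermediaryShare float64) {
	total := 0
	for _, f := range feeCents {
		total += f
	}
	return grossCents - total, float64(total) / float64(grossCents)
}

func main() {
	// Fees from the table, in cents: TTD, LiveRamp, AdX, Magnite, IAS.
	net, share := splitGross(850, []int{100, 15, 80, 115, 10})
	fmt.Printf("publisher net: $%.2f, intermediaries: %.1f%%\n", float64(net)/100, share*100)
}
```

The fees sum to $3.20, which leaves $5.30 for the publisher and puts the intermediaries' share at about 37.6% of the gross.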

A few things people get wrong

Most confusion in this ecosystem comes from product names that sound similar. Google Ads is the small-business-facing DSP interface, while DV360 is the enterprise DSP, AdX is the exchange, GAM is the publisher ad server, and CM360 is the campaign manager. Five different products under the same brand. Inside that pile, AdX is the one running auctions and the one this post is about.

People sometimes think the SSP runs the auction. It doesn't. The SSP packages publisher inventory and forwards to the exchange. To make matters worse, GAM is both an SSP and an ad server, and some SSPs (Magnite is the obvious one) run their own internal auctions before forwarding to a downstream exchange.

The exchange is also not where campaigns live. Campaigns, budgets, creatives, frequency caps, pacing, ML bid optimization: all of that lives inside DSPs. The exchange only sees bids and no-bids.

Two more. Prebid.js is not an SSP, it's a header-bidding wrapper that runs in the browser and calls multiple SSPs in parallel. And a CDP (Segment, mParticle) is not the same thing as a DMP or identity graph (LiveRamp). The CDP captures first-party events with known identifiers; the identity graph resolves identity across devices and historically held third-party segments.

With that out of the way, the rest of the post designs the exchange itself.

2. Problem Statement

Online advertising is what pays for most of the open web. Every time a page or app loads, an auction has to happen in the background and resolve before the content finishes rendering. The wall-clock budget is around 100ms from when the ad request leaves the publisher to when winning markup comes back. Miss it and the slot stays empty: the publisher loses revenue, the advertiser loses reach.

The exchange sits in the middle of that auction. It receives supply from SSPs, picks DSPs to ask, runs a first-price auction, returns the winner, tracks the impression independently, and settles money at the end of the day. None of that is novel to describe. What makes it hard is the combination of latency, fan-out, and the fact that every auction is settling real money.

The latency constraint is the dominant one. A naive design would fan every bid request out to every registered DSP. With 50 DSPs and 1KB requests, that's 50KB of outbound traffic per auction multiplied by a million auctions per second, or 50 GB/sec leaving the exchange, and you're waiting on the slowest DSP every single time. The realistic answer is to score each DSP per request on geo, format, vertical, win rate, and capacity, and only fan out to the five most likely to bid competitively. Section 10.2 covers the scoring in detail; the point here is that the right baseline is "top 5 of 50," not "all 50."
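As a rough illustration of "score, then take the top 5", here is a minimal sketch. It is not the scoring algorithm from Section 10.2: the struct fields mirror the five signals named above, but treating geo and format as hard filters and the specific weights (0.5, 0.3, 0.2) are assumptions made up for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// dspCandidate holds the per-request signals named in the text.
type dspCandidate struct {
	Name          string
	GeoMatch      bool    // DSP buys in the user's geo
	FormatMatch   bool    // DSP supports the slot's format
	VerticalMatch bool    // DSP has demand for the publisher's vertical
	WinRate       float64 // rolling win rate, 0..1
	Capacity      float64 // remaining QPS headroom, 0..1
}

// score is a hypothetical linear blend; the weights are illustrative only.
func score(d dspCandidate) float64 {
	s := 0.5*d.WinRate + 0.3*d.Capacity
	if d.VerticalMatch {
		s += 0.2
	}
	return s
}

// selectTopN drops DSPs that fail the hard filters, then keeps the n best.
func selectTopN(dsps []dspCandidate, n int) []dspCandidate {
	eligible := make([]dspCandidate, 0, len(dsps))
	for _, d := range dsps {
		if d.GeoMatch && d.FormatMatch {
			eligible = append(eligible, d)
		}
	}
	sort.SliceStable(eligible, func(i, j int) bool {
		return score(eligible[i]) > score(eligible[j])
	})
	if len(eligible) > n {
		eligible = eligible[:n]
	}
	return eligible
}

func main() {
	dsps := []dspCandidate{
		{Name: "dsp-a", GeoMatch: true, FormatMatch: true, VerticalMatch: true, WinRate: 0.30, Capacity: 0.9},
		{Name: "dsp-b", GeoMatch: true, FormatMatch: true, VerticalMatch: false, WinRate: 0.10, Capacity: 0.5},
		{Name: "dsp-c", GeoMatch: false, FormatMatch: true, VerticalMatch: true, WinRate: 0.40, Capacity: 0.9}, // wrong geo
		{Name: "dsp-d", GeoMatch: true, FormatMatch: true, VerticalMatch: true, WinRate: 0.05, Capacity: 0.2},
	}
	for _, d := range selectTopN(dsps, 2) {
		fmt.Printf("%s %.3f\n", d.Name, score(d))
	}
}
```

With 50 candidates, sorting the full slice is cheap; a partial selection would only matter at much larger candidate counts.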

Scale matters even after that fan-out trim. A serious exchange runs around 1M ad requests per second sustained, peaking near 2M during US evening prime time. With the top-5 selection that's 5M outbound bid requests per second at sustained load. That work is spread across around 100 auction-server pods in three regions, so every component has to scale horizontally and every hot-path lookup has to be sub-millisecond.

Every auction is also money. Bids are CPM prices, so an auction bug that lets a $0.10 bid beat a $2.00 bid leaks $1.90 per thousand impressions, or $0.0019 per impression; at 300K impressions per second that's $570/sec, or roughly $49M a day. Auction integrity is the core product guarantee. Winning bids and impressions are logged at 100%, losing bids are sampled, and DSP-reported impressions are reconciled against the exchange's own tracking nightly so any drift gets caught.

Underneath all of that sit the long-running concerns: bot traffic, domain spoofing, headless browsers, click farms, datacenter-hosted "users." Ads.txt and sellers.json exist specifically to stop domain spoofing. GDPR and CCPA limit what can cross a wire without consent. Every request has to be fraud-filtered and PII-scrubbed in under 2ms combined. None of these are theoretical. A single GDPR violation can cost up to 4% of global annual revenue.

Quick numbers to anchor the rest of the post:

Metric	Target
Ad requests per second (sustained)	1,000,000
Ad requests per second (peak)	2,000,000
Impressions per second (30% fill)	~300,000
Impressions per day	~25 billion
DSPs registered	50+
DSPs per auction (top-N)	5
Auction latency p99	< 100 ms
DSP response timeout	60 ms
Monthly spend through exchange	~$4 billion
Exchange take rate	10–15%

A few things to avoid: don't fan out to every DSP, don't put Valkey in the hot path for every enrichment lookup (front it with an in-process cache), don't log every losing bid to Kafka (sample them), don't call DSPs sequentially, and don't trust DSP-reported impression counts for anything that touches money.

3. Functional Requirements

ID	Requirement	Priority
FR-01	Accept bid requests from SSPs via OpenRTB 2.6 and run first-price sealed-bid auctions in < 100 ms p99	P0
FR-02	Smart-select top 5 eligible DSPs per auction and fan out in parallel with 60 ms timeout	P0
FR-03	Enforce publisher floor prices and block-list (categories/advertisers) per publisher config	P0
FR-04	Return winning ad markup (HTML for banner, VAST XML for video) to the SSP within budget	P0
FR-05	Serve ad creatives through CDN edge nodes with cache headers for efficient delivery	P0
FR-06	Track impressions via server-side pixel (1×1 GIF) with deduplication	P0
FR-07	Track clicks via redirect URL with destination validation	P0
FR-08	Enforce exchange-level creative dedup (max N impressions of same creative per user per hour) as an ad-quality measure. Per-campaign frequency capping is a DSP responsibility.	P1
FR-09	Track per-DSP spend in near-real-time via Flink streaming for credit-limit enforcement and settlement	P0
FR-10	Validate supply chain: ads.txt, sellers.json, and schain object on every request	P0
FR-11	Check consent (TCF / US Privacy string) and strip PII from bid requests when required	P0
FR-12	Publish tracking events (impressions, clicks, viewability, auction results) to Kafka for billing and analytics	P0
FR-13	Reconcile exchange-tracked impressions with DSP-reported impressions daily; flag discrepancies > 0.01%	P1
FR-14	Provide publisher and DSP management APIs (floor prices, DSP onboarding, settlement reports)	P1
FR-15	Support banner, video (VAST 4.2), and native ad formats	P0
FR-16	Pre-bid fraud filtering: IP reputation, user-agent signature, datacenter detection, ASN reputation	P0

4. Non-Functional Requirements

Dimension	Target
Auction latency (p50)	< 50 ms
Auction latency (p99)	< 100 ms
Fill rate	> 30% (varies by publisher and market)
Availability	99.95% (4.4 hours/year planned + unplanned downtime)
Tracking pipeline loss	< 0.01% event loss end-to-end
Billing accuracy (reconciled)	±0.01% of DSP-reported impressions
CDN cache hit rate	> 95%
DSP connection pool warm starts	All DSPs kept warm via periodic health pings
Multi-region failover	< 60 seconds (DNS-based geo failover)
Deployment rollback	< 5 minutes for any component

5. High-Level Approach & Technology Selection

5.1 The full ecosystem

The walkthrough in §1 named most of the actors. The map below is the same cast in table form, useful as a reference when later sections refer to a specific role.

Layer	Role	Examples
Advertiser	Pays for ads. Sets goals, budgets, targeting.	Nike, P&G, a local dentist
Agency	Runs media buying on behalf of advertisers. Contracts with DSPs.	WPP/GroupM, Publicis, Omnicom
Campaign Manager	Stores creatives, flight dates, budget rules, attribution tags. Publishes campaigns to DSPs.	Google Campaign Manager 360 (CM360), Adobe Advertising
CDP	Captures first-party events from advertiser sites. Builds audiences. Syncs to DSPs.	Segment, mParticle, Treasure Data
Identity graph / DMP	Resolves user identity across devices and cookies. Provides stable cross-device IDs.	LiveRamp (RampID), Neustar Fabrick, ID5
DSP	Receives bid requests from exchanges. Runs bid optimization ML. Decides to bid and at what price. Owns campaign budgets and frequency caps.	The Trade Desk, DV360, Amazon DSP, Criteo, Xandr
Ad Exchange	This system. Runs the auction. Receives supply from SSPs, selects and fans out to DSPs, picks a winner.	Google AdX, OpenX, Index Exchange, PubMatic, Magnite
SSP	Packages publisher inventory. Enforces floor prices, brand safety rules. Forwards bid requests to exchanges and direct DSPs.	Magnite, PubMatic, Index Exchange, OpenX, Xandr Monetize
Header bidder	Client-side JavaScript that calls multiple SSPs in parallel from the browser, then picks the highest bid.	Prebid.js, Amazon TAM
Publisher ad server	Owns the ad slot. Decides between direct deals, guaranteed deals, and programmatic (Prebid) bids.	Google Ad Manager (GAM), Kevel, FreeWheel
Publisher	Owns the website or app. Gets paid per impression.	ESPN, CNN, NYT, mobile game developers
Verification	Measures viewability, brand safety, invalid traffic. Runs JavaScript beacons.	Integral Ad Science (IAS), DoubleVerify, MOAT
Attribution & analytics	Tracks conversions. Attributes them to impressions and clicks.	Google Analytics 4, Floodlight (CM360), Adjust (mobile), AppsFlyer (mobile)
CDN	Serves creative assets from edge POPs.	CloudFront, Fastly, Akamai, Cloudflare

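At runtime, the publisher page's header bidder calls the SSPs, each SSP forwards into one or more exchanges, the exchange fans out to its selected DSPs, and the winning markup travels back along the same path to the publisher ad server and the browser.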

The exchange makes its money by charging a take rate (usually 10–15%) on each cleared auction. If a DSP bids $8.50 CPM and wins, the exchange keeps about $0.80 and the rest goes to the SSP and publisher. Billing happens monthly against exchange-tracked impressions, reconciled with DSP-reported numbers; anything more than 0.01% off gets investigated.

Boundary the rest of this post uses: campaigns, budgets, creatives, bid optimization, frequency capping, and conversion attribution all live inside DSPs. The exchange is a stateless marketplace that sees only bid requests, bids, wins, impressions, and clicks.

5.2 First-price auctions

The industry shifted from second-price to first-price auctions around 2017–2019. In a first-price auction the winner pays exactly what they bid. That's simpler for the exchange and more transparent for the DSP, with one tradeoff: DSPs now have to bid below their true valuation (bid shading) to avoid systematically overpaying, since they no longer get the safety of paying only the second-highest price. Bid shading lives entirely inside the DSP, so the exchange doesn't need to know about it.

The trigger for the move was a trust problem. In second-price exchanges that handled both supply and demand, some operators were caught using their knowledge of all bids to give favored buyers a "last look" at winning prices. First-price ends that ambiguity because there's nothing to manipulate; the winner pays what the winner said.

	Second-price (legacy)	First-price (current)
Winner pays	Second-highest + $0.01	Their own bid
DSP strategy	Bid truthfully	Bid shade (0.5–0.85 × true value)
Exchange complexity	Higher (track top-2 bids)	Lower (track max bid)
Transparency	Low (exchanges could manipulate)	High
2024+ adoption	Declining	Dominant
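The difference between the two pricing rules fits in a short sketch. The bids below are the CPM prices from the §1 walkthrough with ESPN's $3 floor; the function names, the earlier-bid-wins tie-break, and the "runner-up plus one cent, capped at the winner's bid" rule for the legacy variant are assumptions chosen to match the table above.

```go
package main

import "fmt"

type bid struct {
	DSP      string
	PriceCPM float64
}

// resolveFirstPrice returns the highest bid at or above the floor.
// The winner pays its own bid. On a tie the earlier bid wins.
func resolveFirstPrice(bids []bid, floor float64) (winner bid, clearing float64, ok bool) {
	for _, b := range bids {
		if b.PriceCPM < floor {
			continue
		}
		if !ok || b.PriceCPM > winner.PriceCPM {
			winner, ok = b, true
		}
	}
	return winner, winner.PriceCPM, ok
}

// resolveSecondPrice picks the same winner but charges the runner-up's price
// plus one cent (or the floor when there is no runner-up), capped at the
// winner's own bid.
func resolveSecondPrice(bids []bid, floor float64) (winner bid, clearing float64, ok bool) {
	winner, _, ok = resolveFirstPrice(bids, floor)
	if !ok {
		return winner, 0, false
	}
	clearing = floor
	skippedWinner := false
	for _, b := range bids {
		if !skippedWinner && b == winner {
			skippedWinner = true // skip the winning bid itself exactly once
			continue
		}
		if b.PriceCPM >= floor && b.PriceCPM+0.01 > clearing {
			clearing = b.PriceCPM + 0.01
		}
	}
	if clearing > winner.PriceCPM {
		clearing = winner.PriceCPM
	}
	return winner, clearing, true
}

func main() {
	bids := []bid{{"ttd", 8.50}, {"dv360", 5.00}, {"criteo", 6.50}, {"amazon", 4.00}}
	w, p, _ := resolveFirstPrice(bids, 3.00)
	fmt.Printf("first-price:  %s pays $%.2f\n", w.DSP, p)
	w, p, _ = resolveSecondPrice(bids, 3.00)
	fmt.Printf("second-price: %s pays $%.2f\n", w.DSP, p)
}
```

The first-price path only needs the running maximum, which is the "lower complexity" the table refers to.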

5.3 OpenRTB 2.6

Every DSP and SSP speaks OpenRTB, currently version 2.6, and every exchange has to as well. The protocol defines the wire format for bid requests, bid responses, win notices, and the supporting objects (Site, App, User, Device, Imp, Bid). Sections 9.1 and 9.2 show real payloads. The one detail worth flagging here: the adm field in a bid response carries actual render-ready markup (HTML for banner ads, VAST XML for video), not just a URL. The DSP is responsible for providing markup the browser can execute; the exchange substitutes a few macros (auction price, click URL, impression URL) before forwarding.

Object	Purpose	Key Fields
`BidRequest`	Top-level request from exchange to DSP	`id`, `imp[]`, `site`/`app`, `user`, `device`, `regs`, `tmax`
`Imp`	One ad slot	`id`, `banner`/`video`/`native`, `bidfloor`, `pmp`
`Site`/`App`	Publisher context	`domain`, `page`, `cat[]`, `publisher`
`User`	User targeting	`id`, `buyeruid`, `geo`, `data[]`, `consent`
`Device`	Device info	`ua`, `ip`, `geo`, `devicetype`, `os`
`BidResponse`	DSP's response	`id`, `seatbid[]`, `cur`
`Bid`	Individual bid	`id`, `impid`, `price`, `adm` (creative markup), `crid`, `adomain[]`
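A minimal Go rendering of a few of these objects shows how the field names map to JSON. This is a small illustrative subset, not the full OpenRTB 2.6 schema: the JSON keys come from the table above, while the Go type names and the `parseBidResponse` helper are assumptions for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Banner struct {
	W int `json:"w"`
	H int `json:"h"`
}

type Imp struct {
	ID       string  `json:"id"`
	Banner   *Banner `json:"banner,omitempty"`
	BidFloor float64 `json:"bidfloor,omitempty"`
}

type Site struct {
	Domain string `json:"domain,omitempty"`
	Page   string `json:"page,omitempty"`
}

type BidRequest struct {
	ID   string `json:"id"`
	Imp  []Imp  `json:"imp"`
	Site *Site  `json:"site,omitempty"`
	TMax int    `json:"tmax,omitempty"` // milliseconds the DSP has to respond
}

type Bid struct {
	ID      string   `json:"id"`
	ImpID   string   `json:"impid"`
	Price   float64  `json:"price"` // CPM
	AdM     string   `json:"adm,omitempty"`
	CrID    string   `json:"crid,omitempty"`
	ADomain []string `json:"adomain,omitempty"`
}

type SeatBid struct {
	Bid  []Bid  `json:"bid"`
	Seat string `json:"seat,omitempty"`
}

type BidResponse struct {
	ID      string    `json:"id"`
	SeatBid []SeatBid `json:"seatbid"`
	Cur     string    `json:"cur,omitempty"`
}

// parseBidResponse decodes a DSP response body into the structs above.
func parseBidResponse(data []byte) (BidResponse, error) {
	var r BidResponse
	err := json.Unmarshal(data, &r)
	return r, err
}

func main() {
	req := BidRequest{
		ID:   "auc-123",
		Imp:  []Imp{{ID: "1", Banner: &Banner{W: 300, H: 250}, BidFloor: 3.00}},
		Site: &Site{Domain: "espn.com"},
		TMax: 60,
	}
	out, err := json.Marshal(req)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))

	resp, err := parseBidResponse([]byte(`{"id":"auc-123","seatbid":[{"bid":[{"id":"b1","impid":"1","price":8.5,"adm":"<div>ad</div>"}]}],"cur":"USD"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.SeatBid[0].Bid[0].Price)
}
```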

5.4 The components inside the exchange

Eight services, most of them stateless. The auction server is the hot path: it accepts OpenRTB requests from SSPs, enriches them, runs DSP selection, fans out, runs the auction, and returns the winning markup. Written in Go, with an in-process LRU cache fronting everything that would otherwise need a network lookup. The DSP selection logic isn't a separate service, it's a library inside the auction server.

The ad server handles macro substitution on the winning markup and assembles VAST XML for video. The tracking endpoint is a separate pod group that takes impression pixels, click redirects, and viewability beacons; it deduplicates against Valkey, produces to Kafka asynchronously, and returns the pixel as fast as possible. The creative dedup service is a small Valkey-backed counter that prevents the same creative ID from showing to the same user more than N times an hour. That's an ad-quality measure, not per-campaign frequency capping; per-campaign caps live in DSPs.
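The creative dedup counter can be pictured as a fixed-window counter keyed by user, creative, and hour. The sketch below uses an in-memory map as a stand-in for Valkey, so the type and method names are hypothetical; in the Valkey-backed version the per-key TTL would expire old hour buckets, which this sketch never does.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// dedupCounter caps how often one creative is shown to one user per hour.
// The map stands in for Valkey; old buckets are never evicted here.
type dedupCounter struct {
	mu     sync.Mutex
	max    int
	counts map[string]int
}

func newDedupCounter(max int) *dedupCounter {
	return &dedupCounter{max: max, counts: make(map[string]int)}
}

// Allow reports whether the creative may be shown and, if so, counts it.
func (d *dedupCounter) Allow(userID, creativeID string, t time.Time) bool {
	hourBucket := t.Unix() / 3600
	key := fmt.Sprintf("%s|%s|%d", userID, creativeID, hourBucket)
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.counts[key] >= d.max {
		return false
	}
	d.counts[key]++
	return true
}

// demoDedup shows four attempts inside one hour (cap of 3) and one in the next.
func demoDedup() []bool {
	d := newDedupCounter(3)
	t0 := time.Date(2026, 4, 10, 9, 0, 0, 0, time.UTC)
	var out []bool
	for i := 0; i < 4; i++ {
		out = append(out, d.Allow("user-1", "creative-9", t0.Add(time.Duration(i)*time.Minute)))
	}
	out = append(out, d.Allow("user-1", "creative-9", t0.Add(time.Hour)))
	return out
}

func main() {
	fmt.Println(demoDedup())
}
```

A fixed window lets a user see up to 2N impressions across an hour boundary; for an ad-quality guard that is usually an acceptable trade for a single counter per key.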

For billing, a Flink job consumes the impressions topic keyed by DSP ID and writes per-DSP running spend totals to Valkey, where the auction servers can read them for credit-limit enforcement. The same Flink job also computes rolling win rates that feed DSP selection. A daily batch job aggregates ClickHouse data per DSP and per publisher, reconciles against DSP-reported numbers, and writes settlement records. The management API is the boring part: CRUD for publisher configs, DSP onboarding, settlement queries.

5.5 Storage

Store	Technology	Rationale
Enrichment hot-path cache	In-process cache (ristretto; TinyLFU admission rather than strict LRU)	5-second TTL. Serves > 95% of enrichment reads with zero network hops.
Enrichment cold-path	Valkey Cluster	Sub-ms reads on cache miss. Sharded by user ID. Background-synced from PostgreSQL + event streams.
Auction event log (sampled)	Kafka	Durable event stream. 100% of impressions and winning bids; 1% sample of losing bids.
Real-time analytics & billing	ClickHouse	Columnar analytics on billions of rows. Sub-second aggregation for dashboards.
Exchange configuration	PostgreSQL	DSP configs, publisher settings, SSP registrations, settlement records. Read replicas for auction servers.
Bid-level archive	S3 / Iceberg (Parquet)	Long-term storage of winning bids and sampled losses. For billing disputes and ML training.
Creative assets	S3 + CloudFront	S3 is the origin; CloudFront serves > 95% of requests from edge cache.

5.6 Why Go

Go is a defensible choice for this rather than a load-bearing one. Goroutines make the parallel fan-out pattern trivial: each DSP call is a goroutine, the timeout is a context, and the auction starts as soon as the last bid comes in or the deadline fires. GC pauses with modern Go (1.22+) are well under a millisecond, which matters for tail latency. The standard library has a good HTTP/2 client with built-in connection pooling and multiplexing, so there's no third-party dependency for the most performance-sensitive piece.
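The goroutine-per-DSP pattern described above looks roughly like this. It is a sketch under assumptions: the `bidder` function type stands in for a real HTTP/2 call, `fakeDSP` simulates latency with a timer, and the names are invented for the example.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type bid struct {
	DSP      string
	PriceCPM float64
}

// bidder stands in for one HTTP/2 call to a DSP; ok=false means no-bid.
type bidder func(ctx context.Context) (b bid, ok bool)

type result struct {
	bid bid
	ok  bool
}

// fanOut calls every bidder in parallel and returns the bids that arrive
// before the timeout. It returns early once every bidder has answered.
func fanOut(parent context.Context, bidders []bidder, timeout time.Duration) []bid {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	// Buffered so goroutines that finish after the deadline never block.
	results := make(chan result, len(bidders))
	for _, call := range bidders {
		go func(call bidder) {
			b, ok := call(ctx)
			results <- result{bid: b, ok: ok}
		}(call)
	}

	var bids []bid
	for range bidders {
		select {
		case r := <-results:
			if r.ok {
				bids = append(bids, r.bid)
			}
		case <-ctx.Done():
			return bids // deadline fired: run the auction with what arrived
		}
	}
	return bids
}

// fakeDSP simulates a DSP that answers after a fixed latency.
func fakeDSP(name string, price float64, latency time.Duration) bidder {
	return func(ctx context.Context) (bid, bool) {
		select {
		case <-time.After(latency):
			return bid{DSP: name, PriceCPM: price}, price > 0
		case <-ctx.Done():
			return bid{}, false
		}
	}
}

func demoFanOut() []bid {
	return fanOut(context.Background(), []bidder{
		fakeDSP("fast", 8.50, 5*time.Millisecond),
		fakeDSP("nobid", 0, 10*time.Millisecond),
		fakeDSP("slow", 9.00, 300*time.Millisecond), // misses the deadline
	}, 60*time.Millisecond)
}

func main() {
	fmt.Println(demoFanOut())
}
```

The slow DSP's higher bid is simply lost, which is the intended behavior: the deadline, not the best possible price, bounds the auction.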

Rust would give marginally better tail latency and zero GC, but at a real cost in development speed and the ability to staff the team. Java with Netty and virtual threads is the other reasonable answer; it's slightly harder to keep G1GC pauses under the auction budget but plenty of large exchanges run on the JVM. The team's existing Go skills are usually the deciding factor.

5.7 Why ClickHouse

At 25 billion impressions a day, the dashboard queries are aggregations over 100+ billion rows ("spend by DSP by publisher by hour for the last 7 days," that kind of thing). ClickHouse handles that in single-digit seconds. Druid is the other serious option but adds operational complexity. BigQuery works but the per-query costs add up fast at this scale, and Postgres simply can't move enough rows. ClickHouse also has a native Kafka table engine, so ingestion needs very little code: a Kafka engine table reads the topic and a materialized view writes the rows into a MergeTree table.

6. High-Level Architecture

6.1 Multi-region bird's eye

The exchange runs in three regions: us-east-1, eu-west-1, and ap-south-1. Geo-DNS or Anycast routes each SSP request to the nearest region. Inside a region everything is stateless or regionally-sharded. Cross-region state (DSP configs, publisher settings, billing records) lives in PostgreSQL with logical replication out from a single primary in us-east-1.

6.2 Decisions worth flagging

Auction servers are stateless. They cache config (DSP endpoints, publisher settings) refreshed every 30 seconds, plus an in-process LRU of enrichment data refreshed asynchronously. No durable local state. Any pod can serve any request, so horizontal scaling is just adding pods.

Load balancing is L4. TLS termination at the load balancer is too expensive at a million QPS, so the L4 balancer hands TCP connections to auction servers via consistent hashing and TLS terminates inside the auction server, parallelized across cores.

The in-process LRU is the cache layer that does most of the work. Hit rate is well above 95% during peak traffic because the same users show up in many simultaneous auctions, and a 5-second TTL is short enough that staleness isn't a real concern. Cache misses fall through to Valkey. A background worker keeps hot keys warm. Without this layer, Valkey would need to handle roughly 3M ops/sec just for enrichment (three lookups per auction at 1M QPS) and the wire latency would dominate the auction budget.
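The read path for that layer, reduced to its essentials, is "check the local map, and on a miss or expiry call the cold path and store the result". The sketch below is an assumption-laden stand-in: it has a TTL but no size-bounded eviction (a production cache library would add that), the `load` callback plays the role of Valkey, and concurrent misses on the same key are not coalesced.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type entry struct {
	value   string
	expires time.Time
}

// ttlCache is a minimal in-process cache with a fall-through loader.
type ttlCache struct {
	mu     sync.Mutex
	ttl    time.Duration
	now    func() time.Time                  // injectable clock, for testing
	load   func(key string) (string, error) // cold path, e.g. a Valkey read
	items  map[string]entry
	misses int
}

func newTTLCache(ttl time.Duration, now func() time.Time, load func(string) (string, error)) *ttlCache {
	return &ttlCache{ttl: ttl, now: now, load: load, items: make(map[string]entry)}
}

// Get returns the cached value if it is still fresh, otherwise reloads it.
func (c *ttlCache) Get(key string) (string, error) {
	c.mu.Lock()
	e, ok := c.items[key]
	c.mu.Unlock()
	if ok && c.now().Before(e.expires) {
		return e.value, nil
	}
	v, err := c.load(key) // lock is released while the cold path runs
	if err != nil {
		return "", err
	}
	c.mu.Lock()
	c.items[key] = entry{value: v, expires: c.now().Add(c.ttl)}
	c.misses++
	c.mu.Unlock()
	return v, nil
}

// demoMisses reads one key twice, advances the clock past the TTL, and reads
// it again, returning how many times the cold path was hit.
func demoMisses() int {
	clock := time.Date(2026, 4, 10, 9, 0, 0, 0, time.UTC)
	c := newTTLCache(5*time.Second,
		func() time.Time { return clock },
		func(key string) (string, error) { return "consent-for-" + key, nil })
	c.Get("user-1") // miss
	c.Get("user-1") // hit
	clock = clock.Add(6 * time.Second)
	c.Get("user-1") // expired, miss again
	return c.misses
}

func main() {
	fmt.Println("cold-path loads:", demoMisses())
}
```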

Fan-out happens inside the auction server itself, not in a separate service. Every pod maintains persistent HTTP/2 connection pools to every registered DSP, which saves a network hop versus a separate fan-out tier and gives tighter control over per-DSP timeouts and circuit breakers.

Kafka is the universal event bus. Winning bids, impressions, clicks, viewability beacons, DSP config updates: they all flow through Kafka topics. The auction server produces asynchronously and never waits for ack on the hot path.

Sampled bid logging keeps Kafka manageable. Winning bids and impressions go in at 100%. Losing bids go in at 1%. The full bid stream tees out separately to S3/Iceberg through Kafka Connect, which is cheap durable storage for billing disputes and ML training without burning hot Kafka capacity.
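One way to implement the 100% / 1% split is a deterministic hash of the auction ID, so the decision is repeatable and all losing bids from a sampled auction are kept together. That per-auction grouping, the FNV-1a hash, and the function name are assumptions for this sketch; the text above only specifies the rates.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
)

// shouldLog always keeps winning bids and keeps losing bids for roughly
// lossSampleRate of auctions, chosen deterministically by auction ID.
func shouldLog(auctionID string, won bool, lossSampleRate float64) bool {
	if won || lossSampleRate >= 1 {
		return true
	}
	if lossSampleRate <= 0 {
		return false
	}
	h := fnv.New32a()
	h.Write([]byte(auctionID))
	// Keep the auction when its hash falls in the lowest lossSampleRate
	// fraction of the 32-bit range.
	threshold := uint32(lossSampleRate * float64(math.MaxUint32))
	return h.Sum32() < threshold
}

func main() {
	kept := 0
	const n = 100000
	for i := 0; i < n; i++ {
		if shouldLog(fmt.Sprintf("auc-%d", i), false, 0.01) {
			kept++
		}
	}
	fmt.Printf("kept %d of %d losing-bid auctions\n", kept, n)
}
```

Because the sample rate is known, downstream analytics can scale losing-bid counts back up by its inverse.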

CDN-first creative delivery: creatives live on S3 and are served through CloudFront, which pulls from the S3 origin on a cache miss. The auction server never touches creative bytes; it returns markup pointing at a CDN URL.

6.3 Auction flow, happy path

6.4 Auction flow, timeout

When the 60ms DSP timeout fires, the auction server runs the auction with whatever bids have arrived. Zero bids above floor means a no-fill response back to the SSP. Slow DSPs feed into a per-DSP circuit breaker, covered in §12.2.

7. Back-of-the-Envelope Sizing

Every number here is rounded so you can redo the math in your head if you want.

7.1 Request volume

Sustained:  1,000,000 QPS
Peak:       2,000,000 QPS (US evening prime time)
Design for: 1,500,000 QPS with headroom

Per day:    1M × 86,400 ≈ 86 billion bid requests/day
Fill rate:  30%
Impressions: 86B × 0.30 ≈ 26 billion/day
            ≈ 300,000 impressions/sec

7.2 DSP fan-out

Naive (fan out to all 50 DSPs):  1M × 50 = 50M bid requests/sec
Top-5 smart selection:           1M × 5  =  5M bid requests/sec

Bid request:  ~1 KB (OpenRTB JSON, gzipped on the wire)
Bid response: ~0.5 KB

Outbound: 5M × 1 KB   = 5 GB/sec
Inbound:  5M × 0.5 KB = 2.5 GB/sec
Total:    ~7.5 GB/sec across all auction servers

The top-5 selection is what makes the bandwidth (and the per-DSP cost) tractable. Without it the exchange is wasting an order of magnitude on requests no DSP would have bid on anyway. §10.2 has the scoring algorithm.

7.3 Auction server sizing

Per-auction latency budget:
  LRU cache hit (95%):   0.1 ms
  Valkey cold (5%):      1.0 ms  (amortized 0.05 ms)
  Pre-bid filter:        1.0 ms
  DSP selection:         1.0 ms
  DSP fan-out (parallel, 60 ms timeout): 40 ms avg
  Auction logic:         0.5 ms
  Macro sub + response:  1.0 ms
  Total p50:             ~45 ms
  Total p99:             ~80 ms

Per pod (c6g.4xlarge, 16 vCPU, 32 GB RAM):
  Concurrent in-flight auctions: ~2,000
  QPS per pod:                   ~25,000

Pods needed:
  Sustained 1M / 25K = 40 pods
  Peak 2M / 25K      = 80 pods
  Deploy 100 pods across 3 regions (40 US + 30 EU + 30 APAC) with HPA to 2x

The latency budget breaks down something like this: an LRU hit takes a fraction of a millisecond, the rare Valkey cold path adds maybe another, pre-bid filtering and DSP selection together cost 2ms, the parallel DSP fan-out is 40ms on average and 60ms in the worst case, and the auction logic plus serialization is the rest. p50 lands around 45ms; p99 around 80ms.
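The pod counts and the in-flight figure can be cross-checked with integer arithmetic. The pod count is a ceiling division; the in-flight check uses Little's law (concurrency = arrival rate × time in system), which is an addition here rather than something the sizing block states. Function names are hypothetical.

```go
package main

import "fmt"

// podsNeeded is a ceiling division of total QPS by per-pod QPS.
func podsNeeded(qps, qpsPerPod int) int {
	return (qps + qpsPerPod - 1) / qpsPerPod
}

// inFlight applies Little's law: concurrency = arrival rate x time in system.
func inFlight(qpsPerPod, latencyMs int) int {
	return qpsPerPod * latencyMs / 1000
}

func main() {
	fmt.Println("pods, sustained:", podsNeeded(1_000_000, 25_000))
	fmt.Println("pods, peak:     ", podsNeeded(2_000_000, 25_000))
	fmt.Println("in-flight at p50 (45 ms):", inFlight(25_000, 45))
	fmt.Println("in-flight at p99 (80 ms):", inFlight(25_000, 80))
}
```

At 25,000 QPS per pod, the ~2,000 concurrent auctions quoted above corresponds to every request taking the p99 latency, so it reads as a worst-case figure rather than the typical load.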

7.4 Cache and Valkey

Enrichment lookups per auction: 3 logical keys
  - user consent + cookie-sync (1 hash)
  - IP/UA fraud flags (1 set membership)
  - DSP credit-limit flags (1 hash)

At 1M QPS:
  LRU hits (95%):   2.85M logical lookups/sec in-process, zero network
  Valkey cold (5%): 150K ops/sec, trivial for a Valkey cluster

Valkey working set:
  Active users (30-day):     ~200 million
  Per-user entry:            ~150 bytes (consent + cookie sync)
  Total users:               200M × 150 = 30 GB
  Fraud lists (IPs + UAs):   ~1 GB
  DSP credit state:          negligible
  Creative dedup counters:   ~10 GB
  Total:                     ~42 GB

Valkey cluster: 3 primaries (16 GB each) + 3 replicas = 6 nodes per region.

7.5 Kafka (with sampling)

Event topics:
  impressions:      300K/sec × 500 bytes = 150 MB/sec
  clicks:             3K/sec × 300 bytes =   1 MB/sec
  viewability:      300K/sec × 200 bytes =  60 MB/sec
  winning_bids:     300K/sec × 800 bytes = 240 MB/sec
  losing_bids (1%):  50K/sec × 800 bytes =  40 MB/sec

  Total:            ~490 MB/sec
  × replication 3  = 1.5 GB/sec write throughput

Per day:
  490 MB/sec × 86,400 = 42 TB/day ingested
  ÷ 4 (zstd compression) ≈ 10 TB/day on disk
  Hot retention 3 days = 30 TB

Kafka cluster (per region): 10 brokers × 4 TB NVMe = 40 TB
  ~500 MB/sec ingest per region fits at <30% capacity.

The unsampled bid stream gets teed directly to S3/Iceberg through Kafka Connect, so durable long-term storage doesn't sit on hot Kafka brokers.
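The Kafka numbers are easy to re-derive. The sketch below recomputes them from the per-topic rates and sizes listed above; the 4x zstd ratio is the same assumption the table makes, and the hot-retention figure is counted before replication, as it is in the table.

```python
# (events per second, bytes per event) per topic, from the table above.
topics = {
    "impressions": (300_000, 500),
    "clicks": (3_000, 300),
    "viewability": (300_000, 200),
    "winning_bids": (300_000, 800),
    "losing_bids_1pct": (50_000, 800),
}

ingest_mb_s = sum(rate * size for rate, size in topics.values()) / 1e6
replicated_gb_s = ingest_mb_s * 3 / 1000      # replication factor 3
raw_tb_day = ingest_mb_s * 86_400 / 1e6
compressed_tb_day = raw_tb_day / 4            # assumed 4x zstd ratio
hot_tb = compressed_tb_day * 3                # 3 days hot, before replication

print(f"ingest      ~ {ingest_mb_s:.0f} MB/sec")
print(f"replicated  ~ {replicated_gb_s:.2f} GB/sec")
print(f"per day     ~ {raw_tb_day:.1f} TB raw, {compressed_tb_day:.1f} TB compressed")
print(f"3-day hot   ~ {hot_tb:.1f} TB")
```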

7.6 ClickHouse

Ingest rate:
  impressions:  300K rows/sec
  clicks:         3K rows/sec
  winning_bids: 300K rows/sec (separate table)
  Total:       ~600K rows/sec

Row sizes (after compression):
  impression row: ~60 bytes compressed
  Per day:        26B × 60 = 1.5 TB/day compressed
  90-day total:   135 TB (30 days ≈ 45 TB stay hot on NVMe)

Cluster (per region):
  4 shards × 3 replicas = 12 nodes
  Each: r6g.4xlarge, 4 TB NVMe
  Total: 48 TB per region
  TTL moves > 30-day data to S3 tiered storage.

7.7 CDN

300K impressions/sec × 200 KB avg creative = 60 GB/sec egress
CDN cache hit rate > 95% → origin pulls < 3 GB/sec
Daily egress: 60 GB/sec × 86,400 ≈ 5 PB/day

Unique creatives: ~500K (top 1% serve 60% of requests, heavy head and long tail)
Creative total storage on origin: 500K × 200 KB = 100 GB
CDN POP cache: ~10 GB hot working set per POP

7.8 Summary

Resource	Number
Auction server pods (global)	100
Valkey nodes (global, 3 × 6)	18
Kafka brokers (global, 3 × 10)	30
ClickHouse nodes (global, 3 × 12)	36
Outbound DSP bandwidth	~7.5 GB/sec
CDN egress	~60 GB/sec
Monthly AWS + CDN bill (rough)	$8–12M
Monthly revenue at 10% take rate	~$400M

A reasonable check on the economics: at $4B/month gross spend through the exchange and a 10% take rate, that's about $400M/month in revenue against $8–12M/month in infrastructure. Roughly 2–3% of revenue going to compute and bandwidth is what makes the business work. Smart DSP selection, in-process caching, and Kafka sampling are the three things that keep the cost line that low.

8. Data Model

8.1 Auction state machine

8.2 Core tables (PostgreSQL)

The exchange stores publisher settings, DSP configs, SSP registrations, and billing records. Campaigns, budgets, creatives, and frequency caps don't appear here; those live in DSPs.

sql

CREATE TABLE dsp_configurations (
    id                   UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    dsp_name             VARCHAR(100) NOT NULL UNIQUE,
    bid_endpoint         TEXT NOT NULL,
    win_notice_endpoint  TEXT,
    max_qps              INT NOT NULL DEFAULT 100000,
    timeout_ms           INT NOT NULL DEFAULT 60,
    allowed_categories   TEXT[],
    allowed_geos         TEXT[],
    allowed_formats      TEXT[],
    seat_id              VARCHAR(50),
    circuit_breaker      JSONB NOT NULL DEFAULT '{"err_threshold": 0.5, "timeout_threshold": 0.3, "window_sec": 60, "cooldown_sec": 30}',
    historical_win_rate  DECIMAL(5,4) DEFAULT 0,
    enabled              BOOLEAN NOT NULL DEFAULT true,
    created_at           TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at           TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE publishers (
    id                    UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name                  VARCHAR(255) NOT NULL,
    domain                VARCHAR(255) NOT NULL UNIQUE,
    ssp_id                UUID REFERENCES ssp_configurations(id),
    floor_price_cents     INT NOT NULL DEFAULT 50,
    blocked_categories    TEXT[],
    blocked_advertisers   TEXT[],
    ads_txt_verified      BOOLEAN NOT NULL DEFAULT false,
    revenue_share_pct     DECIMAL(5,2) NOT NULL DEFAULT 85.00,
    created_at            TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE billing_settlements (
    id                        UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    settlement_date           DATE NOT NULL,
    dsp_id                    UUID NOT NULL REFERENCES dsp_configurations(id),
    publisher_id              UUID REFERENCES publishers(id),
    impressions               BIGINT NOT NULL,
    clicks                    BIGINT NOT NULL,
    gross_spend_cents         BIGINT NOT NULL,
    exchange_fee_cents        BIGINT NOT NULL,
    publisher_payout_cents    BIGINT NOT NULL,
    dsp_reported_impressions  BIGINT,
    discrepancy_pct           DECIMAL(5,4),
    status                    VARCHAR(20) NOT NULL DEFAULT 'PENDING',
    created_at                TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_settlements_dsp ON billing_settlements(dsp_id, settlement_date);
CREATE INDEX idx_settlements_pub ON billing_settlements(publisher_id, settlement_date);

ssp_configurations and creative_audit_log follow the same pattern.

8.3 Event schemas (Kafka → ClickHouse)

The impressions table is the billing source of truth.

sql

CREATE TABLE impressions (
    impression_id       String,
    auction_id          String,
    timestamp           DateTime64(3),
    dsp_id              String,
    publisher_id        String,
    publisher_domain    String,
    creative_id         String,
    advertiser_domain   String,
    price_cpm           Float64,
    user_id             Nullable(String),
    device_type         Enum8('desktop'=1, 'mobile'=2, 'tablet'=3, 'ctv'=4),
    geo_country         LowCardinality(String),
    geo_region          LowCardinality(String),
    viewable            Nullable(UInt8),
    viewability_pct     Nullable(Float32),
    time_in_view_ms     Nullable(UInt32),
    is_click            UInt8 DEFAULT 0,
    click_timestamp     Nullable(DateTime64(3))
) ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY (dsp_id, publisher_id, timestamp)
TTL timestamp + INTERVAL 90 DAY;

The auction_results table follows the same shape with winning_price_cpm, num_bids_received, num_dsps_selected, and num_dsps_timeout. Writes are 100% for winners and 1% for losers.

8.4 Valkey keyspace

Key	Structure	TTL	Purpose
`user:{uid}:consent`	Hash	30d	TCF string + consent status
`user:{uid}:cookiesync`	Hash	30d	Exchange UID ↔ DSP buyer UIDs
`user:{uid}:creative:{crid}`	Counter	1h	Exchange-level creative dedup
`dsp:{dsp_id}:spend_today`	Hash	until midnight	Spend, credit limit, credit remaining
`dsp:{dsp_id}:circuit`	Hash	5m	Circuit breaker state
`dsp:{dsp_id}:winrate:{vertical}`	Float	1h	Rolling win rate feeding smart selection
`ivt:ip_blocklist`	Set	1h	Known bot IPs
`ivt:asn_reputation`	Hash	1h	ASN reputation scores (datacenter, residential)
`ivt:ua_patterns`	Set	1h	Suspicious UA regex hits
`pub:{pub_id}:config`	Hash	5m	Publisher config cache
`pub:{domain}:adstxt`	Hash	24h	Cached ads.txt entries

Per-campaign frequency caps, campaign budgets, and advertiser targeting rules don't appear here. Those live in DSPs.

9. API Design

9.1 Bid request (exchange to DSP)

POST /openrtb/2.6/bid
Content-Type: application/json
X-OpenRTB-Version: 2.6

json

{
  "id": "auc_01HXYZ123",
  "imp": [{
    "id": "imp_001",
    "banner": {"w": 300, "h": 250, "pos": 1},
    "bidfloor": 0.50,
    "bidfloorcur": "USD"
  }],
  "site": {
    "domain": "espn.com",
    "page": "https://espn.com/nfl/story/cowboys-giants-recap",
    "cat": ["IAB17"],
    "publisher": {"id": "pub_espn", "domain": "espn.com"}
  },
  "user": {
    "id": "uid_user_abc",
    "buyeruid": "ttd_user_xyz",
    "geo": {"country": "USA", "region": "TX", "city": "Austin"},
    "consent": "CPXxRfAPXxRfAAfKAB..."
  },
  "device": {
    "ua": "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X)...",
    "ip": "198.51.100.42",
    "devicetype": 4,
    "os": "iOS"
  },
  "regs": {"coppa": 0, "gdpr": 0},
  "tmax": 60,
  "at": 1,
  "cur": ["USD"],
  "source": {
    "ext": {
      "schain": {
        "ver": "1.0",
        "complete": 1,
        "nodes": [{"asi": "adx.google.com", "sid": "pub_espn", "hp": 1}]
      }
    }
  }
}

DSP bid response (200 OK):

json

{
  "id": "auc_01HXYZ123",
  "seatbid": [{
    "bid": [{
      "id": "bid_ttd_001",
      "impid": "imp_001",
      "price": 8.50,
      "adm": "<div id='ad-${AUCTION_ID}'><a href='${CLICK_URL}https://nike.com/pegasus-41'><img src='https://cdn.nike.com/cr/pegasus_300x250.jpg' width='300' height='250'/></a><img src='${IMPRESSION_URL}' style='display:none'/></div>",
      "crid": "cr_nike_pegasus_01",
      "w": 300,
      "h": 250,
      "adomain": ["nike.com"],
      "cat": ["IAB18"]
    }],
    "seat": "seat_nike"
  }],
  "cur": "USD"
}

DSP no-bid: 204 No Content.

9.2 Win notice (exchange to DSP)

POST /win
{
  "auction_id": "auc_01HXYZ123",
  "bid_id": "bid_ttd_001",
  "imp_id": "imp_001",
  "price": 8.50,
  "currency": "USD",
  "timestamp": "2026-05-29T14:32:00.045Z"
}

9.3 Impression tracking

GET /t/imp?auc=auc_01HXYZ123&imp=imp_001&price=enc_xyz&dsp=ttd&pub=pub_espn&crid=cr_nike_pegasus_01
→ 200 OK, Content-Type: image/gif, 43-byte transparent GIF

The endpoint parses query params, decrypts the price token, writes to Valkey for dedup, produces to Kafka asynchronously, and returns the pixel. Target p99 under 5ms.

9.4 Click redirect

GET /t/click?auc=auc_01HXYZ123&imp=imp_001&dest=https%3A%2F%2Fnike.com%2Fpegasus-41
→ 302 Found
Location: https://nike.com/pegasus-41

Destination URLs are validated against a whitelist pattern (allowed schemes, no open-redirect loops) before the 302 is emitted.
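A minimal sketch of that destination check, using only the standard library. The tracking hostname, the HTTPS-only rule, and the idea of passing in the advertiser domains declared for the winning bid are all assumptions made for illustration, not the exchange's actual policy.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}                  # assumption: HTTPS-only landing pages
EXCHANGE_HOSTS = {"track.exchange.example"}  # hypothetical tracking host


def is_safe_destination(dest: str, allowed_domains: set) -> bool:
    """True only if dest is an absolute URL on an allowed advertiser domain."""
    try:
        parts = urlsplit(dest)
    except ValueError:
        return False
    host = (parts.hostname or "").lower()
    if parts.scheme not in ALLOWED_SCHEMES or not host:
        return False
    if parts.username or parts.password:     # reject user@host tricks
        return False
    if host in EXCHANGE_HOSTS:               # no redirect loops back to the tracker
        return False
    # Exact match or a true subdomain; "nike.com.evil.io" must not pass.
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


print(is_safe_destination("https://nike.com/pegasus-41", {"nike.com"}))
print(is_safe_destination("https://nike.com.evil.io/", {"nike.com"}))
```

Matching on the parsed hostname rather than on the raw string is the important part: substring checks are what open-redirect payloads are designed to slip past.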

9.5 Publisher config API

PUT /v1/publishers/{publisher_id}/config
{
  "floor_price_cents": 75,
  "blocked_categories": ["IAB25", "IAB26"],
  "blocked_advertiser_domains": ["competitor.com"],
  "revenue_share_pct": 85.00
}

9.6 DSP onboarding

POST /v1/dsps
{
  "dsp_name": "Example DSP",
  "bid_endpoint": "https://dsp.example.com/bid",
  "win_notice_endpoint": "https://dsp.example.com/win",
  "max_qps": 100000,
  "timeout_ms": 60,
  "allowed_categories": ["IAB1", "IAB17"],
  "allowed_geos": ["US", "CA"],
  "seat_id": "seat_example_001"
}
→ 201 Created { "id": "...", "status": "SANDBOX", "mtls_cert_url": "..." }

9.7 Settlement report

GET /v1/settlements?dsp_id=ttd&start=2026-06-01&end=2026-06-07
{
  "dsp_id": "ttd",
  "period": {"start": "2026-06-01", "end": "2026-06-07"},
  "totals": {
    "impressions": 1750000000,
    "clicks": 17500000,
    "gross_spend_cents": 14000000000,
    "exchange_fee_cents": 1400000000,
    "publisher_payout_cents": 12600000000,
    "dsp_reported_impressions": 1749640000,
    "discrepancy_pct": 0.0002
  }
}

10. Deep Dives

10.1 RTB auction flow, end to end

The example from §1 maps onto the timing breakdown below. The exchange's part is steps 4–8, which is the 43ms of actual work it does between receiving the request from Magnite and returning the winning markup.

Step	Component	Time	Running
1	ESPN page HTML loads, Prebid.js executes	20 ms	20 ms
2	Prebid → Magnite SSP bid request	3 ms	23 ms
3	Magnite → AdX network hop	3 ms	26 ms
4	LRU cache hit: consent, cookie-sync, DSP flags	0.1 ms	26.1 ms
5	Pre-bid fraud filter + ads.txt validate	1 ms	27.1 ms
6	Smart DSP selection (top 5 of 50)	1 ms	28.1 ms
7	DSP fan-out parallel (60 ms timeout, arrives ~40 ms)	40 ms	68.1 ms
8	First-price auction + floor + macro sub	1 ms	69.1 ms
9	AdX → Magnite network hop	3 ms	72.1 ms
10	Magnite → Prebid (selected as winner across SSPs)	3 ms	75.1 ms
11	GAM direct-deal check + render decision	5 ms	80.1 ms
12	CloudFront creative fetch (Austin POP cache hit)	12 ms	92.1 ms
13	Browser renders 300×250	15 ms	107.1 ms
14	Impression pixel fires (async)	2 ms	109.1 ms

The entire page-load-to-ad-visible budget is around 110ms in this example. The exchange's portion is 43ms; the rest is network transit, SSP coordination, GAM decisioning, and browser rendering.

The Go fan-out code:

func (e *Exchange) runAuction(ctx context.Context, req *openrtb.BidRequest) (*AuctionResult, error) {
    ctx, cancel := context.WithTimeout(ctx, 60*time.Millisecond)
    defer cancel()

    // Top-5 smart selection (see §10.2)
    dsps := e.dspSelector.SelectTopN(req, 5)

    bidChan := make(chan *DSPBid, len(dsps))
    for _, dsp := range dsps {
        go func(d *DSPConfig) {
            bid, err := e.sendBidRequest(ctx, d, req)
            if err != nil {
                e.metrics.DSPError(d.ID, err)
                bidChan <- nil
                return
            }
            bidChan <- bid
        }(dsp)
    }

    var bids []*DSPBid
    received := 0
Loop:
    for received < len(dsps) {
        select {
        case bid := <-bidChan:
            received++
            if bid != nil && bid.Price > 0 {
                bids = append(bids, bid)
            }
        case <-ctx.Done():
            e.metrics.AuctionTimeout(len(bids), len(dsps)-received)
            break Loop
        }
    }

    if len(bids) == 0 {
        return &AuctionResult{Filled: false}, nil
    }
    return e.firstPriceAuction(bids, req.Imp[0].BidFloor), nil
}

func (e *Exchange) firstPriceAuction(bids []*DSPBid, floor float64) *AuctionResult {
    var winner *DSPBid
    for _, b := range bids {
        if b.Price < floor {
            continue
        }
        if winner == nil || b.Price > winner.Price {
            winner = b
        }
    }
    if winner == nil {
        return &AuctionResult{Filled: false}
    }
    return &AuctionResult{Filled: true, Winner: winner, Price: winner.Price}
}

10.2 Smart DSP selection

The default fan-out instinct is to ask every DSP every time. At 50 DSPs and a million requests per second that means 50 million outbound bid requests per second, and the auction is bottlenecked on the slowest of fifty different bidders. Picking the five most likely to bid competitively cuts the outbound traffic by 10x, drops the per-DSP costs by the same margin, and barely moves the fill rate.

The scoring function combines hard filters (geo, format, category, capacity, circuit-breaker state) with two continuous signals: historical win rate for this segment, and a pacing factor based on how aggressively the DSP is currently spending. The hard filters drop any DSP that obviously can't or shouldn't bid. The continuous score ranks the survivors.

type DSPScore struct {
    DSPID string
    Score float64
}

func (s *DSPSelector) SelectTopN(req *BidRequest, n int) []*DSPConfig {
    geo := req.User.Geo.Country
    format := req.Imp[0].Format()
    vertical := req.Site.Cat[0]

    var candidates []DSPScore
    for _, dsp := range s.registry.All() {
        // Hard filters
        if !dsp.AcceptsGeo(geo) { continue }
        if !dsp.AcceptsFormat(format) { continue }
        if !dsp.AcceptsCategory(vertical) { continue }
        if !dsp.CapacityAvailable() { continue }
        if dsp.CircuitBreakerOpen() { continue }

        // Continuous score
        winRate := dsp.HistoricalWinRate(geo, vertical, format)
        pacing := dsp.PacingFactor()

        candidates = append(candidates, DSPScore{
            DSPID: dsp.ID,
            Score: winRate * pacing,
        })
    }

    sort.Slice(candidates, func(i, j int) bool {
        return candidates[i].Score > candidates[j].Score
    })
    // 10% exploration: swap the lowest-ranked top-N slot for a random DSP that
    // passed the hard filters but missed the cut, so new demand gets discovered.
    // Guarding on len(candidates) > n also avoids indexing past the end of the
    // slice when fewer than n DSPs survive the filters.
    if len(candidates) > n {
        if rand.Float64() < 0.10 {
            j := n + rand.Intn(len(candidates)-n)
            candidates[n-1] = candidates[j]
        }
        candidates = candidates[:n]
    }

    return s.registry.Resolve(candidates)
}

A small exploration bonus matters more than it looks. Without it, any DSP that starts with a zero win rate stays at zero forever. The fix is to replace the lowest-ranked top-5 slot with a randomly chosen non-top DSP about 10% of the time. New DSPs get a fair shot, win rates update, and the feedback loop promotes them into the regular top-N when they earn it.

The win-rate input itself comes from a Flink job (or a simpler Kafka consumer if Flink feels heavy) that aggregates the last 60 minutes of auction outcomes keyed by (dsp_id, geo, vertical, format) and writes the result to Valkey. The auction servers cache it in-process with a one-minute TTL.
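For the "simpler Kafka consumer" variant, the aggregation itself is small. The sketch below keeps per-minute buckets for each (dsp_id, geo, vertical, format) key and drops buckets older than 60 minutes. The class and method names are made up for illustration, it assumes events for a key arrive in roughly time order, and the write to Valkey is left out.

```python
from collections import defaultdict, deque

WINDOW_SEC = 60 * 60  # last 60 minutes


class RollingWinRate:
    """Per-segment win rate over a sliding window, bucketed by minute."""

    def __init__(self):
        # key -> deque of [minute, bids, wins], oldest bucket first
        self._buckets = defaultdict(deque)

    def record(self, key: tuple, ts: float, won: bool) -> None:
        minute = int(ts // 60)
        q = self._buckets[key]
        if not q or q[-1][0] != minute:
            q.append([minute, 0, 0])
        q[-1][1] += 1
        q[-1][2] += int(won)

    def win_rate(self, key: tuple, now: float) -> float:
        q = self._buckets[key]
        cutoff = int((now - WINDOW_SEC) // 60)
        while q and q[0][0] <= cutoff:       # expire buckets outside the window
            q.popleft()
        bids = sum(b for _, b, _ in q)
        wins = sum(w for _, _, w in q)
        return wins / bids if bids else 0.0


agg = RollingWinRate()
key = ("ttd", "USA", "IAB17", "banner")
agg.record(key, ts=0, won=True)
agg.record(key, ts=30, won=True)
agg.record(key, ts=120, won=False)
agg.record(key, ts=130, won=True)
print(agg.win_rate(key, now=200))
```

Minute buckets keep memory bounded at 60 entries per key regardless of traffic, which matters when the key space is the cross product of DSP, geo, vertical, and format.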

The bandwidth saving from this single change:

Strategy	DSP requests/sec	Outbound bandwidth	DSPs touched
All 50 (naive)	50M	50 GB/sec	50 per auction
Top 5 (smart)	5M	5 GB/sec	5 per auction
Fill-rate delta	—	—	< 2% drop

A 10× cut in DSP-side cost and bandwidth, against a fill-rate hit small enough to be in the noise of normal day-to-day variation.

10.3 Auction types

First-price auctions are what the industry runs now. The winner pays their bid, the math is trivial, and there's nothing for the exchange to manipulate.

The earlier model (second-price, where the winner pays one cent more than the second-highest bid) encouraged truthful bidding in theory but invited "last look" abuse in practice, where exchanges with knowledge of all bids could give favored buyers a chance to bid one cent above the clearing price. First-price killed that ambiguity by making the clearing price equal to the bid.
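The difference between the two pricing rules fits in one function. This is an illustrative sketch, not the exchange's auction code (that is the Go in §10.1); the one-cent increment follows the description above, and the rule that a lone bidder pays the floor is an assumption.

```python
from typing import List, Optional


def clearing_price(bids: List[float], floor: float, auction: str) -> Optional[float]:
    """CPM the winner pays, or None if no bid clears the floor."""
    eligible = sorted((b for b in bids if b >= floor), reverse=True)
    if not eligible:
        return None
    if auction == "first":
        return eligible[0]                   # winner pays exactly what it bid
    # Second price: one cent above the runner-up, or the floor with one bidder.
    if len(eligible) > 1:
        price = round(eligible[1] + 0.01, 2)
    else:
        price = floor
    return min(eligible[0], price)           # never more than the winning bid


bids = [8.50, 6.20, 0.30]
print(clearing_price(bids, floor=0.50, auction="first"))
print(clearing_price(bids, floor=0.50, auction="second"))
```

In the second-price case the clearing price depends on a bid the winner never sees, which is the gap that "last look" exploited. In the first-price case there is nothing hidden to exploit.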

Header bidding versus waterfall is a separate question. The waterfall model called exchanges sequentially (try exchange A first, then B, then C) which was slow and left money on the table because a higher bid in a later exchange would never see daylight. Header bidding (Prebid.js in the browser, or its server-side equivalent) calls all exchanges in parallel and picks the highest bid across all of them. Server-side header bidding is what mature publishers run today; it removes the browser-side latency cost while keeping the parallel competition.

10.4 DSP bid optimization

This is opaque to the exchange but worth understanding because DSP behavior is what determines fill rate and per-auction latency. A typical DSP bidder runs through something like:

python

from typing import Optional


class BidOptimizer:
    def compute_bid(self, req: BidRequest, campaign: Campaign) -> Optional[float]:
        features = self.extract_features(req, campaign)

        # ML: LightGBM or deep learning trained on historical data
        pctr = self.ctr_model.predict(features)
        pcvr = self.cvr_model.predict(features)

        # Expected value in dollars per impression
        if campaign.bid_strategy == "CPA":
            expected_value = pctr * pcvr * campaign.target_cpa
        elif campaign.bid_strategy == "CPC":
            expected_value = pctr * campaign.max_cpc
        else:  # CPM
            expected_value = campaign.max_cpm / 1000

        # Bid shading for first-price auctions
        shading = self.shading_model.predict(features)  # typically 0.5-0.85
        # OpenRTB bids and floors are CPM, so convert before comparing or returning
        bid_cpm = expected_value * 1000 * shading

        # Internal checks invisible to the exchange
        if not self.budget_allows(campaign, bid_cpm): return None
        if not self.frequency_allows(req.user_id, campaign.id): return None
        if bid_cpm < req.imp[0].bidfloor: return None

        return bid_cpm

For the example from §1, TTD's pCTR was about 8% (very high; the user was a fresh cart abandoner looking at the same product they'd left in their cart the night before), pCVR was around 12%, and the expected value worked out to $1.25 per impression, which is a maximum CPM well over $1,000. The shading model knew that ESPN sports impressions usually clear around $8 and shaded down to $8.50, comfortably above the predicted clearing price and far below the maximum. That is a much steeper cut than the usual 0.5–0.85 shading factor, because the valuation sat more than a hundred times above the market price.

10.5 Ad server and creative delivery

Creatives are uploaded by advertisers into their DSP, not into the exchange. The DSP pushes them to a creative CDN of its own choosing. The exchange runs an asynchronous malware and policy scan on first-seen creative_id values from DSP responses and caches the verdict; first-time creatives are allowed through optimistically and flagged for backfill scanning, with a fast block path for anything that fails.

At runtime the DSP returns either HTML (for banner) or VAST XML (for video) in the adm field of the bid response. The exchange substitutes a small set of macros before forwarding the markup to the SSP:

Macro	Replaced With
`${AUCTION_ID}`	Auction identifier
`${AUCTION_PRICE}`	Encrypted price token (AES-256-GCM)
`${CLICK_URL}`	Exchange click-tracking URL
`${IMPRESSION_URL}`	Exchange impression-pixel URL
`${CACHE_BUSTER}`	Random number to defeat caching on tracking pixels

Substitution is a single pass over the markup string, so the cost is negligible compared to the auction itself. The encrypted price token is the part that matters for billing. It's an AES-256-GCM ciphertext containing the clearing price and a timestamp. Encrypting it stops the publisher, the SSP, or any browser extension on the user's machine from reading or forging the price, which is what would otherwise let them reverse-engineer bid patterns or tamper with billing.
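The single-pass substitution can be sketched with one regular expression. The function name and the choice to leave unknown macros untouched are assumptions for illustration; the macro names are the ones in the table above, and the URLs in the usage example are placeholders.

```python
import re

MACRO_RE = re.compile(r"\$\{([A-Z_]+)\}")


def substitute_macros(adm: str, values: dict) -> str:
    """Replace ${NAME} macros in one pass; unknown macros are left as-is."""
    return MACRO_RE.sub(lambda m: values.get(m.group(1), m.group(0)), adm)


adm = "<div id='ad-${AUCTION_ID}'><img src='${IMPRESSION_URL}'/></div>"
values = {
    "AUCTION_ID": "auc_01HXYZ123",
    "IMPRESSION_URL": "https://track.exchange.example/t/imp?auc=auc_01HXYZ123",
}
print(substitute_macros(adm, values))
```

Because `re.sub` never rescans the text it has just inserted, a substituted value that happens to contain `${...}` is not expanded a second time. That is the property that keeps a DSP from smuggling one macro inside another.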

Viewability is measured with an IntersectionObserver-based beacon against the MRC/IAB standard: at least 50% of pixels in the viewport for at least 1 continuous second for display, 2 continuous seconds for video. The beacon writes to Kafka via the tracking endpoint. IAS and DoubleVerify publish their own independent beacons too; the exchange's is a backup, not the primary measurement that bills go against.
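The viewability rule itself is a small state machine over visibility samples. The sketch below assumes the beacon reports periodic (timestamp, visible fraction) samples in time order and treats the time requirement as continuous; both are assumptions about the beacon, and the function name is made up for illustration.

```python
from typing import List, Tuple


def is_viewable(samples: List[Tuple[int, float]], video: bool = False) -> bool:
    """samples: (timestamp_ms, visible_fraction) pairs in time order.

    True once the ad has been at least 50% visible for a continuous
    1 second (display) or 2 seconds (video).
    """
    required_ms = 2000 if video else 1000
    run_start = None
    for ts, fraction in samples:
        if fraction >= 0.5:
            if run_start is None:
                run_start = ts               # a qualifying run begins here
            if ts - run_start >= required_ms:
                return True
        else:
            run_start = None                 # dropped below 50%: start over
    return False


print(is_viewable([(0, 0.6), (500, 0.8), (1000, 0.7)]))
print(is_viewable([(0, 0.6), (500, 0.3), (1200, 0.9), (1900, 0.9)]))
```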

For video the exchange returns VAST 4.2 XML instead of HTML. The DSP supplies the media-file URLs, the exchange wraps them with impression, click, and quartile tracking events pointing at its own tracking endpoint, and the video player on the publisher's page fires those events as the video plays.

10.6 Per-DSP spend tracking and settlement

The exchange is the financial middleman: money flows advertiser → DSP → exchange → publisher, with the exchange's take rate deducted at settlement time rather than per auction. What the exchange tracks in real time is per-DSP spend for the day, for two reasons. The first is credit limits: many DSPs run on prepaid balances, and a DSP exceeding its balance has to be cut off from auctions within seconds, not hours. The second is anomaly detection: a sudden 10× spike in a DSP's spend velocity is usually a compromised account or a runaway campaign.

python

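from datetime import datetime, timezone


class DSPSpendAggregator:
    def process_impression(self, dsp_id: str, imp: ImpressionEvent):
        state = self.get_state(dsp_id)
        # price_cpm is dollars per 1,000 impressions, so one impression costs
        # price_cpm * 1000 micro-dollars. Accumulate in micros (spend_today_micros
        # is an assumed extra field): converting each impression to whole cents
        # would truncate anything under a $10 CPM to zero.
        state.spend_today_micros += round(imp.price_cpm * 1000)
        state.spend_today_cents = state.spend_today_micros // 10_000
        state.impressions_today += 1

        self.valkey.hset(
            f"dsp:{dsp_id}:spend_today",
            mapping={
                "spend_cents":      state.spend_today_cents,
                "impressions":      state.impressions_today,
                "credit_limit":     state.credit_limit_cents,
                "credit_remaining": state.credit_limit_cents - state.spend_today_cents,
                "last_update":      datetime.now(timezone.utc).isoformat(),
            }
        )

        # Cut the DSP off once it has spent its credit limit
        if state.spend_today_cents >= state.credit_limit_cents:
            self.valkey.set(f"dsp:{dsp_id}:credit_blocked", "1", ex=3600)
            self.alert(f"DSP {dsp_id} exceeded credit limit")

        # Flag spend running well ahead of the expected daily total
        if state.spend_today_cents > state.expected_daily * 1.5:
            self.alert(f"DSP {dsp_id} spend anomaly")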

The auction server's check is a single lookup against the in-process cache for the dsp:{dsp_id}:credit_blocked key. The lag from impression to enforcement is a Flink tumbling window (1 second), a Valkey write (1 millisecond), and an in-process cache TTL (5 seconds), so the worst case is roughly 6 seconds of unchecked spend after a DSP crosses its limit. A $100K daily limit spent evenly is about $1.16 per second, so the worst-case overshoot is roughly $7 in total, which is acceptable in exchange for not blocking the auction path on a synchronous credit check.

Daily settlement runs at 02:00 UTC. It queries ClickHouse for spend, impressions, and clicks grouped by (dsp_id, publisher_id) for the prior day, applies the take rate, and writes a row per pair to billing_settlements. Any row whose discrepancy with the DSP-reported number is over 0.01% goes into a manual review queue. Payment batches go to publishers and invoices to DSPs once the rows clear review.
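The per-row settlement math can be sketched as follows. It assumes the 10% take rate from §7.8 and the 0.01% review threshold above; the dataclass and function names are illustrative, and the ClickHouse query and the PostgreSQL write are left out. The usage example feeds in the totals from the §9.7 sample report.

```python
from dataclasses import dataclass

TAKE_RATE_BPS = 1_000        # 10% take rate, in basis points
REVIEW_THRESHOLD = 0.0001    # 0.01% written as a fraction


@dataclass
class SettlementRow:
    gross_spend_cents: int
    exchange_fee_cents: int
    publisher_payout_cents: int
    discrepancy: float
    needs_review: bool


def settle(gross_spend_cents: int, impressions: int, dsp_reported: int) -> SettlementRow:
    # Integer math keeps fee + payout exactly equal to gross spend.
    fee = gross_spend_cents * TAKE_RATE_BPS // 10_000
    payout = gross_spend_cents - fee
    discrepancy = abs(impressions - dsp_reported) / impressions if impressions else 0.0
    return SettlementRow(gross_spend_cents, fee, payout, discrepancy,
                         discrepancy > REVIEW_THRESHOLD)


row = settle(14_000_000_000, 1_750_000_000, 1_749_640_000)
print(row)
```

With those inputs the discrepancy comes out at about 0.02%, so under a strict 0.01% rule that sample row would land in the manual review queue.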

10.7 Impression and click tracking

The tracking pipeline is the source of truth for billing. Lose events and the exchange under-bills (revenue gone) or over-bills (trust gone). The whole pipeline is built with that in mind: dedupe in Valkey, produce to Kafka asynchronously, return the pixel quickly, fail open rather than fail closed.

Deduplication exists because pixels fire more than once in the wild. Browsers retry on flaky connections. Ad slots refresh and re-fire the pixel. Buggy ad tags double-fire. Without dedupe, an impression gets billed twice and the advertiser quite reasonably gets upset. The dedupe key is impression:{auction_id}:{imp_id} with a one-hour TTL, written via SETNX. On Valkey errors the code intentionally fails open and treats the impression as new, because over-counting by a fraction of a percent is much less harmful than under-counting and losing real revenue.

func (t *TrackingService) handleImpression(w http.ResponseWriter, r *http.Request) {
    auctionID := r.URL.Query().Get("auc")
    impID     := r.URL.Query().Get("imp")
    encPrice  := r.URL.Query().Get("price")

    price, err := t.decryptPrice(encPrice)
    if err != nil {
        t.metrics.InvalidPrice.Inc()
        t.servePixel(w); return
    }

    dedupKey := fmt.Sprintf("impression:%s:%s", auctionID, impID)
    isNew, err := t.valkey.SetNX(r.Context(), dedupKey, "1", time.Hour).Result()
    if err != nil {
        // Fail open: better to slightly over-count than lose revenue
        isNew = true
    }

    if isNew {
        t.kafka.ProduceAsync("impressions", auctionID, &ImpressionEvent{
            AuctionID:   auctionID,
            ImpID:       impID,
            Price:       price,
            DSPID:       r.URL.Query().Get("dsp"),
            PublisherID: r.URL.Query().Get("pub"),
            CreativeID:  r.URL.Query().Get("crid"),
            Timestamp:   time.Now(),
            UserAgent:   r.UserAgent(),
            IP:          extractIP(r),
        })
    }

    t.servePixel(w)
}

Click handling is structurally identical, with a 302 Location header instead of a transparent GIF and a destination-URL whitelist check to stop open-redirect abuse.

10.8 Fraud detection beyond IP blocklists

Basic fraud detection (blocking known bot IPs and obvious user-agent patterns) catches maybe 30% of invalid traffic on a good day. The rest needs a deeper stack. Some signals run pre-bid in the auction path because they have to be sub-millisecond; others run post-bid in a Flink job that updates reputation scores, which then feed back into the next round of pre-bid filters.

IP reputation is the cheapest signal: a Valkey set refreshed hourly from threat-intel feeds. ASN and datacenter detection use a MaxMind lookup; AWS, GCP, Azure, and DigitalOcean ASNs get flagged as datacenter traffic, which catches server-rented bot pools. User-agent entropy checks for signatures that are statistically too common for real browsers, which catches botnets running the same UA across thousands of requests. Headless browser fingerprints look for missing WebGL, missing canvas, and the canary flags Chrome sets in headless mode (when those signals make it through schain.ext).

After the impression renders, the JavaScript beacon adds another layer: cursor entropy, scroll velocity, time-in-view duration, and the time between impression and click. A click that fires 200ms after the impression is almost certainly automated. Time-in-view measurements catch ad stacking, where multiple ads are layered on top of each other so only the top one is actually visible. A per-publisher rolling viewability rate catches inventory that's quietly degrading.

Supply-chain validation (ads.txt, sellers.json, the schain object) closes the domain-spoofing loop and is covered in §15.4. Daily DSP reconciliation catches anything that slipped through both pre-bid and post-bid by comparing exchange-tracked impressions with DSP-reported impressions; any large discrepancy goes into the same manual review queue as billing disputes.

Pre-bid catches roughly 70% of known fraud in a typical month, post-bid another 20%. The remaining 10% is what drives the daily reconciliation work and the per-publisher viewability monitoring.

One more category that's easy to forget: malicious creative markup. A DSP can embed JavaScript in its adm field that does things the exchange didn't sign up for. The fix is the asynchronous creative scan from §10.5: first-seen creatives are allowed through optimistically but scanned in the background, anything that fails is blocked, and the DSP's circuit breaker counter increments.

11. Bottlenecks

There are eight things in this design that could become bottlenecks under load. A few of them only matter in theory; a couple actually bite in practice.

The one that gets the most theoretical attention is DSP fan-out. At a million QPS with 50 registered DSPs, naive fan-out would be 50 million outbound HTTP/2 requests per second, which would saturate both bandwidth and the CPU spent on serialization. This is the bottleneck that smart DSP selection (§10.2) exists to remove: top-5 cuts it to 5 million per second, which fits comfortably in the budget.

Valkey hot keys are the next obvious worry, with a viral page or a celebrity-page event causing millions of lookups for the same user ID. In practice the in-process LRU absorbs this almost completely. The 5-second TTL is short enough that staleness isn't an issue, and the same user appears in many simultaneous auctions during traffic spikes, which is exactly when LRU hit rate goes up. Cache hit rate stays above 95% even during the worst spikes seen in production.
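The in-process cache described here is an LRU with a short TTL on top. A minimal single-threaded sketch follows; the class name and the `loader` callback (standing in for the Valkey read) are assumptions, and a real auction server would need locking or per-core sharding around it.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """In-process LRU cache whose entries also expire after ttl seconds."""

    def __init__(self, capacity: int, ttl: float = 5.0, clock=time.monotonic):
        self._capacity = capacity
        self._ttl = ttl
        self._clock = clock
        self._data = OrderedDict()           # key -> (expires_at, value)

    def get(self, key, loader):
        """Return the cached value, or call loader(key) on a miss or expiry."""
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[0] > now:
            self._data.move_to_end(key)      # mark as most recently used
            return entry[1]
        value = loader(key)                  # e.g. the Valkey lookup
        self._data[key] = (now + self._ttl, value)
        self._data.move_to_end(key)
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)   # evict least recently used
        return value


cache = TTLLRUCache(capacity=100_000, ttl=5.0)
print(cache.get("user:abc:consent", lambda k: {"status": "granted"}))
```

A hot key costs at most one Valkey read per pod per TTL period, no matter how many auctions ask for it, which is why spikes push the hit rate up rather than down.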

Kafka ingestion is the bottleneck that does require care. At a million QPS, naively logging every bid (winning and losing) would be roughly 5 GB/sec, tripled by replication to 15 GB/sec written. The fix is sampling: 100% of impressions and winning bids, 1% of losing bids. The full bid stream gets teed directly to S3/Iceberg through Kafka Connect, which is much cheaper durable storage than hot Kafka and handles billing-dispute lookups without burning hot capacity.

ClickHouse query and ingestion compete for resources during peak dashboard usage. The fix is to run two clusters: one ingestion-only (Kafka consumer) and one query-only (dashboards). Materialized views handle the most common queries so analyst sessions don't reach into the raw tables more than they need to.

The tracking endpoint sees spikes of around 300K impression pixels per second and has to return inside 5ms or it starts blocking page rendering. It runs as a separate pod group with a deliberately tiny code path: parse query string, decrypt price, dedupe in Valkey, produce to Kafka asynchronously, return the GIF. No database writes on the hot path, no synchronous downstream calls.

CDN origin pulls hurt during creative rotation. When a new campaign launches with brand-new creatives, the CDN cache is cold and origin gets hit hard. The fix is a combination of DSPs pre-warming creatives to all POPs before campaign start, and the auction server quietly deprioritizing bids that point at uncached creative URLs for the first 60 seconds after a creative first appears.

DSP spend tracking lag (the 6-second worst case from Flink window plus Valkey write plus LRU TTL) is a minor source of credit-limit overshoot. Worth monitoring (the alert fires if the window grows past 10 seconds), but the dollar exposure is tiny relative to daily spend.

The slowest DSP in the top-5 is the bottleneck that bounds per-auction latency once everything else is healthy. Adaptive timeouts help: DSPs that consistently respond fast get a generous 55ms timeout, ones that consistently respond slow get a strict 35ms timeout, which has the side effect of pushing them out of the top-5 entirely once their win rate decays. The circuit breaker (§12.2) handles the harder failure modes.

12. Failure Scenarios

12.1 Valkey cluster failure

Valkey going down is unpleasant but not fatal. The in-process LRU keeps serving for its 5-second TTL, which buys a bit of breathing room. Cache misses fall through to nothing: the auction server runs in degraded mode where it strips PII (since consent state is unknown), skips fraud lookups (since the blocklists are unreachable), and skips DSP credit checks. Fill rate drops because DSPs see less data and tend to bid lower, but the exchange keeps earning revenue and the failure isn't a customer-visible outage.

The detection path uses health checks with a 3-second window. When the circuit breaker opens, every auction server flips into degraded mode within seconds. The on-call gets paged and the strict-PII flag is enabled to make sure no consent-needed data accidentally goes out.

12.2 DSP unresponsive

A DSP going slow or returning errors is much more common than total Valkey failure. The fix is a per-DSP circuit breaker with three states: closed (normal), open (all requests rejected immediately, DSP excluded from selection), and half-open (1% probe requests to detect recovery). Transition rules: closed flips to open when the error rate exceeds 50% or the timeout rate exceeds 30% in a 60-second sliding window. Open flips to half-open after 30 seconds of cooldown. Half-open flips back to closed if 10 consecutive probes succeed, or back to open if any probe fails.

Circuit breaker state lives in Valkey under dsp:{dsp_id}:circuit and gets refreshed into the in-process cache once a second. When a DSP is in the open state, smart selection skips it and the next-best DSP slides in to take its place in the top-5, so the auction barely notices. The one alert that matters here is the revenue impact alert: if excluding a top-5-by-revenue DSP drops total revenue by more than 5%, the on-call gets paged for a manual look.
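The three-state breaker and its transition rules can be sketched as a small state machine. This is a single-process illustration under stated assumptions: the class name, the `min_samples` guard (so a handful of requests cannot trip the breaker) and the injectable `clock` are not from the design above, and the caller is expected to mark roughly 1% of requests as probes while the circuit is half-open.

```python
import time
from collections import deque


class DspCircuitBreaker:
    def __init__(self, err_threshold=0.5, timeout_threshold=0.3, window_sec=60,
                 cooldown_sec=30, probes_to_close=10, min_samples=20,
                 clock=time.monotonic):
        self.err_threshold = err_threshold
        self.timeout_threshold = timeout_threshold
        self.window_sec = window_sec
        self.cooldown_sec = cooldown_sec
        self.probes_to_close = probes_to_close
        self.min_samples = min_samples  # assumption: ignore very small samples
        self.clock = clock
        self.state = "closed"
        self._events = deque()  # (timestamp, outcome) pairs inside the window
        self._opened_at = 0.0
        self._probe_ok = 0

    def _open(self, now):
        self.state = "open"
        self._opened_at = now
        self._events.clear()

    def allow(self, is_probe=False):
        """True if a request may be sent to this DSP right now."""
        if self.state == "open" and self.clock() - self._opened_at >= self.cooldown_sec:
            self.state = "half_open"
            self._probe_ok = 0
        if self.state == "closed":
            return True
        return self.state == "half_open" and is_probe

    def record(self, outcome):
        """outcome is one of 'ok', 'error', 'timeout'."""
        now = self.clock()
        if self.state == "half_open":
            if outcome != "ok":
                self._open(now)  # any failed probe reopens the circuit
            else:
                self._probe_ok += 1
                if self._probe_ok >= self.probes_to_close:
                    self.state = "closed"
            return
        if self.state == "open":
            return  # late responses while open are ignored
        self._events.append((now, outcome))
        while now - self._events[0][0] > self.window_sec:
            self._events.popleft()
        total = len(self._events)
        if total < self.min_samples:
            return
        errors = sum(1 for _, o in self._events if o == "error")
        timeouts = sum(1 for _, o in self._events if o == "timeout")
        if errors / total > self.err_threshold or timeouts / total > self.timeout_threshold:
            self._open(now)
```

Recounting the window on every call keeps the sketch short; a production version would keep running counters instead.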

12.3 Kafka degradation

Slow or partially unavailable Kafka means the auction server can't produce events at the normal rate. Each pod buffers up to 100K events in an in-memory ring (about 150 MB) and drains it once Kafka recovers. If the ring fills, events spill to local disk as WAL files, replayed on recovery. DSP spend tracking falls back to the last-known Valkey values during the outage, which means some DSPs may slightly exceed their credit limits; it's all caught and corrected on the daily reconciliation.
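A rough sketch of the buffer-then-spill behaviour, assuming JSON-lines WAL files and a caller-supplied `send` function standing in for the Kafka producer. The class and method names are hypothetical, and a real implementation would batch disk writes rather than open the file per event.

```python
import json
import os
from collections import deque


class EventBuffer:
    """Bounded in-memory buffer that spills overflow to a local WAL file."""

    def __init__(self, capacity, wal_path):
        self.capacity = capacity
        self.wal_path = wal_path
        self.ring = deque()

    def add(self, event):
        if len(self.ring) < self.capacity:
            self.ring.append(event)
            return
        # Memory is full: append to the WAL so the event survives until recovery.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps(event) + "\n")

    def drain(self, send):
        """Replay everything in arrival order once Kafka is healthy again."""
        sent = 0
        while self.ring:
            send(self.ring.popleft())
            sent += 1
        if os.path.exists(self.wal_path):
            with open(self.wal_path, encoding="utf-8") as wal:
                for line in wal:
                    send(json.loads(line))
                    sent += 1
            os.remove(self.wal_path)
        return sent
```

Because the in-memory buffer only empties during `drain`, everything in the WAL is newer than everything in memory, so draining memory first preserves arrival order.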

12.4 CDN origin failure

S3 origin going down means the CDN edges can't pull cache misses. stale-while-revalidate headers let already-cached creatives keep serving past their TTL, which covers most of the existing demand. The auction server checks a per-creative health flag before returning a bid that points at an uncached creative URL. If origin has been down more than 5 minutes, that creative is excluded and the auction picks the next-best bid. New campaigns launching during an outage are delayed.

12.5 Flink spend aggregator crash

Flink restarts lose in-memory per-DSP spend state. Checkpoints to S3 every 30 seconds keep this from being a real problem on most restarts: the latest checkpoint comes back almost immediately. If the checkpoint is more than 5 minutes stale, the bootstrap path queries ClickHouse to rebuild the day's spend totals:

sql

SELECT dsp_id, sum(price_cpm)/1000 AS spend_usd
FROM impressions
WHERE timestamp >= today()
GROUP BY dsp_id

During the ~30-second bootstrap window, auction servers rely on the last DSP credit flags Valkey had. Some DSPs may overspend by a few dollars; the daily reconciliation catches and bills it.

12.6 Auction server OOM

A traffic spike (breaking news, a major sports event going viral) can drive QPS past pod capacity. Each pod enforces a hard max_concurrent_requests = 5,000 limit and returns 503 with Retry-After: 1 past that. HPA scales on CPU (target 60%) and on a custom concurrent_auctions_per_pod metric. SSPs retry with exponential backoff and route to other exchanges if the 503s persist. The pre-provisioned headroom (100 pods at 60% utilization) absorbs about a 67% spike without scaling at all.
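The hard concurrency limit and the headroom figure can both be sketched briefly. `AdmissionGate`, `admit` and `spike_headroom` are hypothetical names, and the gate is a plain lock-protected counter rather than whatever the auction server actually uses.

```python
import threading


class AdmissionGate:
    """Counts in-flight requests and refuses new ones past a hard limit."""

    def __init__(self, max_concurrent=5000):
        self.max_concurrent = max_concurrent
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_enter(self):
        with self._lock:
            if self._in_flight >= self.max_concurrent:
                return False
            self._in_flight += 1
            return True

    def leave(self):
        with self._lock:
            self._in_flight -= 1


def admit(gate):
    """Return None when admitted, or a (status, headers) rejection."""
    if gate.try_enter():
        return None
    return 503, {"Retry-After": "1"}


def spike_headroom(utilization):
    """Fractional traffic increase a fleet can absorb before reaching 100%."""
    return 1.0 / utilization - 1.0
```

With pods running at 60% utilization, `spike_headroom(0.60)` is about 0.67, which is where the 67% figure comes from.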

12.7 Shedding load when things back up

Traffic spikes, slow DSPs, and slow backends are all easier to handle by shedding low-value work early than by trying to serve everything and failing later. The shedding order is set up to drop the work that matters least first.

Step one is to drop auctions with floor prices under $0.50 CPM. They can't produce meaningful revenue and the latency budget is better spent elsewhere. Step two is to drop tier-3 publishers (the lowest-revenue-share contracts) before touching premium publishers. Step three is to turn off the 10% exploration bonus in DSP selection and only fan out to the top-N in pure rank order. Step four, only under sustained overload, is to reduce the fan-out from top-5 to top-3.
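One way to encode the four steps is as a ladder of cumulative plans indexed by overload level. The level numbering, `ShedPlan` and `should_shed` are illustrative names, and how the overload level itself is computed is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShedPlan:
    min_floor_cpm: float  # auctions with a floor below this are dropped
    drop_tier3: bool      # drop tier-3 publishers
    exploration: bool     # keep the exploration bonus in DSP selection
    fanout: int           # number of DSPs asked per auction


# Each level keeps the measures of the previous one and adds the next step.
LADDER = [
    ShedPlan(0.00, False, True, 5),   # level 0: normal operation
    ShedPlan(0.50, False, True, 5),   # step one: drop low-floor auctions
    ShedPlan(0.50, True, True, 5),    # step two: drop tier-3 publishers
    ShedPlan(0.50, True, False, 5),   # step three: exploration off
    ShedPlan(0.50, True, False, 3),   # step four: fan-out 5 -> 3
]


def plan_for(level):
    return LADDER[max(0, min(level, len(LADDER) - 1))]


def should_shed(level, floor_cpm, publisher_tier):
    """True if this auction should be dropped at the given overload level."""
    plan = plan_for(level)
    if floor_cpm < plan.min_floor_cpm:
        return True
    return plan.drop_tier3 and publisher_tier == 3
```

Making the levels cumulative means stepping back down the ladder restores work in the reverse order it was shed.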

The DSP timeouts adapt the same way. A worker checks p95 response time per DSP every 60 seconds and rebalances: DSPs averaging under 30ms get a 55ms timeout, ones over 50ms get 35ms. Slow DSPs that can't keep up get pushed out of the top-5 naturally as their win rate decays.
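The rebalancing rule is small enough to write down directly. This sketch assumes the 30ms and 50ms thresholds are compared against the p95 the worker samples, and that a DSP between the two thresholds simply keeps its current timeout; the text does not specify either point.

```python
def next_timeout_ms(p95_ms, current_ms):
    """Pick a DSP's timeout for the next 60-second interval."""
    if p95_ms < 30:
        return 55   # consistently fast: generous timeout
    if p95_ms > 50:
        return 35   # consistently slow: strict timeout
    return current_ms  # assumption: in-between DSPs are left unchanged
```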

The shedding response is always 204 No Content rather than 5xx. SSPs interpret 5xx as "the exchange is broken" and start routing away from it; 204 is "no fill, normal outcome" and the SSP just moves on.

13. Deployment

13.1 Multi-region layout

Region	Auction pods	Valkey	Kafka	ClickHouse	Purpose
us-east-1	40	6	10	12	Primary, North America
eu-west-1	30	6	10	12	Europe (GDPR strict mode)
ap-south-1	30	6	10	12	Asia-Pacific
Global (S3, CDN, PG)	—	—	—	—	Shared storage, config, creative CDN

Geo-DNS does latency-based routing per SSP request to the nearest region. Cross-region failover is DNS-based: when a regional health check goes red, Geo-DNS stops returning that region, and the 30-second record TTL bounds how long SSPs keep sending traffic to it. Postgres is one global primary in us-east-1 with read replicas in each region; config changes propagate via logical replication with about 200ms of lag, and auction servers always read from their local replica. Settlement and billing run only out of us-east-1, consuming all three regions' Kafka topics through MirrorMaker, so financial reports have a single source of truth.

13.2 Pipeline

Canary fail thresholds: auction p99 over 110ms, fill-rate drop over 2%, DSP timeout rate up by more than 5 percentage points, 5xx rate over 0.1%, RPM down more than 3%.
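A canary gate over these thresholds might look like the sketch below. It assumes the fill-rate and RPM thresholds are relative drops against the baseline fleet and that the 5xx and p99 thresholds are absolute; the metric key names are made up for the example.

```python
def relative_drop(baseline, current):
    """Fractional drop of current versus baseline (negative if it went up)."""
    return (baseline - current) / baseline if baseline > 0 else 0.0


def canary_failures(baseline, canary):
    """Return the names of every threshold the canary violates."""
    checks = [
        ("p99_ms", canary["p99_ms"] > 110),
        ("fill_rate", relative_drop(baseline["fill_rate"], canary["fill_rate"]) > 0.02),
        ("dsp_timeout_rate",
         canary["dsp_timeout_rate"] - baseline["dsp_timeout_rate"] > 0.05),
        ("http_5xx_rate", canary["http_5xx_rate"] > 0.001),
        ("rpm", relative_drop(baseline["rpm"], canary["rpm"]) > 0.03),
    ]
    return [name for name, failed in checks if failed]
```

An empty list means the rollout may proceed; anything else would trigger the rollback path in §13.3.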

13.3 Rollback

Component	Method	Time
Auction server code	k8s rolling update to previous image	< 5 min
DSP config	Revert in PG, publish Kafka config event	< 30 sec
Flink spend job	Redeploy previous JAR from S3 checkpoint	< 2 min
Tracking endpoint	k8s rolling update	< 3 min
ClickHouse schema	Forward-only, columns added backward-compatible	N/A
Publisher config	Revert via API	Immediate

14. Observability

14.1 Key metrics

Metric	Type	Alert threshold
`auction.qps`	Counter	< 700K (30% below baseline) or > 1.8M (spike)
`auction.latency.p50`	Histogram	> 50 ms
`auction.latency.p99`	Histogram	> 100 ms
`auction.fill_rate`	Gauge	< 25%
`auction.revenue_per_1k`	Gauge	> 10% drop from 1h MA
`dsp_selection.top_n_time`	Histogram	> 2 ms
`dsp.{id}.response_time.p99`	Histogram	> 55 ms
`dsp.{id}.nobid_rate`	Gauge	> 95%
`dsp.{id}.circuit_breaker`	Gauge	state = OPEN
`dsp.{id}.spend_today`	Gauge	> 90% of credit limit
`tracking.impression.qps`	Counter	< 250K
`tracking.dedup.rate`	Gauge	> 5%
`lru.hit_rate`	Gauge	< 90%
`valkey.ops_per_sec`	Counter	> 1M (capacity alarm)
`valkey.latency.p99`	Histogram	> 2 ms
`kafka.consumer_lag.impressions`	Gauge	> 100K events
`clickhouse.query.p99`	Histogram	> 10 s
`cdn.cache_hit_rate`	Gauge	< 90%
`settlement.discrepancy`	Gauge	> 0.01%
`ivt.blocked_rate`	Gauge	> 15%
`load_shed.rate`	Gauge	> 1% (load-shedding kicking in)

14.2 Dashboard

┌────────────────────────────────────────────────────────┐
│  Auction QPS         │  Latency (p50/p99)             │
│  1.05M [live]        │  42ms / 78ms                   │
├────────────────────────────────────────────────────────┤
│  Fill Rate           │  Revenue $/hour                 │
│  31.8% [24h]         │  $4.2M [24h]                    │
├────────────────────────────────────────────────────────┤
│  DSP Response Matrix (top 10)                          │
│  TTD:    32ms ok │ DV360: 38ms ok │ Amazon: 44ms ok   │
│  Criteo: 41ms ok │ Xandr: OPEN    │ Magnite: 28ms ok  │
├────────────────────────────────────────────────────────┤
│  DSP Spend & Credit                                    │
│  TTD:   $1.2M / $5M (24%) [healthy]                    │
│  DV360: $4.8M / $5M (96%) [approaching limit]          │
├────────────────────────────────────────────────────────┤
│  Tracking: imp 310K/s, click 3.1K/s, dedup 1.1%       │
│  LRU hit rate: 96.3% │ Load shed rate: 0.0%            │
└────────────────────────────────────────────────────────┘

14.3 Distributed tracing

Every auction carries a trace ID through the whole lifecycle. OTel spans:

Trace: auc_01HXYZ123 (47ms total)
├── LRU enrichment (0.1ms) [hit]
├── Pre-bid filter (1ms)
├── DSP selection (0.8ms) → [ttd, dv360, amazon, criteo, xandr]
├── DSP fan-out (44ms)
│   ├── ttd     req (32ms) ok bid $8.50
│   ├── dv360   req (38ms) ok bid $5.00
│   ├── amazon  req (44ms) ok bid $4.00
│   ├── criteo  req (41ms) ok bid $6.50
│   └── xandr   req (circuit_open)
├── Auction logic (0.5ms)
├── Macro sub + response (1ms)
├── Kafka publish (async, 2ms after response)
└── [later] Impression pixel received (t+112ms)

14.4 Alerting tiers

Tier	Trigger	Action
P0 (page now)	QPS drop > 50%, fill rate drop > 50%, all DSP circuits open	Page on-call + eng lead
P1 (page 15m)	p99 > 150 ms for 5 m, top-5 DSP circuit open, Kafka lag > 1M	Page on-call
P2 (Slack)	DSP credit > 90%, IVT rate > 15%, CDN hit < 85%, discrepancy > 0.005%	#exchange-ops
P3 (daily)	DSP no-bid rate shift > 10%, fill drop > 5%, creative rejection > 3%	Daily ops review

15. Security

15.1 Data classification

Data	Class	At Rest	In Transit
User IDs (exchange)	Pseudonymous PII	AES-256	TLS 1.3
IP addresses	PII	AES-256, hashed after 30d	TLS 1.3
Consent strings (TCF)	Regulated PII	AES-256	TLS 1.3
Auction bid data	Confidential	AES-256	TLS 1.3
Clearing prices (in markup)	Confidential	AES-256-GCM	TLS 1.3
Creative assets	Public	S3 SSE	TLS 1.3
Configurations	Internal	PG TDE	TLS 1.3
Billing records	Restricted	AES-256	TLS 1.3 + mTLS

15.2 Authentication and authorization

Actor	Auth	Scope
SSPs	mTLS + API key	Bid requests
DSPs	mTLS certificates	Receive bid requests, submit bids, receive win notices
Publishers	OAuth 2.0 + MFA	Config API, revenue dashboards
Internal services	mTLS	Service-to-service
Ops	SSO + MFA (Okta)	Dashboards, DSP config, incident response
Billing / Finance	SSO + MFA + role restriction	Settlement reports, payments

15.3 Price encryption

The clearing price embedded in the impression pixel is encrypted with AES-256-GCM. Without that, intermediaries (the publisher, the SSP, a browser extension) could read or forge the clearing price, which would let them reverse-engineer bid patterns or tamper with billing reports. The plaintext is the price plus a Unix timestamp; the timestamp lets the tracking endpoint reject any token older than 24 hours as a replay. Each token gets a fresh nonce, base64-encodes to about 40 characters, and is decrypted inside the impression-tracking endpoint before the impression event is written to Kafka.

15.4 Supply chain: ads.txt, sellers.json, schain

Domain spoofing is one of the older ad-tech frauds: a fraudster claims to sell nytimes.com inventory while actually owning a parked low-quality domain. Three standards close that loophole. ads.txt is published at the root of every legitimate publisher domain and lists which sellers (SSPs and exchanges) are authorized to sell that publisher's inventory; the exchange crawls these daily and refuses any request from a seller that isn't in the matching ads.txt. sellers.json is the exchange's own published list of every seller it accepts, which DSPs use to verify the exchange's claims. The schain object is attached to every bid request and lists every hop in the supply chain from publisher to exchange; any unauthorized node in the chain causes the request to be rejected.

The validation path is straightforward: cached ads.txt entries (refreshed daily) get keyed by publisher domain, and each incoming request is checked against the entries for the seller ID and a DIRECT or RESELLER relationship. A miss in the cache falls back to the configured policy, which is either strict (reject) or permissive (allow with a flag) depending on the publisher tier.
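The lookup can be sketched as a parser plus a set-membership check. The function names, the `allow` / `reject` / `allow_flagged` return values and the `.example` domains are illustrative; the parser handles only the basic `domain, account ID, relationship` records and skips comments and variable lines such as `contact=`.

```python
def parse_ads_txt(text):
    """Return a set of (ad_system_domain, seller_account_id, relationship)."""
    entries = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 3 or "=" in fields[0]:
            continue  # blank, malformed, or a variable line
        relationship = fields[2].upper()
        if relationship in ("DIRECT", "RESELLER"):
            entries.add((fields[0].lower(), fields[1], relationship))
    return entries


def validate_seller(cache, publisher_domain, ad_system, seller_id, policy="strict"):
    """cache maps publisher domain -> parsed entries from the daily crawl."""
    entries = cache.get(publisher_domain)
    if entries is None:
        # Cache miss: fall back to the publisher tier's configured policy.
        return "reject" if policy == "strict" else "allow_flagged"
    for relationship in ("DIRECT", "RESELLER"):
        if (ad_system, seller_id, relationship) in entries:
            return "allow"
    return "reject"
```

Keeping the parsed entries as a set makes the per-request check a constant-time lookup, which matters when it runs on every bid request.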

15.5 Network policy

yaml

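apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: auction-server-policy
spec:
  podSelector:
    matchLabels: { app: auction-server }
  policyTypes: [Ingress, Egress]
  ingress:
  - from: [{ namespaceSelector: { matchLabels: { name: load-balancer } } }]
    ports: [{ port: 8080 }]
  egress:
  - to: [{ namespaceSelector: { matchLabels: { name: data-plane } } }]
    ports: [{ port: 6379 }, { port: 9092 }]  # Valkey + Kafka
  # Cluster DNS (assumed to run in kube-system); without this rule DSP hostnames cannot be resolved.
  - to: [{ namespaceSelector: { matchLabels: { kubernetes.io/metadata.name: kube-system } } }]
    ports: [{ port: 53, protocol: UDP }, { port: 53, protocol: TCP }]
  - to: [{ ipBlock: { cidr: 0.0.0.0/0 } }]    # DSPs (external)
    ports: [{ port: 443 }]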

15.6 Audit logging

Every auction result, every DSP bid, every impression, and every billing event writes to Kafka with append-only semantics. Audit topics use min.insync.replicas=3 and acks=all, so a write is acknowledged only after three replicas have it and survives the loss of up to two of those brokers. The topics need a replication factor above 3 for producers to keep writing while one broker is down. The audit log archives to a separate AWS account with S3 Object Lock enabled, which means the data is write-once and can't be altered or deleted by anyone in the main account, including admins. Retention is 7 years for billing records and 2 years for bid-level logs.

Explore the Technologies

Dive deeper into the technologies and infrastructure patterns used in this design:

Core Technologies

Technology	Role in This Design	Learn More
Valkey	Cold-path enrichment, DSP credit state, fraud lists, creative dedup, circuit breakers	Redis/Valkey
Kafka	Impressions, clicks, winning bids, sampled losing bids, DSP config distribution	Kafka
ClickHouse	Real-time spend dashboards, billing aggregation, analytics	ClickHouse
PostgreSQL	DSP config, publisher settings, SSP registrations, settlement records	PostgreSQL
Flink	Per-DSP spend aggregation, rolling win-rate computation for smart selection	Flink

Infrastructure Patterns

Pattern	Relevance	Learn More
CDN and edge caching	Creative delivery via CloudFront with > 95% hit rate	CDN
Circuit breaker	Per-DSP isolation of slow or failing demand partners	Circuit Breakers
Message queues	Kafka as universal event bus, sampled writes	Message Queues
Load balancing	L4 LB across auction pods at 1M QPS	Load Balancing
Load shedding	Drop low-value auctions to preserve core path	Load Shedding


1. How One Ad Actually Gets Served

The companies involved

How the request actually flows

The night before

Segment's JavaScript on nike.com sees the cart-add and the eventual session timeout, and writes the user into the cart_abandoners_shoes_7d audience.
Every fifteen minutes Segment reverse-ETLs new audience members into The Trade Desk. By the time the user goes to bed, TTD already knows about them.
LiveRamp attaches a RampID to the user's hashed email so the laptop cookie and the iPhone IDFA resolve to the same identity the next morning.

Sunday morning, the page loads

The browser requests espn.com/nfl/cowboys-giants-recap. ESPN returns HTML and JavaScript.
Prebid.js initializes and identifies the 300×250 slot in the article body:

html

<article>Cowboys quarterback Dak Prescott threw for 301 yards...</article>
<div id="div-gpt-ad-midarticle"></div>
<article>In the second half, the Giants...</article>

<script>
pbjs.addAdUnits([{
  code: 'div-gpt-ad-midarticle',
  mediaTypes: { banner: { sizes: [[300, 250]] } },
  bids: [
    { bidder: 'magnite',  params: { accountId: 'espn-001' } },
    { bidder: 'pubmatic', params: { publisherId: 'espn-002' } },
    { bidder: 'ix',       params: { siteId: 'espn-003' } },
    { bidder: 'openx',    params: { unit: 'espn-004' } }
  ]
}]);
</script>

Prebid fans out to all four SSPs in parallel.
Each SSP enforces ESPN's $3 sports-vertical floor, attaches an schain object, and forwards into one or more exchanges. Magnite forwards into AdX.

Inside AdX, the auction

Enrichment: an in-process LRU lookup returns consent state, a cookie-sync map ({ttd: ttd_abc, dv360: dv_xyz, amzn: amzn_123}), and the user's RampID. Cache hit, no network.
Pre-bid fraud filter: IP reputation, ASN check, user-agent sanity. All clean.
Smart DSP selection: 50 DSPs are registered but only five get asked. AdX scores each one on geo, format, vertical, win rate, and capacity, and picks TTD, DV360, Amazon DSP, Criteo, and Xandr.
Fan-out: bid requests go out in parallel over HTTP/2 with a 60ms timeout.
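The selection step can be sketched as an eligibility filter followed by a top-N pick. The walkthrough names the inputs (geo, format, vertical, win rate, capacity) but not the formula, so the score below, win rate scaled by remaining QPS headroom, is purely a placeholder, and the `Dsp` fields are hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Dsp:
    name: str
    geos: frozenset
    formats: frozenset
    verticals: frozenset
    win_rate: float       # rolling historical win rate, 0..1
    qps_headroom: float   # fraction of the DSP's max QPS still unused, 0..1
    circuit_open: bool = False


def eligible(dsp, geo, fmt, vertical):
    return (not dsp.circuit_open
            and dsp.qps_headroom > 0
            and geo in dsp.geos
            and fmt in dsp.formats
            and vertical in dsp.verticals)


def score(dsp):
    # Placeholder formula: favour DSPs that win often and still have capacity.
    return dsp.win_rate * dsp.qps_headroom


def select_top_n(dsps, geo, fmt, vertical, n=5):
    candidates = [d for d in dsps if eligible(d, geo, fmt, vertical)]
    return heapq.nlargest(n, candidates, key=score)
```

`heapq.nlargest` avoids sorting all 50 DSPs when only five are needed, which helps keep selection inside its roughly 1ms budget.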

TTD's bid (the winner)

Audience match: user is in cart_abandoners_shoes_7d.
Campaign match: Pegasus 41 Retargeting is eligible.
Frequency cap: 0/3 today, allowed.
ML model: pCTR 8% (very high — fresh cart abandoner looking at the same product), pCVR 12%, expected value ~$1.25 per impression, maximum CPM well over $1,000.
Shading model: ESPN sports impressions usually clear around $8, so TTD bids $8.50.
Response carries the price plus ready-to-render HTML for the Pegasus banner with a Floodlight pixel embedded.
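The bidder's numbers can be reproduced with simple arithmetic, assuming the value of a conversion is the $130 cart and that shading means bidding a small margin above the expected clearing price. Both assumptions, and the function names, are illustrative; the walkthrough does not say how TTD's models actually combine these inputs.

```python
def expected_value_per_impression(pctr, pcvr, value_per_conversion):
    """P(click) x P(conversion | click) x value of a conversion, in dollars."""
    return pctr * pcvr * value_per_conversion


def max_cpm(value_per_impression):
    """CPM is the price per thousand impressions."""
    return value_per_impression * 1000


def shaded_bid(max_cpm_value, expected_clearing_cpm, margin_cpm=0.50):
    """Bid just above the usual clearing price, never above true value."""
    return min(max_cpm_value, expected_clearing_cpm + margin_cpm)


ev = expected_value_per_impression(0.08, 0.12, 130.0)  # about $1.25
ceiling = max_cpm(ev)                                   # about $1,248 CPM
bid = shaded_bid(ceiling, 8.00)                         # $8.50
```

The gap between the ceiling and the shaded bid is the surplus the DSP keeps in a first-price auction when it predicts the clearing price well.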

The other four DSPs

DV360: $5.00 (generic shoe retargeting).
Criteo: $6.50 (also tracks nike.com).
Amazon DSP: $4.00 (Nike sells on Amazon).
Xandr: 204 no-bid.

Resolution and render

AdX picks TTD at $8.50, substitutes the ${AUCTION_PRICE} macro with an encrypted price token, validates the creative against ESPN's blocked-categories list, and returns the markup to Magnite.
Prebid compares all four SSPs (PubMatic $7.00, Index $5.00, OpenX $3.50) and picks Magnite as the overall winner.
GAM checks direct deals: Ford has a homepage sponsorship that doesn't apply to NFL articles, Progressive's guaranteed deal is desktop-only. Prebid's $8.50 beats the line-item stack.
The browser fetches the banner from CloudFront's Austin POP (~12ms cache hit). IAS's IntersectionObserver beacon starts watching the slot.
Ad becomes visible roughly 110ms after the page started rendering.

After the impression

Exchange impression pixel, Floodlight pixel, and IAS viewability beacon all fire. IAS confirms 60% of pixels in view for over a second, counts it as viewable per MRC.
The user taps the banner two seconds later. The click hits the exchange's /t/click endpoint, gets logged to Kafka, and 302-redirects to nike.com/pegasus-41?utm_source=ttd&utm_medium=retargeting.
The cart is still there. The user buys.
The order-confirmation page fires the Floodlight conversion pixel. CM360 attributes the $130 purchase back to the TTD click on ESPN.

Where the $8.50 actually goes

Actor	Cut
Nike (advertiser, gross)	$8.50
The Trade Desk (DSP)	$1.00
LiveRamp (identity match)	$0.15
Google AdX (exchange)	$0.80
Magnite (SSP)	$1.15
IAS (verification)	$0.10
ESPN (publisher net)	$5.30

A few things people get wrong

The exchange is also not where campaigns live. Campaigns, budgets, creatives, frequency caps, pacing, ML bid optimization: all of that lives inside DSPs. The exchange only sees bids and no-bids.

With that out of the way, the rest of the post designs the exchange itself.

2. Problem Statement

Quick numbers to anchor the rest of the post:

Metric	Target
Ad requests per second (sustained)	1,000,000
Ad requests per second (peak)	2,000,000
Impressions per second (30% fill)	~300,000
Impressions per day	~25 billion
DSPs registered	50+
DSPs per auction (top-N)	5
Auction latency p99	< 100 ms
DSP response timeout	60 ms
Monthly spend through exchange	~$4 billion
Exchange take rate	10–15%

3. Functional Requirements

ID	Requirement	Priority
FR-01	Accept bid requests from SSPs via OpenRTB 2.6 and run first-price sealed-bid auctions in < 100 ms p99	P0
FR-02	Smart-select top 5 eligible DSPs per auction and fan out in parallel with 60 ms timeout	P0
FR-03	Enforce publisher floor prices and block-list (categories/advertisers) per publisher config	P0
FR-04	Return winning ad markup (HTML for banner, VAST XML for video) to the SSP within budget	P0
FR-05	Serve ad creatives through CDN edge nodes with cache headers for efficient delivery	P0
FR-06	Track impressions via server-side pixel (1×1 GIF) with deduplication	P0
FR-07	Track clicks via redirect URL with destination validation	P0
FR-08	Enforce exchange-level creative dedup (max N impressions of same creative per user per hour) as an ad-quality measure. Per-campaign frequency capping is a DSP responsibility.	P1
FR-09	Track per-DSP spend in near-real-time via Flink streaming for credit-limit enforcement and settlement	P0
FR-10	Validate supply chain: ads.txt, sellers.json, and schain object on every request	P0
FR-11	Check consent (TCF / US Privacy string) and strip PII from bid requests when required	P0
FR-12	Publish tracking events (impressions, clicks, viewability, auction results) to Kafka for billing and analytics	P0
FR-13	Reconcile exchange-tracked impressions with DSP-reported impressions daily; flag discrepancies > 0.01%	P1
FR-14	Provide publisher and DSP management APIs (floor prices, DSP onboarding, settlement reports)	P1
FR-15	Support banner, video (VAST 4.2), and native ad formats	P0
FR-16	Pre-bid fraud filtering: IP reputation, user-agent signature, datacenter detection, ASN reputation	P0

4. Non-Functional Requirements

Dimension	Target
Auction latency (p50)	< 50 ms
Auction latency (p99)	< 100 ms
Fill rate	> 30% (varies by publisher and market)
Availability	99.95% (4.4 hours/year planned + unplanned downtime)
Tracking pipeline loss	< 0.01% event loss end-to-end
Billing accuracy (reconciled)	±0.01% of DSP-reported impressions
CDN cache hit rate	> 95%
DSP connection pool warm starts	All DSPs kept warm via periodic health pings
Multi-region failover	< 60 seconds (DNS-based geo failover)
Deployment rollback	< 5 minutes for any component

5. High-Level Approach & Technology Selection

5.1 The full ecosystem

The walkthrough in §1 named most of the actors. The map below is the same cast in table form, useful as a reference when later sections refer to a specific role.

Layer	Role	Examples
Advertiser	Pays for ads. Sets goals, budgets, targeting.	Nike, P&G, a local dentist
Agency	Runs media buying on behalf of advertisers. Contracts with DSPs.	WPP/GroupM, Publicis, Omnicom
Campaign Manager	Stores creatives, flight dates, budget rules, attribution tags. Publishes campaigns to DSPs.	Google Campaign Manager 360 (CM360), Adobe Advertising
CDP	Captures first-party events from advertiser sites. Builds audiences. Syncs to DSPs.	Segment, mParticle, Treasure Data
Identity graph / DMP	Resolves user identity across devices and cookies. Provides stable cross-device IDs.	LiveRamp (RampID), Neustar Fabrick, ID5
DSP	Receives bid requests from exchanges. Runs bid optimization ML. Decides to bid and at what price. Owns campaign budgets and frequency caps.	The Trade Desk, DV360, Amazon DSP, Criteo, Xandr
Ad Exchange	This system. Runs the auction. Receives supply from SSPs, selects and fans out to DSPs, picks a winner.	Google AdX, OpenX, Index Exchange, PubMatic, Magnite
SSP	Packages publisher inventory. Enforces floor prices, brand safety rules. Forwards bid requests to exchanges and direct DSPs.	Magnite, PubMatic, Index Exchange, OpenX, Xandr Monetize
Header bidder	Client-side JavaScript that calls multiple SSPs in parallel from the browser, then picks the highest bid.	Prebid.js, Amazon TAM
Publisher ad server	Owns the ad slot. Decides between direct deals, guaranteed deals, and programmatic (Prebid) bids.	Google Ad Manager (GAM), Kevel, FreeWheel
Publisher	Owns the website or app. Gets paid per impression.	ESPN, CNN, NYT, mobile game developers
Verification	Measures viewability, brand safety, invalid traffic. Runs JavaScript beacons.	Integral Ad Science (IAS), DoubleVerify, MOAT
Attribution & analytics	Tracks conversions. Attributes them to impressions and clicks.	Google Analytics 4, Floodlight (CM360), Adjust (mobile), AppsFlyer (mobile)
CDN	Serves creative assets from edge POPs.	CloudFront, Fastly, Akamai, Cloudflare

The runtime relationships look like this:

5.2 First-price auctions

	Second-price (legacy)	First-price (current)
Winner pays	Second-highest + $0.01	Their own bid
DSP strategy	Bid truthfully	Bid shade (0.5–0.85 × true value)
Exchange complexity	Higher (track top-2 bids)	Lower (track max bid)
Transparency	Low (exchanges could manipulate)	High
2024+ adoption	Declining	Dominant
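The difference in what the winner pays can be shown with a small clearing function. Prices are CPM in integer cents to avoid floating-point rounding, ties are broken by seat name for determinism, and a lone bidder in the second-price case pays the floor; those conventions and the function name are assumptions for the example.

```python
def run_auction(bids_cents, floor_cents, second_price=False):
    """bids_cents maps seat -> CPM bid in cents.

    Returns (winning_seat, clearing_price_cents), or None if no bid meets the floor.
    """
    ranked = sorted(
        ((price, seat) for seat, price in bids_cents.items() if price >= floor_cents),
        reverse=True,
    )
    if not ranked:
        return None
    top_price, winner = ranked[0]
    if not second_price:
        return winner, top_price          # first-price: pay your own bid
    if len(ranked) == 1:
        return winner, floor_cents        # no runner-up: pay the floor
    runner_up = ranked[1][0]
    # Second-price: runner-up plus one cent, bounded by the floor and the top bid.
    return winner, min(top_price, max(floor_cents, runner_up + 1))
```

With the §1 bids (TTD $8.50, Criteo $6.50, DV360 $5.00, Amazon $4.00) and a $3.00 floor, first-price clears at $8.50 while second-price would have cleared at $6.51.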

5.3 OpenRTB 2.6

Object	Purpose	Key Fields
`BidRequest`	Top-level request from exchange to DSP	`id`, `imp[]`, `site`/`app`, `user`, `device`, `regs`, `tmax`
`Imp`	One ad slot	`id`, `banner`/`video`/`native`, `bidfloor`, `pmp`
`Site`/`App`	Publisher context	`domain`, `page`, `cat[]`, `publisher`
`User`	User targeting	`id`, `buyeruid`, `geo`, `data[]`, `consent`
`Device`	Device info	`ua`, `ip`, `geo`, `devicetype`, `os`
`BidResponse`	DSP's response	`id`, `seatbid[]`, `cur`
`Bid`	Individual bid	`id`, `impid`, `price`, `adm` (creative markup), `crid`, `adomain[]`
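To make the objects concrete, here is a minimal request builder and response reader using only fields from the table plus a few other standard OpenRTB fields (`at`, `banner.w`/`banner.h`, `seatbid[].seat`). The helper names and sample values are made up, and a real request carries far more than this.

```python
import json


def build_bid_request(auction_id, domain, page, floor_cpm, buyeruid, ua, ip, tmax_ms=60):
    """Assemble a single-impression banner BidRequest as a plain dict."""
    return {
        "id": auction_id,
        "at": 1,                       # auction type: 1 = first price
        "tmax": tmax_ms,               # milliseconds the DSP has to answer
        "imp": [{
            "id": "1",
            "banner": {"w": 300, "h": 250},
            "bidfloor": floor_cpm,     # CPM
        }],
        "site": {"domain": domain, "page": page},
        "user": {"buyeruid": buyeruid},  # the DSP's own ID from cookie sync
        "device": {"ua": ua, "ip": ip},
    }


def best_bid(bid_response):
    """Highest-priced bid across all seats as (seat, bid), or None for a no-bid."""
    best = None
    for seatbid in bid_response.get("seatbid", []):
        for bid in seatbid.get("bid", []):
            if best is None or bid["price"] > best[1]["price"]:
                best = (seatbid.get("seat"), bid)
    return best


request = build_bid_request("auc_1", "espn.com", "https://espn.com/nfl/recap",
                            3.0, "ttd_abc", "Mozilla/5.0", "203.0.113.7")
wire_bytes = json.dumps(request).encode("utf-8")
```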

5.4 The components inside the exchange

5.5 Storage

Store	Technology	Rationale
Enrichment hot-path cache	In-process LRU (ristretto)	5-second TTL. Serves > 95% of enrichment reads with zero network hops.
Enrichment cold-path	Valkey Cluster	Sub-ms reads on cache miss. Sharded by user ID. Background-synced from PostgreSQL + event streams.
Auction event log (sampled)	Kafka	Durable event stream. 100% of impressions and winning bids; 1% sample of losing bids.
Real-time analytics & billing	ClickHouse	Columnar analytics on billions of rows. Sub-second aggregation for dashboards.
Exchange configuration	PostgreSQL	DSP configs, publisher settings, SSP registrations, settlement records. Read replicas for auction servers.
Bid-level archive	S3 / Iceberg (Parquet)	Long-term storage of winning bids and sampled losses. For billing disputes and ML training.
Creative assets	S3 + CloudFront	Edge serving via CDN from an S3 origin, with > 95% cache hit rate.

5.6 Why Go

5.7 Why ClickHouse

6. High-Level Architecture

6.1 Multi-region bird's eye

6.2 Decisions worth flagging

CDN-first creative delivery: creatives live on S3 and get pushed to CloudFront. The auction server never touches creative bytes; it returns markup pointing at a CDN URL.

6.3 Auction flow, happy path

6.4 Auction flow, timeout

7. Back-of-the-Envelope Sizing

Every number here is rounded so you can redo the math in your head if you want.

7.1 Request volume

Sustained:  1,000,000 QPS
Peak:       2,000,000 QPS (US evening prime time)
Design for: 1,500,000 QPS with headroom

Per day:    1M × 86,400 ≈ 86 billion bid requests/day
Fill rate:  30%
Impressions: 86B × 0.30 ≈ 26 billion/day
            ≈ 300,000 impressions/sec

7.2 DSP fan-out

Naive (fan out to all 50 DSPs):  1M × 50 = 50M bid requests/sec
Top-5 smart selection:           1M × 5  =  5M bid requests/sec

Bid request:  ~1 KB (OpenRTB JSON, gzipped on the wire)
Bid response: ~0.5 KB

Outbound: 5M × 1 KB   = 5 GB/sec
Inbound:  5M × 0.5 KB = 2.5 GB/sec
Total:    ~7.5 GB/sec across all auction servers

7.3 Auction server sizing

Per-auction latency budget:
  LRU cache hit (95%):   0.1 ms
  Valkey cold (5%):      1.0 ms  (amortized 0.05 ms)
  Pre-bid filter:        1.0 ms
  DSP selection:         1.0 ms
  DSP fan-out (parallel, 60 ms timeout): 40 ms avg
  Auction logic:         0.5 ms
  Macro sub + response:  1.0 ms
  Total p50:             ~45 ms
  Total p99:             ~80 ms

Per pod (c6g.4xlarge, 16 vCPU, 32 GB RAM):
  Concurrent in-flight auctions: ~2,000
  QPS per pod:                   ~25,000

Pods needed:
  Sustained 1M / 25K = 40 pods
  Peak 2M / 25K      = 80 pods
  Deploy 100 pods across 3 regions (40 US + 30 EU + 30 APAC) with HPA to 2x
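The pod arithmetic can be checked in a few lines. The second function applies Little's law (concurrency equals arrival rate times time in system) as a sanity check on the in-flight figure; reading the ~2,000 in-flight auctions as the p99 case is an interpretation, not something the sizing states.

```python
import math


def pods_needed(total_qps, qps_per_pod):
    return math.ceil(total_qps / qps_per_pod)


def in_flight(qps_per_pod, latency_ms):
    """Little's law: concurrent requests = arrival rate x time in system."""
    return qps_per_pod * latency_ms / 1000


sustained = pods_needed(1_000_000, 25_000)   # 40 pods
peak = pods_needed(2_000_000, 25_000)        # 80 pods
at_p50 = in_flight(25_000, 45)               # 1,125 concurrent auctions
at_p99 = in_flight(25_000, 80)               # 2,000 concurrent auctions
```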

7.4 Cache and Valkey

Enrichment lookups per auction: 3 logical keys
  - user consent + cookie-sync (1 hash)
  - IP/UA fraud flags (1 set membership)
  - DSP credit-limit flags (1 hash)

At 1M QPS:
  LRU hits (95%):   2.85M logical lookups/sec in-process, zero network
  Valkey cold (5%): 150K ops/sec, trivial for a Valkey cluster

Valkey working set:
  Active users (30-day):     ~200 million
  Per-user entry:            ~150 bytes (consent + cookie sync)
  Total users:               200M × 150 = 30 GB
  Fraud lists (IPs + UAs):   ~1 GB
  DSP credit state:          negligible
  Creative dedup counters:   ~10 GB
  Total:                     ~41 GB

Valkey cluster: 3 primaries (16 GB each) + 3 replicas = 6 nodes per region.

7.5 Kafka (with sampling)

Event topics:
  impressions:      300K/sec × 500 bytes = 150 MB/sec
  clicks:             3K/sec × 300 bytes =   1 MB/sec
  viewability:      300K/sec × 200 bytes =  60 MB/sec
  winning_bids:     300K/sec × 800 bytes = 240 MB/sec
  losing_bids (1%):  50K/sec × 800 bytes =  40 MB/sec

  Total:            ~490 MB/sec
  × replication 3  = 1.5 GB/sec write throughput

Per day:
  490 MB/sec × 86,400 = 42 TB/day ingested
  ÷ 4 (zstd compression) ≈ 10 TB/day on disk
  Hot retention 3 days = 30 TB

Kafka cluster (per region): 10 brokers × 4 TB NVMe = 40 TB
  ~500 MB/sec ingest per region fits at <30% capacity.

The bid stream (winning bids plus the 1% sample of losing bids) gets teed directly to S3/Iceberg through Kafka Connect, so durable long-term storage doesn't sit on hot Kafka brokers.

7.6 ClickHouse

Ingest rate:
  impressions:  300K rows/sec
  clicks:         3K rows/sec
  winning_bids: 300K rows/sec (separate table)
  Total:       ~600K rows/sec

Row sizes (after compression):
  impression row: ~60 bytes compressed
  Per day:        26B × 60 = 1.5 TB/day compressed
  90-day retained: 135 TB (the most recent 30 days, ~45 TB, stay on NVMe)

Cluster (per region):
  4 shards × 3 replicas = 12 nodes
  Each: r6g.4xlarge, 4 TB NVMe
  Total: 48 TB per region
  TTL moves > 30-day data to S3 tiered storage.

7.7 CDN

300K impressions/sec × 200 KB avg creative = 60 GB/sec egress
CDN cache hit rate > 95% → origin pulls < 3 GB/sec
Daily egress: 60 GB/sec × 86,400 ≈ 5 PB/day

Unique creatives: ~500K (top 1% serve 60% of requests, heavy head and long tail)
Creative total storage on origin: 500K × 200 KB = 100 GB
CDN POP cache: ~10 GB hot working set per POP

7.8 Summary

Resource	Number
Auction server pods (global)	100
Valkey nodes (global, 3 × 6)	18
Kafka brokers (global, 3 × 10)	30
ClickHouse nodes (global, 3 × 12)	36
Outbound DSP bandwidth	~7.5 GB/sec
CDN egress	~60 GB/sec
Monthly AWS + CDN bill (rough)	$8–12M
Monthly revenue at 10% take rate	~$400M

8. Data Model

8.1 Auction state machine

8.2 Core tables (PostgreSQL)

The exchange stores publisher settings, DSP configs, SSP registrations, and billing records. Campaigns, budgets, creatives, and frequency caps don't appear here; those live in DSPs.

sql

CREATE TABLE dsp_configurations (
    id                   UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    dsp_name             VARCHAR(100) NOT NULL UNIQUE,
    bid_endpoint         TEXT NOT NULL,
    win_notice_endpoint  TEXT,
    max_qps              INT NOT NULL DEFAULT 100000,
    timeout_ms           INT NOT NULL DEFAULT 60,
    allowed_categories   TEXT[],
    allowed_geos         TEXT[],
    allowed_formats      TEXT[],
    seat_id              VARCHAR(50),
    circuit_breaker      JSONB NOT NULL DEFAULT '{"err_threshold": 0.5, "timeout_threshold": 0.3, "window_sec": 60, "cooldown_sec": 30}',
    historical_win_rate  DECIMAL(5,4) DEFAULT 0,
    enabled              BOOLEAN NOT NULL DEFAULT true,
    created_at           TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at           TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE publishers (
    id                    UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name                  VARCHAR(255) NOT NULL,
    domain                VARCHAR(255) NOT NULL UNIQUE,
    ssp_id                UUID REFERENCES ssp_configurations(id),
    floor_price_cents     INT NOT NULL DEFAULT 50,
    blocked_categories    TEXT[],
    blocked_advertisers   TEXT[],
    ads_txt_verified      BOOLEAN NOT NULL DEFAULT false,
    revenue_share_pct     DECIMAL(5,2) NOT NULL DEFAULT 85.00,
    created_at            TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE billing_settlements (
    id                        UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    settlement_date           DATE NOT NULL,
    dsp_id                    UUID NOT NULL REFERENCES dsp_configurations(id),
    publisher_id              UUID REFERENCES publishers(id),
    impressions               BIGINT NOT NULL,
    clicks                    BIGINT NOT NULL,
    gross_spend_cents         BIGINT NOT NULL,
    exchange_fee_cents        BIGINT NOT NULL,
    publisher_payout_cents    BIGINT NOT NULL,
    dsp_reported_impressions  BIGINT,
    discrepancy_pct           DECIMAL(5,4),
    status                    VARCHAR(20) NOT NULL DEFAULT 'PENDING',
    created_at                TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_settlements_dsp ON billing_settlements(dsp_id, settlement_date);
CREATE INDEX idx_settlements_pub ON billing_settlements(publisher_id, settlement_date);

ssp_configurations and creative_audit_log follow the same pattern.
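
To make the `circuit_breaker` column in `dsp_configurations` concrete, here is a minimal sketch of a sliding-window breaker driven by those four settings. It is an illustration, not the exchange's actual implementation: the class name, the `MIN_SAMPLES` guard, and the simplified "close after cooldown" behaviour are all assumptions, and the clock is passed in as `now` so the logic is easy to test.

```python
import json
from collections import deque

# Default taken from the dsp_configurations.circuit_breaker column.
DEFAULT_CONFIG = json.loads(
    '{"err_threshold": 0.5, "timeout_threshold": 0.3, "window_sec": 60, "cooldown_sec": 30}'
)


class CircuitBreaker:
    """Sliding-window breaker for one DSP (hypothetical helper)."""

    MIN_SAMPLES = 20  # assumption: do not trip on a handful of requests

    def __init__(self, config=DEFAULT_CONFIG):
        self.cfg = config
        self.events = deque()   # (timestamp, outcome), outcome in {"ok", "error", "timeout"}
        self.opened_at = None   # None means the circuit is closed

    def record(self, outcome, now):
        """Record one request outcome and open the circuit if a threshold is hit."""
        self.events.append((now, outcome))
        self._evict(now)
        n = len(self.events)
        if n < self.MIN_SAMPLES:
            return
        errors = sum(1 for _, o in self.events if o == "error")
        timeouts = sum(1 for _, o in self.events if o == "timeout")
        if (errors / n >= self.cfg["err_threshold"]
                or timeouts / n >= self.cfg["timeout_threshold"]):
            self.opened_at = now

    def allows_request(self, now):
        """True if the DSP may be included in the fan-out at time `now`."""
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.cfg["cooldown_sec"]:
            # Cooldown elapsed: close the circuit and start from a clean window.
            self.opened_at = None
            self.events.clear()
            return True
        return False

    def _evict(self, now):
        """Drop outcomes that fell out of the sliding window."""
        cutoff = now - self.cfg["window_sec"]
        while self.events and self.events[0][0] < cutoff:
            self.events.popleft()


breaker = CircuitBreaker()
for t in range(13):
    breaker.record("ok", now=float(t))
for t in range(13, 20):
    breaker.record("timeout", now=float(t))   # 7 of 20 = 35% timeouts
```

After the loop the breaker is open (35% timeouts exceeds the 30% threshold), so `breaker.allows_request(now=20.0)` is false and the DSP is skipped until the 30-second cooldown has passed. A production breaker would usually add a half-open state that lets a few probe requests through before closing fully.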

8.3 Event schemas (Kafka → ClickHouse)

The impressions table is the billing source of truth.

sql

CREATE TABLE impressions (
    impression_id       String,
    auction_id          String,
    timestamp           DateTime64(3),
    dsp_id              String,
    publisher_id        String,
    publisher_domain    String,
    creative_id         String,
    advertiser_domain   String,
    price_cpm           Float64,
    user_id             Nullable(String),
    device_type         Enum8('desktop'=1, 'mobile'=2, 'tablet'=3, 'ctv'=4),
    geo_country         LowCardinality(String),
    geo_region          LowCardinality(String),
    viewable            Nullable(UInt8),
    viewability_pct     Nullable(Float32),
    time_in_view_ms     Nullable(UInt32),
    is_click            UInt8 DEFAULT 0,
    click_timestamp     Nullable(DateTime64(3))
) ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY (dsp_id, publisher_id, timestamp)
TTL timestamp + INTERVAL 90 DAY;

The auction_results table follows the same shape with winning_price_cpm, num_bids_received, num_dsps_selected, and num_dsps_timeout. Writes are 100% for winners and 1% for losers.
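
One way to implement the 100% / 1% write split is deterministic sampling keyed on the auction ID, sketched below. The function name and the choice of SHA-256 are assumptions made for illustration; the only figure taken from the text is the 1% rate.

```python
import hashlib

LOSER_SAMPLE_RATE = 0.01   # 1% of losing bids, per the text


def should_log_bid(auction_id: str, is_winner: bool,
                   rate: float = LOSER_SAMPLE_RATE) -> bool:
    """Keep every winner; keep a deterministic `rate` fraction of losers."""
    if is_winner:
        return True
    digest = hashlib.sha256(auction_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a uniform value in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate


kept = sum(should_log_bid(f"auc_{i}", is_winner=False) for i in range(100_000))
print(f"kept {kept} of 100000 losing bids")
```

Hashing the auction ID rather than drawing a random number means every auction server makes the same decision for the same auction, so a sampled auction keeps all of its losing bids and the full bid landscape can be reconstructed for that auction.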

8.4 Valkey keyspace

Key	Structure	TTL	Purpose
`user:{uid}:consent`	Hash	30d	TCF string + consent status
`user:{uid}:cookiesync`	Hash	30d	Exchange UID ↔ DSP buyer UIDs
`user:{uid}:creative:{crid}`	Counter	1h	Exchange-level creative dedup
`dsp:{dsp_id}:spend_today`	Hash	until midnight	Spend, credit limit, credit remaining
`dsp:{dsp_id}:circuit`	Hash	5m	Circuit breaker state
`dsp:{dsp_id}:winrate:{vertical}`	Float	1h	Rolling win rate feeding smart selection
`ivt:ip_blocklist`	Set	1h	Known bot IPs
`ivt:asn_reputation`	Hash	1h	ASN reputation scores (datacenter, residential)
`ivt:ua_patterns`	Set	1h	Suspicious UA regex hits
`pub:{pub_id}:config`	Hash	5m	Publisher config cache
`pub:{domain}:adstxt`	Hash	24h	Cached ads.txt entries

Per-campaign frequency caps, campaign budgets, and advertiser targeting rules don't appear here. Those live in DSPs.

9. API Design

9.1 Bid request (exchange to DSP)

POST /openrtb/2.6/bid
Content-Type: application/json
X-OpenRTB-Version: 2.6

json

{
  "id": "auc_01HXYZ123",
  "imp": [{
    "id": "imp_001",
    "banner": {"w": 300, "h": 250, "pos": 1},
    "bidfloor": 0.50,
    "bidfloorcur": "USD"
  }],
  "site": {
    "domain": "espn.com",
    "page": "https://espn.com/nfl/story/cowboys-giants-recap",
    "cat": ["IAB17"],
    "publisher": {"id": "pub_espn", "domain": "espn.com"}
  },
  "user": {
    "id": "uid_user_abc",
    "buyeruid": "ttd_user_xyz",
    "geo": {"country": "USA", "region": "TX", "city": "Austin"},
    "consent": "CPXxRfAPXxRfAAfKAB..."
  },
  "device": {
    "ua": "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X)...",
    "ip": "198.51.100.42",
    "devicetype": 4,
    "os": "iOS"
  },
  "regs": {"coppa": 0, "gdpr": 0},
  "tmax": 60,
  "at": 1,
  "cur": ["USD"],
  "source": {
    "ext": {
      "schain": {
        "ver": "1.0",
        "complete": 1,
        "nodes": [{"asi": "adx.google.com", "sid": "pub_espn", "hp": 1}]
      }
    }
  }
}

DSP bid response (200 OK):

json

{
  "id": "auc_01HXYZ123",
  "seatbid": [{
    "bid": [{
      "id": "bid_ttd_001",
      "impid": "imp_001",
      "price": 8.50,
      "adm": "<div id='ad-${AUCTION_ID}'><a href='${CLICK_URL}https://nike.com/pegasus-41'><img src='https://cdn.nike.com/cr/pegasus_300x250.jpg' width='300' height='250'/></a><img src='${IMPRESSION_URL}' style='display:none'/></div>",
      "crid": "cr_nike_pegasus_01",
      "w": 300,
      "h": 250,
      "adomain": ["nike.com"],
      "cat": ["IAB18"]
    }],
    "seat": "seat_nike"
  }],
  "cur": "USD"
}

DSP no-bid: 204 No Content.
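
Before a bid enters the auction, the exchange has to check it against the request it answers. The sketch below shows one plausible set of checks using the two JSON payloads above, trimmed to the fields that matter; the function name and the exact rule set are assumptions, not a statement of what the exchange enforces.

```python
def eligible_bids(request: dict, response: dict, blocked_adomains=()) -> list:
    """Return the bids in `response` that may enter the auction for `request`."""
    if response.get("id") != request["id"]:
        return []                       # response belongs to another auction
    if response.get("cur", "USD") not in request.get("cur", ["USD"]):
        return []                       # currency not accepted by the request
    blocked = set(blocked_adomains)
    imps = {imp["id"]: imp for imp in request["imp"]}
    accepted = []
    for seatbid in response.get("seatbid", []):
        for bid in seatbid.get("bid", []):
            imp = imps.get(bid.get("impid"))
            if imp is None:
                continue                # bid references an unknown impression
            if bid.get("price", 0) < imp.get("bidfloor", 0):
                continue                # below the publisher floor
            if blocked.intersection(bid.get("adomain", [])):
                continue                # advertiser blocked by the publisher
            banner = imp.get("banner")
            if banner and (bid.get("w"), bid.get("h")) != (banner["w"], banner["h"]):
                continue                # creative size does not match the slot
            accepted.append(bid)
    return accepted


request = {
    "id": "auc_01HXYZ123",
    "imp": [{"id": "imp_001", "banner": {"w": 300, "h": 250, "pos": 1}, "bidfloor": 0.50}],
    "cur": ["USD"],
}
response = {
    "id": "auc_01HXYZ123",
    "seatbid": [{
        "bid": [{"id": "bid_ttd_001", "impid": "imp_001", "price": 8.50,
                 "w": 300, "h": 250, "adomain": ["nike.com"]}],
        "seat": "seat_nike",
    }],
    "cur": "USD",
}

print([b["id"] for b in eligible_bids(request, response)])
print(eligible_bids(request, response, blocked_adomains=["nike.com"]))
```

The same function returns an empty list when the publisher blocks `nike.com`, which is how a `blocked_advertisers` entry would take effect at auction time.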

9.2 Win notice (exchange to DSP)

POST /win
{
  "auction_id": "auc_01HXYZ123",
  "bid_id": "bid_ttd_001",
  "imp_id": "imp_001",
  "price": 8.50,
  "currency": "USD",
  "timestamp": "2026-05-29T14:32:00.045Z"
}

9.3 Impression tracking

GET /t/imp?auc=auc_01HXYZ123&imp=imp_001&price=enc_xyz&dsp=ttd&pub=pub_espn&crid=cr_nike_pegasus_01
→ 200 OK, Content-Type: image/gif, 43-byte transparent GIF

The endpoint parses query params, decrypts the price token, writes to Valkey for dedup, produces to Kafka asynchronously, and returns the pixel. Target p99 under 5ms.

9.4 Click redirect

GET /t/click?auc=auc_01HXYZ123&imp=imp_001&dest=https%3A%2F%2Fnike.com%2Fpegasus-41
→ 302 Found
Location: https://nike.com/pegasus-41

Destination URLs are validated against a whitelist pattern (allowed schemes, no open-redirect loops) before the 302 is emitted.
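
A minimal sketch of that destination check, using only the standard library. The tracking host `t.exchange.example` and the function name are hypothetical, and the rules shown (scheme allowlist, no userinfo, no redirect back into the tracker) are one reasonable reading of the sentence above rather than the full production rule set.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}
EXCHANGE_HOSTS = {"t.exchange.example"}   # hypothetical tracking host


def is_safe_destination(dest: str) -> bool:
    """Decide whether the click endpoint may redirect to `dest`."""
    try:
        parts = urlsplit(dest)
    except ValueError:
        return False                       # malformed URL
    if parts.scheme not in ALLOWED_SCHEMES:
        return False                       # blocks javascript:, data:, scheme-relative URLs
    if not parts.hostname:
        return False
    if parts.username or parts.password:
        return False                       # userinfo tricks such as https://nike.com@evil.example
    if parts.hostname in EXCHANGE_HOSTS:
        return False                       # would bounce back into the click tracker
    return True


for url in [
    "https://nike.com/pegasus-41",
    "javascript:alert(1)",
    "//evil.example/landing",
    "https://nike.com@evil.example/",
    "https://t.exchange.example/t/click?dest=https%3A%2F%2Fnike.com",
]:
    print(is_safe_destination(url), url)
```

A stricter variant would also compare the destination host against the `adomain` declared in the winning bid, so a creative cannot redirect to a domain the DSP never declared.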

9.5 Publisher config API

PUT /v1/publishers/{publisher_id}/config
{
  "floor_price_cents": 75,
  "blocked_categories": ["IAB25", "IAB26"],
  "blocked_advertiser_domains": ["competitor.com"],
  "revenue_share_pct": 85.00
}

9.6 DSP onboarding

POST /v1/dsps
{
  "dsp_name": "Example DSP",
  "bid_endpoint": "https://dsp.example.com/bid",
  "win_notice_endpoint": "https://dsp.example.com/win",
  "max_qps": 100000,
  "timeout_ms": 60,
  "allowed_categories": ["IAB1", "IAB17"],
  "allowed_geos": ["US", "CA"],
  "seat_id": "seat_example_001"
}
→ 201 Created { "id": "...", "status": "SANDBOX", "mtls_cert_url": "..." }

9.7 Settlement report

GET /v1/settlements?dsp_id=ttd&start=2026-06-01&end=2026-06-07
{
  "dsp_id": "ttd",
  "period": {"start": "2026-06-01", "end": "2026-06-07"},
  "totals": {
    "impressions": 1750000000,
    "clicks": 17500000,
    "gross_spend_cents": 14000000000,
    "exchange_fee_cents": 1400000000,
    "publisher_payout_cents": 12600000000,
    "dsp_reported_impressions": 1749640000,
    "discrepancy_pct": 0.0002
  }
}
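
The derived fields in the report can be recomputed from the raw totals. The sketch below does so with integer cents and a take rate in basis points; the function name and the basis-point parameter are assumptions, and the 10% take rate is inferred from the example figures above.

```python
def settle(gross_spend_cents: int, exchange_impressions: int,
           dsp_reported_impressions: int, take_rate_bps: int = 1000) -> dict:
    """Derive fee, payout, and impression discrepancy for one settlement row."""
    fee = gross_spend_cents * take_rate_bps // 10_000    # integer cents, no floats
    payout = gross_spend_cents - fee
    if exchange_impressions == 0:
        discrepancy = 0.0
    else:
        discrepancy = (abs(exchange_impressions - dsp_reported_impressions)
                       / exchange_impressions)
    return {
        "exchange_fee_cents": fee,
        "publisher_payout_cents": payout,
        "discrepancy_pct": round(discrepancy, 4),
    }


row = settle(
    gross_spend_cents=14_000_000_000,
    exchange_impressions=1_750_000_000,
    dsp_reported_impressions=1_749_640_000,
)
print(row)
```

With these inputs the function reproduces the fee, payout, and `discrepancy_pct` values in the response. Note that 0.0002 is a fraction of impressions, which is about 0.02% when expressed as a percentage.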

10. Deep Dives

10.1 RTB auction flow, end to end

Step	Component	Time	Running
1	ESPN page HTML loads, Prebid.js executes	20 ms	20 ms
2	Prebid → Magnite SSP bid request	3 ms	23 ms
3	Magnite → AdX network hop	3 ms	26 ms
4	LRU cache hit: consent, cookie-sync, DSP flags	0.1 ms	26.1 ms
5	Pre-bid fraud filter + ads.txt validate	1 ms	27.1 ms
6	Smart DSP selection (top 5 of 50)	1 ms	28.1 ms
7	DSP fan-out parallel (60 ms timeout, arrives ~40 ms)	40 ms	68.1 ms
8	First-price auction + floor + macro sub	1 ms	69.1 ms
9	AdX → Magnite network hop	3 ms	72.1 ms
10	Magnite → Prebid (selected as winner across SSPs)	3 ms	75.1 ms
11	GAM direct-deal check + render decision	5 ms	80.1 ms
12	CloudFront creative fetch (Austin POP cache hit)	12 ms	92.1 ms
13	Browser renders 300×250	15 ms	107.1 ms
14	Impression pixel fires (async)	2 ms	109.1 ms

The entire page-load-to-ad-visible budget is around 110ms in this example. The exchange's portion is 43ms; the rest is network transit, SSP coordination, GAM decisioning, and browser rendering.
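
The split between the exchange's share and everything else can be checked by summing the table. In the sketch below, the step labels are shortened and the boolean flag marking which steps run inside the exchange is my reading of the table (steps 4 through 8).

```python
STEPS = [
    # (step, milliseconds, runs inside the exchange?)
    ("Page load + Prebid.js",           20.0, False),
    ("Prebid -> SSP",                    3.0, False),
    ("SSP -> exchange",                  3.0, False),
    ("LRU enrichment",                   0.1, True),
    ("Pre-bid fraud filter + ads.txt",   1.0, True),
    ("Smart DSP selection",              1.0, True),
    ("DSP fan-out",                     40.0, True),
    ("Auction + floor + macro sub",      1.0, True),
    ("Exchange -> SSP",                  3.0, False),
    ("SSP -> Prebid",                    3.0, False),
    ("GAM decision",                     5.0, False),
    ("CDN creative fetch",              12.0, False),
    ("Browser render",                  15.0, False),
    ("Impression pixel (async)",         2.0, False),
]

total_ms = round(sum(ms for _, ms, _ in STEPS), 1)
exchange_ms = round(sum(ms for _, ms, inside in STEPS if inside), 1)
fanout_share = 40.0 / exchange_ms

print(f"End to end:        {total_ms} ms")
print(f"Inside exchange:   {exchange_ms} ms")
print(f"DSP wait share:    {fanout_share:.0%} of exchange time")
```

Almost all of the exchange's own time is spent waiting on DSP responses, which is why the fan-out timeout and DSP selection dominate the latency discussion rather than the auction logic itself.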

The Go fan-out code:

func (e *Exchange) runAuction(ctx context.Context, req *openrtb.BidRequest) (*AuctionResult, error) {
    ctx, cancel := context.WithTimeout(ctx, 60*time.Millisecond)
    defer cancel()

    // Top-5 smart selection (see §10.2)
    dsps := e.dspSelector.SelectTopN(req, 5)

    bidChan := make(chan *DSPBid, len(dsps))
    for _, dsp := range dsps {
        go func(d *DSPConfig) {
            bid, err := e.sendBidRequest(ctx, d, req)
            if err != nil {
                e.metrics.DSPError(d.ID, err)
                bidChan <- nil
                return
            }
            bidChan <- bid
        }(dsp)
    }

    var bids []*DSPBid
    received := 0
Loop:
    for received < len(dsps) {
        select {
        case bid := <-bidChan:
            received++
            if bid != nil && bid.Price > 0 {
                bids = append(bids, bid)
            }
        case <-ctx.Done():
            e.metrics.AuctionTimeout(len(bids), len(dsps)-received)
            break Loop
        }
    }

    if len(bids) == 0 {
        return &AuctionResult{Filled: false}, nil
    }
    return e.firstPriceAuction(bids, req.Imp[0].BidFloor), nil
}

func (e *Exchange) firstPriceAuction(bids []*DSPBid, floor float64) *AuctionResult {
    var winner *DSPBid
    for _, b := range bids {
        if b.Price < floor {
            continue
        }
        if winner == nil || b.Price > winner.Price {
            winner = b
        }
    }
    if winner == nil {
        return &AuctionResult{Filled: false}
    }
    return &AuctionResult{Filled: true, Winner: winner, Price: winner.Price}
}

10.2 Smart DSP selection

type DSPScore struct {
    DSPID string
    Score float64
}

func (s *DSPSelector) SelectTopN(req *BidRequest, n int) []*DSPConfig {
    geo := req.User.Geo.Country
    format := req.Imp[0].Format()
    vertical := req.Site.Cat[0]

    var candidates []DSPScore
    for _, dsp := range s.registry.All() {
        // Hard filters
        if !dsp.AcceptsGeo(geo) { continue }
        if !dsp.AcceptsFormat(format) { continue }
        if !dsp.AcceptsCategory(vertical) { continue }
        if !dsp.CapacityAvailable() { continue }
        if dsp.CircuitBreakerOpen() { continue }

        // Continuous score
        winRate := dsp.HistoricalWinRate(geo, vertical, format)
        pacing := dsp.PacingFactor()

        candidates = append(candidates, DSPScore{
            DSPID: dsp.ID,
            Score: winRate * pacing,
        })
    }

    sort.Slice(candidates, func(i, j int) bool {
        return candidates[i].Score > candidates[j].Score
    })
    if len(candidates) > n {
        candidates = candidates[:n]
    }

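    // 10% exploration: occasionally swap the lowest-ranked slot for a non-top DSP
    // to discover new demand. Only swap when all n slots are filled; with fewer
    // candidates, candidates[n-1] would index past the end of the slice and panic.
    if n > 0 && len(candidates) == n && len(s.registry.All()) > n && rand.Float64() < 0.10 {
        explorer := s.registry.RandomExploration(candidates)
        if explorer != nil {
            candidates[n-1] = DSPScore{DSPID: explorer.ID, Score: 0}
        }
    }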

    return s.registry.Resolve(candidates)
}

The bandwidth saving from this single change:

Strategy	DSP requests/sec	Outbound bandwidth	DSPs touched
All 50 (naive)	50M	50 GB/sec	50 per auction
Top 5 (smart)	5M	5 GB/sec	5 per auction
Fill-rate delta	—	—	< 2% drop

A 10× cut in DSP-side cost and bandwidth, against a fill-rate hit small enough to be in the noise of normal day-to-day variation.

10.3 Auction types

First-price auctions are what the industry runs now. The winner pays their bid, the math is trivial, and there's nothing for the exchange to manipulate.
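
To show why the first-price math is called trivial, the sketch below computes the clearing price under both rules for the same set of bids. The function name is illustrative; the `auction_type` values follow the OpenRTB `at` field (1 = first price, 2 = second price plus), and the one-cent increment for second price is a common convention assumed here.

```python
def clearing_price(bids, floor, auction_type=1):
    """Return the price the winner pays (CPM), or None if no bid clears the floor."""
    eligible = sorted((b for b in bids if b >= floor), reverse=True)
    if not eligible:
        return None
    if auction_type == 1:
        return eligible[0]                      # first price: winner pays its own bid
    # Second price plus: pay one cent above the runner-up (or the floor if alone),
    # never more than the winner's own bid.
    runner_up = eligible[1] if len(eligible) > 1 else floor
    return round(min(eligible[0], runner_up + 0.01), 2)


bids = [8.50, 5.00, 4.00, 6.50]
first = clearing_price(bids, floor=0.50, auction_type=1)
second = clearing_price(bids, floor=0.50, auction_type=2)
print(first, second)
```

Under first price the winner's payment depends only on its own bid, so the exchange has no runner-up value it could adjust. That is also why DSPs shade their bids, as covered in the next subsection.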

10.4 DSP bid optimization

This is opaque to the exchange but worth understanding because DSP behavior is what determines fill rate and per-auction latency. A typical DSP bidder runs through something like:

python

from typing import Optional


class BidOptimizer:
    def compute_bid(self, req: BidRequest, campaign: Campaign) -> Optional[float]:
        """Return a CPM bid in dollars, or None for a no-bid."""
        features = self.extract_features(req, campaign)

        # ML: LightGBM or deep learning trained on historical data
        pctr = self.ctr_model.predict(features)
        pcvr = self.cvr_model.predict(features)

        # Expected value of this single impression, in dollars
        if campaign.bid_strategy == "CPA":
            expected_value = pctr * pcvr * campaign.target_cpa
        elif campaign.bid_strategy == "CPC":
            expected_value = pctr * campaign.max_cpc
        else:  # CPM
            expected_value = campaign.max_cpm / 1000

        # Bid shading for first-price auctions
        shading = self.shading_model.predict(features)  # 0.5-0.85
        cost_per_imp = expected_value * shading

        # OpenRTB prices and floors are quoted as CPM, so convert the
        # per-impression value before comparing against the floor or returning it.
        bid_cpm = cost_per_imp * 1000

        # Internal checks invisible to the exchange
        if not self.budget_allows(campaign, cost_per_imp): return None
        if not self.frequency_allows(req.user_id, campaign.id): return None
        if bid_cpm < req.imp[0].bidfloor: return None

        return bid_cpm

10.5 Ad server and creative delivery

Macro	Replaced With
`${AUCTION_ID}`	Auction identifier
`${AUCTION_PRICE}`	Encrypted price token (AES-256-GCM)
`${CLICK_URL}`	Exchange click-tracking URL
`${IMPRESSION_URL}`	Exchange impression-pixel URL
`${CACHE_BUSTER}`	Random number to defeat caching on tracking pixels
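
A minimal sketch of the substitution step, applied to a shortened version of the `adm` from §9.1. The tracking host `t.exchange.example`, the helper name, and the query-string layout are assumptions for illustration; the price token is treated as an opaque string produced elsewhere.

```python
import re
import secrets

MACRO = re.compile(r"\$\{([A-Z_]+)\}")


def substitute_macros(adm: str, values: dict) -> str:
    """Replace ${NAME} macros in a single pass; unknown macros are left untouched."""
    return MACRO.sub(lambda m: values.get(m.group(1), m.group(0)), adm)


TRACKER = "https://t.exchange.example"   # hypothetical tracking host

values = {
    "AUCTION_ID": "auc_01HXYZ123",
    "AUCTION_PRICE": "enc_xyz",          # opaque encrypted token
    "CLICK_URL": f"{TRACKER}/t/click?auc=auc_01HXYZ123&imp=imp_001&dest=",
    "IMPRESSION_URL": f"{TRACKER}/t/imp?auc=auc_01HXYZ123&imp=imp_001&price=enc_xyz",
    "CACHE_BUSTER": str(secrets.randbelow(10**9)),
}

adm = (
    "<div id='ad-${AUCTION_ID}'>"
    "<a href='${CLICK_URL}https://nike.com/pegasus-41'>...</a>"
    "<img src='${IMPRESSION_URL}' style='display:none'/></div>"
)
markup = substitute_macros(adm, values)
print(markup)
```

Doing the replacement in one regex pass, rather than with repeated `str.replace` calls, means a substituted value that happens to contain `${...}` is never expanded a second time.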

10.6 Per-DSP spend tracking and settlement

python

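from datetime import datetime, timezone


class DSPSpendAggregator:
    def process_impression(self, dsp_id: str, imp: ImpressionEvent):
        state = self.get_state(dsp_id)
        # One impression costs price_cpm / 1000 dollars, usually a fraction of a
        # cent. Accumulate in micro-dollars (spend_today_micros is an added state
        # field) so sub-cent prices are not truncated to zero, then derive cents.
        state.spend_today_micros += round(imp.price_cpm * 1000)
        state.spend_today_cents = state.spend_today_micros // 10_000
        state.impressions_today += 1

        self.valkey.hset(
            f"dsp:{dsp_id}:spend_today",
            mapping={
                "spend_cents":      state.spend_today_cents,
                "impressions":      state.impressions_today,
                "credit_limit":     state.credit_limit_cents,
                "credit_remaining": state.credit_limit_cents - state.spend_today_cents,
                "last_update":      datetime.now(timezone.utc).isoformat(),
            }
        )

        # Block further bidding once the daily credit limit is reached
        if state.spend_today_cents >= state.credit_limit_cents:
            self.valkey.set(f"dsp:{dsp_id}:credit_blocked", "1", ex=3600)
            self.alert(f"DSP {dsp_id} exceeded credit limit")

        # Flag spend running more than 50% above the expected daily figure
        if state.spend_today_cents > state.expected_daily * 1.5:
            self.alert(f"DSP {dsp_id} spend anomaly")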

10.7 Impression and click tracking

func (t *TrackingService) handleImpression(w http.ResponseWriter, r *http.Request) {
    auctionID := r.URL.Query().Get("auc")
    impID     := r.URL.Query().Get("imp")
    encPrice  := r.URL.Query().Get("price")

    price, err := t.decryptPrice(encPrice)
    if err != nil {
        t.metrics.InvalidPrice.Inc()
        t.servePixel(w); return
    }

    dedupKey := fmt.Sprintf("impression:%s:%s", auctionID, impID)
    isNew, err := t.valkey.SetNX(r.Context(), dedupKey, "1", time.Hour).Result()
    if err != nil {
        // Fail open: better to slightly over-count than lose revenue
        isNew = true
    }

    if isNew {
        t.kafka.ProduceAsync("impressions", auctionID, &ImpressionEvent{
            AuctionID:   auctionID,
            ImpID:       impID,
            Price:       price,
            DSPID:       r.URL.Query().Get("dsp"),
            PublisherID: r.URL.Query().Get("pub"),
            CreativeID:  r.URL.Query().Get("crid"),
            Timestamp:   time.Now(),
            UserAgent:   r.UserAgent(),
            IP:          extractIP(r),
        })
    }

    t.servePixel(w)
}

Click handling is structurally identical, with a 302 Location header instead of a transparent GIF and a destination-URL whitelist check to stop open-redirect abuse.

10.8 Fraud detection beyond IP blocklists

Pre-bid catches roughly 70% of known fraud in a typical month, post-bid another 20%. The remaining 10% is what drives the daily reconciliation work and the per-publisher viewability monitoring.

11. Bottlenecks

There are eight things in this design that could become bottlenecks under load. A few of them only matter in theory; a couple actually bite in practice.

12. Failure Scenarios

12.1 Valkey cluster failure

12.2 DSP unresponsive

12.3 Kafka degradation

12.4 CDN origin failure

12.5 Flink spend aggregator crash

sql

SELECT dsp_id, sum(price_cpm)/1000 AS spend_usd
FROM impressions
WHERE timestamp >= today()
GROUP BY dsp_id

During the ~30-second bootstrap window, auction servers rely on the last DSP credit flags Valkey had. A DSP that crosses its credit limit inside that window can overspend by up to about 30 seconds of its run rate; the daily reconciliation catches and bills it.

12.6 Auction server OOM

12.7 Shedding load when things back up

13. Deployment

13.1 Multi-region layout

Region	Auction pods	Valkey	Kafka	ClickHouse	Purpose
us-east-1	40	6	10	12	Primary, North America
eu-west-1	30	6	10	12	Europe (GDPR strict mode)
ap-south-1	30	6	10	12	Asia-Pacific
Global (S3, CDN, PG)	—	—	—	—	Shared storage, config, creative CDN

13.2 Pipeline

Canary fail thresholds: auction p99 over 110ms, fill-rate drop over 2%, DSP timeout rate up by more than 5 percentage points, 5xx rate over 0.1%, RPM down more than 3%.
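
These thresholds translate into a simple gate that compares the canary against the baseline fleet. The sketch below is illustrative: the `Metrics` dataclass and function name are made up, and it assumes the fill-rate and RPM thresholds are relative drops against baseline while the DSP timeout threshold is an absolute difference in percentage points.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    p99_ms: float
    fill_rate: float          # fraction, e.g. 0.318
    dsp_timeout_rate: float   # fraction of DSP calls that timed out
    error_5xx_rate: float     # fraction of responses that were 5xx
    rpm: float                # revenue per 1,000 requests, in dollars


def canary_failures(baseline: Metrics, canary: Metrics) -> list:
    """Return the names of every threshold the canary violates (empty list = pass)."""
    failures = []
    if canary.p99_ms > 110:
        failures.append("auction_p99")
    if canary.fill_rate < baseline.fill_rate * (1 - 0.02):
        failures.append("fill_rate")
    if canary.dsp_timeout_rate - baseline.dsp_timeout_rate > 0.05:
        failures.append("dsp_timeout_rate")
    if canary.error_5xx_rate > 0.001:
        failures.append("error_5xx_rate")
    if canary.rpm < baseline.rpm * (1 - 0.03):
        failures.append("rpm")
    return failures


baseline = Metrics(p99_ms=78, fill_rate=0.318, dsp_timeout_rate=0.04,
                   error_5xx_rate=0.0002, rpm=1.20)
healthy = Metrics(p99_ms=82, fill_rate=0.315, dsp_timeout_rate=0.05,
                  error_5xx_rate=0.0003, rpm=1.19)
broken = Metrics(p99_ms=115, fill_rate=0.30, dsp_timeout_rate=0.10,
                 error_5xx_rate=0.002, rpm=1.10)

print(canary_failures(baseline, healthy))
print(canary_failures(baseline, broken))
```

Returning the list of violated thresholds rather than a single boolean makes the rollback decision easier to explain in the deploy log.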

13.3 Rollback

Component	Method	Time
Auction server code	k8s rolling update to previous image	< 5 min
DSP config	Revert in PG, publish Kafka config event	< 30 sec
Flink spend job	Redeploy previous JAR from S3 checkpoint	< 2 min
Tracking endpoint	k8s rolling update	< 3 min
ClickHouse schema	Forward-only, columns added backward-compatible	N/A
Publisher config	Revert via API	Immediate

14. Observability

14.1 Key metrics

Metric	Type	Alert threshold
`auction.qps`	Counter	< 700K (30% below baseline) or > 1.8M (spike)
`auction.latency.p50`	Histogram	> 50 ms
`auction.latency.p99`	Histogram	> 100 ms
`auction.fill_rate`	Gauge	< 25%
`auction.revenue_per_1k`	Gauge	> 10% drop from 1h MA
`dsp_selection.top_n_time`	Histogram	> 2 ms
`dsp.{id}.response_time.p99`	Histogram	> 55 ms
`dsp.{id}.nobid_rate`	Gauge	> 95%
`dsp.{id}.circuit_breaker`	Gauge	state = OPEN
`dsp.{id}.spend_today`	Gauge	> 90% of credit limit
`tracking.impression.qps`	Counter	< 250K
`tracking.dedup.rate`	Gauge	> 5%
`lru.hit_rate`	Gauge	< 90%
`valkey.ops_per_sec`	Counter	> 1M (capacity alarm)
`valkey.latency.p99`	Histogram	> 2 ms
`kafka.consumer_lag.impressions`	Gauge	> 100K events
`clickhouse.query.p99`	Histogram	> 10 s
`cdn.cache_hit_rate`	Gauge	< 90%
`settlement.discrepancy`	Gauge	> 0.01%
`ivt.blocked_rate`	Gauge	> 15%
`load_shed.rate`	Gauge	> 1% (load-shedding kicking in)

14.2 Dashboard

┌────────────────────────────────────────────────────────┐
│  Auction QPS         │  Latency (p50/p99)             │
│  1.05M [live]        │  42ms / 78ms                   │
├────────────────────────────────────────────────────────┤
│  Fill Rate           │  Revenue $/hour                 │
│  31.8% [24h]         │  $4.2M [24h]                    │
├────────────────────────────────────────────────────────┤
│  DSP Response Matrix (top 10)                          │
│  TTD:    32ms ok │ DV360: 38ms ok │ Amazon: 44ms ok   │
│  Criteo: 41ms ok │ Xandr: OPEN    │ Magnite: 28ms ok  │
├────────────────────────────────────────────────────────┤
│  DSP Spend & Credit                                    │
│  TTD:   $1.2M / $5M (24%) [healthy]                    │
│  DV360: $4.8M / $5M (96%) [approaching limit]          │
├────────────────────────────────────────────────────────┤
│  Tracking: imp 310K/s, click 3.1K/s, dedup 1.1%       │
│  LRU hit rate: 96.3% │ Load shed rate: 0.0%            │
└────────────────────────────────────────────────────────┘

14.3 Distributed tracing

Every auction carries a trace ID through the whole lifecycle. OTel spans:

Trace: auc_01HXYZ123 (47ms total)
├── LRU enrichment (0.1ms) [hit]
├── Pre-bid filter (1ms)
├── DSP selection (0.8ms) → [ttd, dv360, amazon, criteo, xandr]
├── DSP fan-out (44ms)
│   ├── ttd     req (32ms) ok bid $8.50
│   ├── dv360   req (38ms) ok bid $5.00
│   ├── amazon  req (44ms) ok bid $4.00
│   ├── criteo  req (41ms) ok bid $6.50
│   └── xandr   req (circuit_open)
├── Auction logic (0.5ms)
├── Macro sub + response (1ms)
├── Kafka publish (async, 2ms after response)
└── [later] Impression pixel received (t+112ms)

14.4 Alerting tiers

Tier	Trigger	Action
P0 (page now)	QPS drop > 50%, fill rate drop > 50%, all DSP circuits open	Page on-call + eng lead
P1 (page 15m)	p99 > 150 ms for 5 m, top-5 DSP circuit open, Kafka lag > 1M	Page on-call
P2 (Slack)	DSP credit > 90%, IVT rate > 15%, CDN hit < 85%, discrepancy > 0.005%	#exchange-ops
P3 (daily)	DSP no-bid rate shift > 10%, fill drop > 5%, creative rejection > 3%	Daily ops review

15. Security

15.1 Data classification

Data	Class	At Rest	In Transit
User IDs (exchange)	Pseudonymous PII	AES-256	TLS 1.3
IP addresses	PII	AES-256, hashed after 30d	TLS 1.3
Consent strings (TCF)	Regulated PII	AES-256	TLS 1.3
Auction bid data	Confidential	AES-256	TLS 1.3
Clearing prices (in markup)	Confidential	AES-256-GCM	TLS 1.3
Creative assets	Public	S3 SSE	TLS 1.3
Configurations	Internal	PG TDE	TLS 1.3
Billing records	Restricted	AES-256	TLS 1.3 + mTLS

15.2 Authentication and authorization

Actor	Auth	Scope
SSPs	mTLS + API key	Bid requests
DSPs	mTLS certificates	Receive bid requests, submit bids, receive win notices
Publishers	OAuth 2.0 + MFA	Config API, revenue dashboards
Internal services	mTLS	Service-to-service
Ops	SSO + MFA (Okta)	Dashboards, DSP config, incident response
Billing / Finance	SSO + MFA + role restriction	Settlement reports, payments

15.3 Price encryption

15.4 Supply chain: ads.txt, sellers.json, schain

15.5 Network policy

yaml

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: auction-server-policy
spec:
  podSelector:
    matchLabels: { app: auction-server }
  policyTypes: [Ingress, Egress]
  ingress:
  - from: [{ namespaceSelector: { matchLabels: { name: load-balancer } } }]
    ports: [{ port: 8080 }]
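  egress:
  - to: [{ namespaceSelector: { matchLabels: { name: data-plane } } }]
    ports: [{ port: 6379 }, { port: 9092 }]  # Valkey + Kafka
  # Cluster DNS, needed to resolve DSP hostnames (assumes DNS runs in kube-system)
  - to: [{ namespaceSelector: { matchLabels: { kubernetes.io/metadata.name: kube-system } } }]
    ports: [{ port: 53, protocol: UDP }, { port: 53, protocol: TCP }]
  - to: [{ ipBlock: { cidr: 0.0.0.0/0 } }]    # DSPs (external)
    ports: [{ port: 443 }]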

15.6 Audit logging

Explore the Technologies

Dive deeper into the technologies and infrastructure patterns used in this design:

Core Technologies

Technology	Role in This Design	Learn More
Valkey	Cold-path enrichment, DSP credit state, fraud lists, creative dedup, circuit breakers	Redis/Valkey
Kafka	Impressions, clicks, winning bids, sampled losing bids, DSP config distribution	Kafka
ClickHouse	Real-time spend dashboards, billing aggregation, analytics	ClickHouse
PostgreSQL	DSP config, publisher settings, SSP registrations, settlement records	PostgreSQL
Flink	Per-DSP spend aggregation, rolling win-rate computation for smart selection	Flink

Infrastructure Patterns

Pattern	Relevance	Learn More
CDN and edge caching	Creative delivery via CloudFront with > 95% hit rate	CDN
Circuit breaker	Per-DSP isolation of slow or failing demand partners	Circuit Breakers
Message queues	Kafka as universal event bus, sampled writes	Message Queues
Load balancing	L4 LB across auction pods at 1M QPS	Load Balancing
Load shedding	Drop low-value auctions to preserve core path	Load Shedding
