CDN & Edge Networking
CDNs cache content at edge PoPs near users, turning 200ms origin requests into 5-20ms edge responses through hierarchical caching, Anycast routing, and edge compute.
The Problem
Origin servers sit in one or two locations, so serving content to millions of users globally at low latency is hard: distant users face unacceptable round-trip times on every request.
Mental Model
Like a franchise restaurant chain — every city has a branch (edge) serving the same menu (cached content), but special orders (dynamic content) go to headquarters (origin).
How It Works
A CDN is conceptually simple: put copies of content on servers close to the users. The implementation, however, involves elegant engineering at global scale.
The Request Flow
When a user in Tokyo requests https://example.com/image.jpg, here's what happens:
1. DNS Resolution via Anycast
The domain resolves to an Anycast IP address. Unlike regular unicast (one IP = one server), Anycast means the same IP is announced from every PoP worldwide. BGP routing automatically sends the user's packets to the nearest PoP. No geo-DNS databases, no region detection — the network itself routes optimally.
2. TLS Termination at the Edge
The edge PoP terminates TLS, meaning the client's TCP+TLS handshake happens with a server 10ms away instead of 200ms away. This alone saves 200-400ms on a cold connection to a distant origin.
3. Cache Lookup
The edge server computes a cache key (typically the URL path + relevant headers) and checks its local cache. Three outcomes:
- HIT: Content is cached and fresh. Return immediately. Total latency: 5-20ms.
- STALE: Content is cached but expired. Serve the stale version while revalidating with origin in the background (stale-while-revalidate).
- MISS: Content is not cached. Forward the request up the chain.
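The three-way decision above can be sketched as a small function; the entry fields (`storedAt`, `maxAge`, `staleWhileRevalidate`, all in seconds) are illustrative names, not a real CDN API:

```javascript
// Sketch of the edge cache decision. An entry records when it was stored
// plus its freshness and stale-while-revalidate windows (illustrative only).
function cacheDecision(entry, now) {
  if (!entry) return 'MISS';                         // never cached
  const age = now - entry.storedAt;
  if (age <= entry.maxAge) return 'HIT';             // fresh: serve directly
  if (age <= entry.maxAge + entry.staleWhileRevalidate) {
    return 'STALE';                                  // serve stale, revalidate in background
  }
  return 'MISS';                                     // too old: treat as a miss
}

const entry = { storedAt: 0, maxAge: 60, staleWhileRevalidate: 300 };
console.log(cacheDecision(entry, 30));   // 'HIT'
console.log(cacheDecision(entry, 120));  // 'STALE'
console.log(cacheDecision(entry, 400));  // 'MISS'
```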
4. Shield Tier (Mid-Tier Cache)
On a cache miss, the edge doesn't go directly to origin. It routes to a shield (or mid-tier) cache — a regional cache shared by multiple edge PoPs. This is critical: without the shield tier, a cache miss at all 300 edge PoPs simultaneously would hit the origin with 300 requests for the same content. The shield collapses these into one origin request.
5. Origin Fetch
If the shield also misses, it fetches from the origin server. The response flows back through shield → edge → client, getting cached at each tier along the way.
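The shield's request collapsing can be sketched with a map of in-flight fetches; `fetchFromOrigin` and the counter are hypothetical stand-ins for the real origin request path:

```javascript
// Sketch of shield-tier request collapsing: concurrent misses for the same
// key share one in-flight origin fetch instead of each hitting the origin.
const inflight = new Map(); // cache key -> pending origin fetch
let originRequests = 0;

async function fetchFromOrigin(key) {
  originRequests += 1; // in reality: an HTTP request to the origin server
  return `body-for-${key}`;
}

function shieldFetch(key) {
  if (!inflight.has(key)) {
    // First miss starts the origin fetch; later misses piggyback on it.
    const pending = fetchFromOrigin(key).finally(() => inflight.delete(key));
    inflight.set(key, pending);
  }
  return inflight.get(key);
}

// 300 edge PoPs missing simultaneously still cause a single origin request.
Promise.all(Array.from({ length: 300 }, () => shieldFetch('/image.jpg')))
  .then(() => console.log(originRequests)); // 1
```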
Cache-Control: The Contract Between Origin and CDN
Everything about CDN behavior is controlled by HTTP cache headers. Get these wrong and the CDN is either useless (caching nothing) or dangerous (serving stale content).
# Cache for 1 hour at both CDN and browser
Cache-Control: public, max-age=3600
# Cache at CDN for 1 day, but browser only caches for 5 minutes
Cache-Control: public, max-age=300, s-maxage=86400
# Serve stale while revalidating in background (best for dynamic content)
Cache-Control: public, max-age=60, stale-while-revalidate=300
# Never cache (but the CDN still terminates TLS and provides DDoS protection)
Cache-Control: private, no-store
The s-maxage directive applies only to shared caches (CDNs, reverse proxies): it overrides max-age there while letting the browser cache with a different TTL. This enables aggressive CDN caching while keeping browser caches short for faster updates.
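The precedence rule can be sketched as a simplified shared-cache TTL calculation (a toy parser for illustration, not a full RFC 9111 implementation):

```javascript
// Sketch: how a shared cache picks its TTL from Cache-Control, preferring
// s-maxage over max-age. Simplified parser, illustrative only.
function sharedCacheTtl(cacheControl) {
  const directives = {};
  for (const part of cacheControl.split(',')) {
    const [name, value] = part.trim().split('=');
    directives[name.toLowerCase()] = value !== undefined ? Number(value) : true;
  }
  if (directives['no-store'] || directives['private']) return 0; // not cacheable by a CDN
  if ('s-maxage' in directives) return directives['s-maxage'];   // shared-cache override
  if ('max-age' in directives) return directives['max-age'];
  return 0;
}

console.log(sharedCacheTtl('public, max-age=300, s-maxage=86400')); // 86400
console.log(sharedCacheTtl('public, max-age=3600'));                // 3600
console.log(sharedCacheTtl('private, no-store'));                   // 0
```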
Cache Key Design
The cache key determines what's a "hit" and what's a "miss." The default key is usually the full URL including query string. This means:
/api/products?page=1 → cache entry A
/api/products?page=2 → cache entry B
/api/products?page=1&_t=12345 → cache entry C (cache-busted!)
That last one is a common mistake — cache-busting query params create new cache entries for identical content. Conversely, if the API returns different content based on the Accept-Language header but the cache key doesn't include it, users get the wrong language.
Most CDNs support custom cache keys. Cloudflare's Cache Rules, for example, allow including/excluding specific query params and headers.
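A custom cache key can be sketched as a normalization function; the tracking-param list and the choice to vary on Accept-Language are illustrative assumptions, not any CDN's defaults:

```javascript
// Sketch of cache key normalization: drop known tracking params and include
// Accept-Language, so identical content shares one entry and localized
// content does not collide. The param list is an illustrative assumption.
const TRACKING_PARAMS = new Set(['_t', 'utm_source', 'utm_medium', 'utm_campaign', 'fbclid']);

function cacheKey(rawUrl, headers = {}) {
  const url = new URL(rawUrl);
  for (const name of [...url.searchParams.keys()]) {
    if (TRACKING_PARAMS.has(name)) url.searchParams.delete(name);
  }
  url.searchParams.sort(); // param order must not create distinct entries
  const lang = (headers['accept-language'] || '').split(',')[0].trim();
  return `${url.pathname}${url.search}|lang=${lang}`;
}

console.log(cacheKey('https://example.com/api/products?page=1&_t=12345'));
// → "/api/products?page=1|lang=" (same entry as the un-busted URL)
```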
Purge Strategies
Cached content eventually needs to be invalidated. There are three approaches:
TTL-Based Expiry
Set a max-age and let content expire naturally. Simple and reliable, but there's a window where stale content is served. For most use cases, a TTL of 5-60 minutes with stale-while-revalidate is the right trade-off.
On-Demand Purge
Actively remove content from the cache when it changes. CDNs offer different purge capabilities:
| CDN | Purge Speed | Purge Types |
|---|---|---|
| Fastly | <150ms global | URL, surrogate key, all |
| Cloudflare | <30s typical | URL, tag, prefix, all |
| CloudFront | 1-10 minutes | URL path pattern, all |
| Akamai | seconds-minutes | URL, CP code, tag |
Surrogate keys (Fastly) / cache tags (Cloudflare) are the most powerful approach: tag every product page with product-123, and when that product updates, purge by tag. This invalidates only the affected content across all PoPs instantly.
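Tag-based purge can be sketched with a secondary index from tag to cached URLs; the data structures are illustrative, not a real CDN's internals:

```javascript
// Sketch of surrogate-key / cache-tag purging: each cached object is indexed
// by its tags, so one tag purge evicts every affected URL at once.
const cache = new Map();    // url -> cached body
const tagIndex = new Map(); // tag -> Set of urls carrying that tag

function store(url, body, tags) {
  cache.set(url, body);
  for (const tag of tags) {
    if (!tagIndex.has(tag)) tagIndex.set(tag, new Set());
    tagIndex.get(tag).add(url);
  }
}

function purgeByTag(tag) {
  for (const url of tagIndex.get(tag) ?? []) cache.delete(url);
  tagIndex.delete(tag);
}

store('/products/123', '<html>product page</html>', ['product-123']);
store('/category/shoes', '<html>listing page</html>', ['product-123', 'category-shoes']);
purgeByTag('product-123');  // both pages referencing product 123 are evicted
console.log(cache.size);    // 0
```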
Versioned URLs
The most reliable strategy for static assets: include a content hash in the URL.
/static/app.a3b4c5d6.js → Cache forever (immutable)
/static/app.e7f8g9h0.js → New deploy = new URL = automatic "invalidation"
This is why build tools generate fingerprinted asset filenames. The HTML referencing these assets has a short TTL, but the assets themselves are cached indefinitely.
Edge Compute: Beyond Caching
Modern CDNs are not just caches — they're distributed compute platforms.
Cloudflare Workers runs JavaScript/Wasm at every PoP with sub-millisecond cold starts. Entire applications can run at the edge:
// Cloudflare Worker: A/B test at the edge
export default {
  async fetch(request) {
    const url = new URL(request.url);
    // Stable 0-99 bucket from the client IP. Assumes IPv4; falls back to
    // bucket 0 if the header is missing or contains an IPv6 address.
    const ip = request.headers.get('cf-connecting-ip') ?? '';
    const bucket = ip.split('.').reduce((a, b) => a + (parseInt(b, 10) || 0), 0) % 100;
    url.pathname = (bucket < 50 ? '/variant-a' : '/variant-b') + url.pathname;
    return fetch(url);
  }
};
This A/B test decision happens in microseconds at the edge, with zero latency to origin. Without edge compute, the alternatives are a server-side decision (adding origin RTT) or a client-side decision (causing layout shift).
Lambda@Edge (AWS) runs Node.js/Python at CloudFront edge locations, typically used for:
- Request rewriting and redirect rules
- Authentication and authorization at the edge
- Dynamic origin selection based on request attributes
- Modifying response headers for security
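As a sketch of the first use case, a viewer-request handler implementing a redirect rule might look like this; the /old-path prefix is a hypothetical example, and the event/response shapes follow CloudFront's Lambda@Edge format:

```javascript
// Sketch of a Lambda@Edge viewer-request handler that issues a redirect.
// The /old-path rule is a hypothetical example for illustration.
const handler = async (event) => {
  const request = event.Records[0].cf.request;
  if (request.uri.startsWith('/old-path/')) {
    return {
      status: '301',
      statusDescription: 'Moved Permanently',
      headers: {
        location: [{ key: 'Location', value: request.uri.replace('/old-path/', '/new-path/') }],
      },
    };
  }
  return request; // anything else passes through to cache/origin unchanged
};
```

In a real deployment this function would be exported as the Lambda entry point and attached to the distribution's viewer-request event.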
Production Architecture Patterns
Multi-Origin with Failover
Configure the CDN with a primary and backup origin. If the primary returns 5xx errors or times out, the CDN automatically routes to the backup. This provides origin-level redundancy without client-side logic.
Tiered Caching for APIs
For dynamic API responses that change infrequently:
Cache-Control: public, s-maxage=10, stale-while-revalidate=60
The CDN caches for 10 seconds, and serves stale for up to 60 seconds while fetching fresh content in the background. Users always get a fast response, and the origin sees dramatically less traffic. This pattern works beautifully for product catalogs, search results, and any read-heavy API.
Edge-Side Includes (ESI)
For pages with a mix of static and dynamic content: cache the page shell at the edge and dynamically include fragments. The CDN assembles the final page from cached parts and real-time fragments, reducing origin load while keeping personalized content fresh.
Operational Considerations
Monitoring Cache Effectiveness
Track these metrics:
- Cache hit ratio: Target 85%+ for static content, 50%+ for dynamic. Below this, review cache key design and headers.
- Origin shield hit ratio: Should be 90%+. If not, the shield tier might be misconfigured.
- Bandwidth saved: The delta between edge-served bytes and origin-served bytes. This represents the cost savings.
- Purge latency: How long after a purge until all PoPs serve fresh content. Critical for time-sensitive updates.
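The first three metrics fall out of a few raw counters; a minimal sketch (counter names are illustrative):

```javascript
// Sketch: deriving the monitoring metrics above from raw counters exposed
// by CDN logs or analytics. Field names are illustrative assumptions.
function cdnMetrics({ edgeHits, edgeMisses, shieldHits, shieldMisses, edgeBytes, originBytes }) {
  return {
    cacheHitRatio: edgeHits / (edgeHits + edgeMisses),       // target 85%+ static
    shieldHitRatio: shieldHits / (shieldHits + shieldMisses), // target 90%+
    bandwidthSavedBytes: edgeBytes - originBytes,             // cost savings
  };
}

const m = cdnMetrics({
  edgeHits: 900, edgeMisses: 100,
  shieldHits: 95, shieldMisses: 5,
  edgeBytes: 10000000, originBytes: 800000,
});
console.log(m.cacheHitRatio);       // 0.9
console.log(m.shieldHitRatio);      // 0.95
console.log(m.bandwidthSavedBytes); // 9200000
```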
Common Debugging Workflow
# Check cache status for a URL
curl -sI https://example.com/page | grep -i "cf-cache-status\|x-cache\|age"
# CF-Cache-Status: HIT → served from edge cache
# CF-Cache-Status: MISS → fetched from origin
# CF-Cache-Status: DYNAMIC → not eligible for caching
# Age: 3600 → cached for 3600 seconds
# Verify which PoP served the request
curl -sI https://example.com | grep -i "cf-ray\|x-amz-cf-pop"
CDNs are not magic — they're a well-understood caching layer with clear rules. Get the Cache-Control headers right, design cache keys correctly, and latency drops 10x for the majority of traffic. Miss the fundamentals and the CDN becomes an expensive passthrough.
Key Points
- CDNs reduce latency by serving content from the nearest PoP — a cache hit at the edge returns in 5-20ms vs 200-500ms from origin
- Cache-Control headers are the contract between the origin and the CDN — misconfigured headers are the #1 cause of caching problems
- The shield/mid-tier cache prevents the thundering herd problem: 300 edge PoPs missing cache simultaneously would send 300 requests to origin
- Anycast means a single IP address resolves to the nearest edge server — no DNS-based geo-routing needed, and failover is automatic via BGP
- Edge compute is not just caching — it handles auth checks, A/B tests, header manipulation, and even full API logic at the edge
Key Components
| Component | Role |
|---|---|
| Edge PoP (Point of Presence) | Server location close to end users that caches content and terminates TLS, reducing RTT to single-digit milliseconds |
| Cache Tiers (Edge → Shield → Origin) | Hierarchical caching where edge misses go to a shared mid-tier shield before hitting origin, reducing origin load by 90%+ |
| Anycast Routing | Multiple servers share the same IP address, and BGP routes users to the geographically closest one automatically |
| Cache Key | The unique identifier (URL + headers + query params) that determines whether a request is a cache hit or miss |
| Edge Compute | Runs application logic at the edge PoP (Cloudflare Workers, Lambda@Edge) instead of routing to origin for dynamic decisions |
When to Use
Use a CDN for any public-facing website or API. Static assets are the obvious win, but modern CDNs also accelerate dynamic content through connection reuse to origin, TLS termination at the edge, and edge compute.
Tool Comparison
| Tool | Type | Best For | Scale |
|---|---|---|---|
| Cloudflare | Managed | Developer experience, Workers edge compute, DDoS protection, free tier | Small-Enterprise |
| AWS CloudFront | Managed | Deep AWS integration, Lambda@Edge, S3 origin support | Medium-Enterprise |
| Fastly | Managed | Instant purge (<150ms global), VCL programmability, real-time logging | Medium-Enterprise |
| Akamai | Managed | Largest network (4,000+ PoPs), enterprise security, media delivery | Enterprise |
Debug Checklist
- Check cache status headers in the response — look for cf-cache-status, x-cache, or x-amz-cf-pop to see HIT/MISS/STALE/EXPIRED
- Verify the cache key by inspecting what varies between requests — URL, query params, headers. Use CDN debug headers or logs to see the computed key
- Check TTL configuration — inspect Cache-Control and CDN-specific override rules. A TTL of 0 means the CDN always revalidates with origin
- Inspect Vary header behavior — Vary: Cookie or Vary: Authorization will create per-user cache entries, effectively killing cache hit rate
- Test from multiple locations using tools like CDN-check or curl from different regions to verify content is being served from the nearest PoP
Common Mistakes
- Setting Cache-Control: no-cache on content that could be cached. Many teams cache-bust everything out of fear, negating the entire CDN benefit
- Not understanding the Vary header. Vary: Accept-Encoding is fine, but Vary: Cookie makes every user get a unique cache entry — effectively disabling caching
- Ignoring cache key design. Including session tokens or random query params in cache keys causes 0% hit rate on content that should be cacheable
- Purging cache globally when only specific URLs changed. Use targeted purge by URL or surrogate key, not nuclear purge-all
- Assuming CDN handles dynamic content automatically. Dynamic API responses need explicit caching rules (stale-while-revalidate, short TTLs) or edge compute
Real World Usage
- Netflix uses Open Connect — custom CDN appliances placed directly inside ISP networks, serving 95% of traffic without traversing the internet
- Shopify routes all storefront traffic through Cloudflare, caching product pages at the edge and running A/B tests via Workers
- GitHub serves static assets (avatars, release downloads) through Fastly with instant purge when content updates
- Cloudflare serves ~20% of all websites, handling 57M+ HTTP requests per second across 300+ cities
- Discord uses Cloudflare for DDoS protection and edge caching of static assets, handling traffic spikes during viral events