API Gateway vs Load Balancer vs Reverse Proxy
Load balancers distribute traffic, reverse proxies mediate connections (TLS, caching), and API gateways manage APIs (auth, rate limiting) — most architectures need all three, often in one product.
The Problem
Every production system needs something in front of its backend services: something to distribute traffic, terminate TLS, cache responses, authenticate requests, and rate limit abusive clients. Load balancers, reverse proxies, and API gateways all do parts of this job, and their responsibilities overlap significantly. Understanding where each one starts and stops — and how they compose — is essential for designing reliable architectures.
Mental Model
Like the front of a large hotel — the valet (load balancer) directs cars to parking spots, the doorman (reverse proxy) handles the entrance and security, and the concierge (API gateway) manages guest services, reservations, and special requests.
How It Works
These three concepts sit on a spectrum of network intelligence. At one end, a load balancer makes a simple decision: which backend instance gets this connection? In the middle, a reverse proxy understands the protocol and can terminate TLS, cache responses, and route by URL. At the far end, an API gateway understands the application — it validates auth tokens, enforces rate limits, transforms requests, and exposes developer-facing API management.
The confusion exists because modern tools blur these boundaries. NGINX began as a web server, became the default reverse proxy, and also load-balances. AWS ALB is called a load balancer but routes by HTTP path. Kong is an API gateway but also proxies and load-balances. Understanding the conceptual layers helps engineers make architectural decisions even when a single product handles multiple roles.
Load Balancer: Distributing Traffic
A load balancer's primary job is simple: given N backend instances, decide which one handles each request. This is the fundamental building block of horizontal scaling.
L4 (Transport Layer) Load Balancers operate at TCP/UDP. They see IP addresses and port numbers, nothing else. They cannot inspect HTTP headers, URL paths, or cookies. What they lack in intelligence, they make up in speed — an L4 LB can handle millions of connections per second because it never parses the payload.
L4 load balancer decision:

```
Input:   TCP SYN to 10.0.0.1:443
Output:  Forward to 10.0.1.{5,6,7}:8080 (round-robin)
Cannot:  Route by URL path, inspect headers, cache responses
```
L7 (Application Layer) Load Balancers understand HTTP. They parse the request, read headers, and make routing decisions based on content. AWS ALB, HAProxy in HTTP mode, and NGINX with an upstream block are L7 load balancers.
L7 load balancer decision:

```
Input:   GET /api/v2/users  Host: api.example.com
Output:  Route to users-service-v2 pool (based on path prefix)
Can:     Route by path, host, header, cookie, query parameter
```
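The L7 routing decision above can be sketched as a longest-prefix match over a routing table. This is a minimal illustration, not any product's actual algorithm; the pool names and path prefixes are hypothetical:

```python
# Sketch of an L7 routing decision: longest-prefix match on the URL path.
# Pool names and prefixes are hypothetical examples.
ROUTES = {
    "/api/v2/": "users-service-v2",
    "/api/": "api-pool",
    "/static/": "cdn-origin",
}

def route(path: str, default: str = "default-pool") -> str:
    """Return the backend pool for a request path (longest prefix wins)."""
    best = ""
    for prefix in ROUTES:
        if path.startswith(prefix) and len(prefix) > len(best):
            best = prefix
    return ROUTES.get(best, default)

print(route("/api/v2/users"))  # longest match wins: users-service-v2
print(route("/img/logo.png"))  # no match: default-pool
```

An L4 load balancer cannot run this function at all: the path only exists after the HTTP request is parsed.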
Load balancing algorithms determine distribution:
| Algorithm | How It Works | Best For |
|---|---|---|
| Round Robin | Rotate through backends sequentially | Equal-capacity backends |
| Least Connections | Send to the backend with fewest active connections | Variable request duration |
| Weighted | Assign proportional traffic by backend weight | Mixed-capacity backends, canary deploys |
| Consistent Hashing | Hash a key (IP, cookie) to a fixed backend | Session affinity, caching layers |
| Random with Two Choices | Pick two random backends, choose the less loaded one | Large pools where least-conn is expensive |
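The two less obvious algorithms in the table can be sketched in a few lines. This is a simplified illustration (real consistent hashing adds virtual nodes for smoother balance); the backend names and loads are hypothetical:

```python
import hashlib
import random

def two_choices(backends: dict) -> str:
    """Random-with-two-choices: sample two backends, pick the less loaded.
    `backends` maps backend name -> active connection count."""
    a, b = random.sample(list(backends), 2)
    return a if backends[a] <= backends[b] else b

def consistent_pick(key: str, backends: list) -> str:
    """Minimal consistent-hash pick: hash each backend onto a ring, hash
    the key, and take the first ring point at or after the key's hash."""
    ring = sorted((int(hashlib.md5(b.encode()).hexdigest(), 16), b)
                  for b in backends)
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    for point, backend in ring:
        if h <= point:
            return backend
    return ring[0][1]  # wrap around the ring

loads = {"app-1": 12, "app-2": 3, "app-3": 9}
print(two_choices(loads))                         # the less loaded of two samples
print(consistent_pick("client-42", list(loads)))  # stable for the same key
```

The consistent-hash property is that the same key always lands on the same backend, which is what makes it suitable for session affinity and cache layers.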
Reverse Proxy: Mediating Connections
A reverse proxy sits between clients and the backend servers. Clients talk to the proxy, and the proxy talks to backends. Unlike a forward proxy (which hides clients from servers), a reverse proxy hides servers from clients. The client never knows the backend topology.
Key reverse proxy capabilities:
TLS Termination: The proxy holds the SSL certificate and decrypts HTTPS traffic. Backends receive plaintext HTTP, simplifying their configuration and reducing CPU load. This is the single most common reason to deploy a reverse proxy.
```nginx
# NGINX TLS termination
server {
    listen 443 ssl;
    ssl_certificate     /etc/nginx/cert.pem;
    ssl_certificate_key /etc/nginx/key.pem;

    location / {
        proxy_pass http://backend:8080;  # plaintext to backend
    }
}
```
Response Caching: The proxy caches responses from backends, serving repeat requests without hitting the backend at all. For read-heavy APIs, this can reduce backend load by 80-90%.
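The caching behavior amounts to a TTL-keyed lookup in front of the origin. A minimal sketch, where `fetch_origin` is a hypothetical stand-in for the real upstream request:

```python
import time

CACHE = {}   # url -> (expiry_time, body)
TTL = 60.0   # seconds a cached response stays fresh

def fetch_origin(url: str) -> str:
    """Hypothetical backend call — replace with a real upstream request."""
    return f"response for {url}"

def get(url: str):
    """Return (body, cache_hit). Serve from cache while fresh."""
    entry = CACHE.get(url)
    if entry and entry[0] > time.monotonic():
        return entry[1], True            # cache hit: backend never touched
    body = fetch_origin(url)             # cache miss: go to the origin
    CACHE[url] = (time.monotonic() + TTL, body)
    return body, False

body, hit = get("/api/users")   # first request: miss, fills the cache
body, hit = get("/api/users")   # repeat request: served from cache
```

Production proxies add the parts this sketch omits: cache-key normalization, `Cache-Control` handling, and eviction under memory pressure.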
Compression: The proxy compresses responses (gzip, brotli) before sending to clients, reducing bandwidth without backend code changes.
Request Routing: Based on URL path, host header, or other request attributes, the proxy routes to different backend pools. /api/* goes to the API servers, /static/* goes to the CDN origin, /admin/* goes to the admin service.
Connection Pooling: Clients open many short-lived connections to the proxy. The proxy maintains a smaller pool of persistent connections to backends, reducing connection establishment overhead.
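The pooling idea can be sketched with a small reusable-connection pool. The `BackendConn` class here is a hypothetical stand-in for a persistent TCP connection:

```python
from collections import deque

class BackendConn:
    """Hypothetical stand-in for a persistent TCP connection to a backend."""
    opened = 0
    def __init__(self):
        BackendConn.opened += 1  # count expensive connection setups

class ConnectionPool:
    def __init__(self, size: int):
        self.idle = deque(BackendConn() for _ in range(size))

    def acquire(self) -> BackendConn:
        # Reuse an idle connection instead of opening a new one.
        return self.idle.popleft() if self.idle else BackendConn()

    def release(self, conn: BackendConn) -> None:
        self.idle.append(conn)

pool = ConnectionPool(size=2)
for _ in range(100):          # 100 client requests arrive...
    conn = pool.acquire()
    pool.release(conn)
# ...but only 2 backend connections were ever opened.
```

This is why backends behind a proxy see far fewer TCP and TLS handshakes than the raw client request count would suggest.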
API Gateway: Managing APIs
An API gateway is a reverse proxy with application-level intelligence. It understands not just HTTP, but the API contracts — who is allowed to call what, how fast, and with what transformations.
Core API gateway functions:
Authentication and Authorization: The gateway validates auth tokens (JWT, API keys, OAuth) before the request reaches any backend. If a token is expired or invalid, the backend never sees the request. This centralizes auth logic instead of reimplementing it in every service.
```yaml
# Kong API Gateway — JWT authentication plugin
plugins:
  - name: jwt
    config:
      claims_to_verify:
        - exp
      header_names:
        - Authorization
  - name: acl
    config:
      allow:
        - premium-users
```
Rate Limiting: The gateway enforces request quotas per API key, user, IP, or any custom attribute. This protects backends from abuse, implements tiered pricing (free tier: 100 req/min, paid: 10,000 req/min), and prevents a single client from consuming all capacity.
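Per-key rate limiting is commonly implemented as a token bucket. A minimal single-process sketch (the tier limits and key names are hypothetical; real gateways keep these counters in shared storage such as Redis):

```python
import time

class TokenBucket:
    """Allow roughly `rate` requests/second with bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens for the elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller should respond with HTTP 429

buckets = {}

def check(api_key: str) -> bool:
    # One bucket per API key: hypothetical free tier, ~100 req/min, burst 10.
    bucket = buckets.setdefault(api_key, TokenBucket(rate=100 / 60, capacity=10))
    return bucket.allow()
```

Keying the bucket on the API key rather than the client IP is exactly what distinguishes gateway-grade rate limiting from the per-IP limiting a plain reverse proxy offers.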
Request/Response Transformation: The gateway can modify requests before forwarding (add headers, rename fields, merge responses from multiple backends) and modify responses before returning to clients. This is how teams maintain backward compatibility — the gateway translates old API formats to new backend formats.
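The backward-compatibility translation can be sketched as a field-name mapping applied in both directions at the gateway. The field names here are hypothetical:

```python
# Hypothetical mapping: legacy v1 field names -> new v2 backend field names.
V1_TO_V2 = {"user_name": "username", "e_mail": "email"}

def upgrade_request(body: dict) -> dict:
    """Translate a legacy v1 request body into the v2 backend format."""
    return {V1_TO_V2.get(k, k): v for k, v in body.items()}

def downgrade_response(body: dict) -> dict:
    """Translate a v2 backend response back into the v1 client format."""
    reverse = {new: old for old, new in V1_TO_V2.items()}
    return {reverse.get(k, k): v for k, v in body.items()}

legacy = {"user_name": "ada", "e_mail": "ada@example.com", "plan": "pro"}
print(upgrade_request(legacy))
```

Old clients keep sending `user_name` while the backend only ever sees `username`; neither side needs to change when the other does.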
API Versioning: Route /api/v1/users to the legacy service and /api/v2/users to the new service, managed at the gateway without client changes.
Developer Portal and Analytics: API gateways often include developer-facing features: API documentation, API key issuance, usage dashboards, and deprecation notices.
The Comparison Table
This table shows where responsibilities actually live. Notice the overlap — these are not discrete categories.
| Feature | Load Balancer | Reverse Proxy | API Gateway |
|---|---|---|---|
| Traffic distribution | Yes (primary purpose) | Yes | Yes |
| Health checks | Yes | Yes | Yes |
| TLS termination | L7 only | Yes (primary purpose) | Yes |
| Response caching | No | Yes | Sometimes |
| Compression | No | Yes | Sometimes |
| Request routing (path/host) | L7 only | Yes | Yes |
| Authentication | No | Basic (client certs) | Yes (primary purpose) |
| Rate limiting | No | Basic (per-IP) | Yes (per-key, per-user) |
| Request transformation | No | Minimal (headers) | Yes (body, headers, path) |
| API versioning | No | No | Yes |
| Developer portal | No | No | Yes |
| L4 (TCP/UDP) support | Yes | Limited | No |
| Protocol translation | No | No | Yes (REST→gRPC, etc.) |
How They Compose in Real Architectures
In a production system, all three layers often appear. Here is a typical cloud-native setup:
```
Internet
    │
    ▼
┌──────────────────────────┐
│ Cloud Load Balancer (L4) │  ← Distributes across AZs
│  (AWS NLB, GCP TCP LB)   │    Handles millions of connections
└──────────┬───────────────┘
           │
  ┌────────▼────────┐
  │  Reverse Proxy  │  ← TLS termination, caching, compression
  │  (NGINX, Envoy) │    Static asset serving
  └────────┬────────┘
           │
  ┌────────▼────────┐
  │   API Gateway   │  ← Auth, rate limiting, API versioning
  │  (Kong, custom) │    Request transformation
  └────────┬────────┘
           │
    ┌──────┼──────┐
    ▼      ▼      ▼
  Svc A  Svc B  Svc C    ← Business logic
```
However, this three-layer stack is not always necessary. Here are common simplifications:
Small to Medium (< 50 services): A single product like NGINX Plus or Envoy handles all three roles. Configure load balancing, TLS termination, and basic rate limiting in one place.
Cloud-Native: AWS ALB + API Gateway. The ALB handles L7 load balancing, TLS, and path-based routing. API Gateway adds authentication, throttling, and Lambda integration. No self-managed reverse proxy needed.
Kubernetes: An Ingress Controller (NGINX Ingress, Traefik, Envoy-based) acts as reverse proxy and load balancer. Add Kong Ingress Controller or Ambassador when API gateway features are needed. The service mesh handles service-to-service load balancing.
Common Architecture Decisions
When to separate layers vs. combine them:
Separate when: the architecture needs independent scaling (the gateway CPU-bottlenecks on auth token validation while the LB is idle), independent teams manage different layers, or compliance requires a dedicated WAF/API gateway appliance.
Combine when: optimizing for simplicity and operational cost, traffic volume does not require independent scaling, or the team is early-stage and a single NGINX config covers everything.
When an API gateway is needed vs. just a reverse proxy:
If the requirements are TLS termination, caching, and basic load balancing, a reverse proxy (NGINX, Caddy) is sufficient. The moment the system needs per-user authentication, per-key rate limiting, request transformation, or API analytics, an API gateway is necessary.
```
# Quick test: does the reverse proxy handle what's needed?
# If any of these are required, add a gateway:
#   - Validate JWTs and reject expired tokens
#   - Rate limit by API key (not just by IP)
#   - Transform request bodies (rename fields, merge payloads)
#   - Provide API key self-service to developers
#   - Track usage per consumer for billing
#
# If only these are needed, a reverse proxy is enough:
#   - TLS termination
#   - Static file serving and caching
#   - Path-based routing to different backends
#   - Gzip compression
#   - Basic IP-based rate limiting
```
The microservices trap: Do not put a public-facing API gateway in the path of every internal service-to-service call. The gateway is for north-south traffic (external clients to internal services). For east-west traffic (service-to-service), use a service mesh or simple client-side load balancing. An API gateway in the east-west path adds latency and creates a bottleneck.
Key Points
- A load balancer distributes traffic, a reverse proxy mediates it, and an API gateway manages it — they overlap significantly but solve different primary problems.
- Most production architectures use all three, often in a single product: NGINX can be a reverse proxy and load balancer, Kong adds API gateway features on top.
- L4 load balancers (TCP level) are faster but blind to HTTP — they cannot route by URL path, add headers, or do content-based routing.
- API gateways add business logic to the network edge: auth token validation, API key management, request/response transformation, and usage analytics.
- The modern trend is convergence — Envoy, Kong, and cloud ALBs blur the boundaries by offering reverse proxy, load balancing, and gateway features in one product.
Key Components
| Component | Role |
|---|---|
| Load Balancer | Distributes incoming traffic across multiple backend instances using algorithms like round-robin, least connections, or consistent hashing |
| Reverse Proxy | Sits in front of backend servers to handle TLS termination, caching, compression, and request routing without clients knowing the backend topology |
| API Gateway | Application-aware entry point that handles authentication, rate limiting, request transformation, API versioning, and developer-facing concerns |
| Health Check System | Continuously probes backend instances and removes unhealthy ones from the pool — shared across all three components |
| Control Plane / Config | Management layer where routing rules, rate limits, certificates, and upstream definitions are configured and pushed to the data path |
When to Use
Use a load balancer to distribute traffic across multiple instances of the same service. Add a reverse proxy for TLS termination, caching, or compression. Add an API gateway for authentication, rate limiting, request transformation, or API versioning. In practice, start with a single product that covers the requirements (NGINX, Envoy, or cloud ALB) and add specialized layers only when requirements demand it.
Tool Comparison
| Tool | Type | Best For | Scale |
|---|---|---|---|
| Kong | Open Source | Full-featured API gateway built on NGINX with plugin ecosystem for auth, rate limiting, and transformations | Medium-Enterprise |
| AWS API Gateway | Managed | Serverless API management with Lambda integration, usage plans, and API keys — zero infrastructure to manage | Small-Enterprise |
| NGINX | Open Source | Industry-standard reverse proxy and load balancer with proven performance — the foundation most other tools build on | Small-Enterprise |
| Envoy | Open Source | Modern L4/L7 proxy with advanced load balancing, observability, and dynamic configuration via xDS — the cloud-native standard | Medium-Enterprise |
Debug Checklist
- Check which layer is returning errors: is it the LB (502 = no healthy backends), the proxy (504 = upstream timeout), or the gateway (401/403 = auth failure, 429 = rate limited)?
- Verify health checks: curl the health check endpoint from the LB's perspective. A healthy app that fails the LB's health check means the check path is wrong.
- Inspect request headers through the chain: X-Forwarded-For, X-Real-IP, X-Request-ID should propagate correctly through all layers.
- Check TLS certificate at each layer: openssl s_client -connect <host>:<port> at the LB, proxy, and gateway to verify certs are correct and not expired.
- Test rate limiting: send requests above the limit and verify the response is 429, not 200 — misconfigured rate limits are invisible until tested.
Common Mistakes
- Putting business logic in the API gateway. Rate limiting and auth belong there; order validation and pricing rules belong in the services.
- Using an L7 API gateway for TCP/UDP traffic that does not need HTTP-level features — the system pays the parsing overhead for no benefit.
- Not understanding the difference between L4 and L7 load balancing. L4 is faster but cannot route by path, host header, or cookie.
- Running multiple layers of TLS termination unnecessarily. If the ALB terminates TLS, the API gateway does not need to terminate it again (unless re-encryption is required).
- Treating the API gateway as a single point of failure. If it goes down, every API goes down. Always deploy gateways in HA pairs with health checks.
Real World Usage
- Netflix built Zuul, its custom edge API gateway, to handle billions of API requests daily with authentication, routing, and canary testing; the Spring ecosystem later replaced its Zuul integration with Spring Cloud Gateway.
- Amazon API Gateway processes trillions of API calls per year, integrated with Lambda for serverless backends and providing automatic scaling.
- Cloudflare acts as all three — load balancing across origins, reverse proxying with caching, and API gateway features like rate limiting and WAF — at their global edge.
- Stripe routes all API traffic through an internal API gateway that handles authentication, per-key rate limiting, and request logging before reaching any backend service.
- Shopify uses NGINX as a reverse proxy and load balancer across its Ruby on Rails fleet, handling flash sales with traffic spikes of 100x normal volume.