API Gateway vs Load Balancer vs Reverse Proxy
Load balancers distribute traffic, reverse proxies mediate connections (TLS, caching), and API gateways manage APIs (auth, rate limiting) — most architectures need all three, often in one product.
The Problem
Every production system needs something in front of its backend services: something to distribute traffic, terminate TLS, cache responses, authenticate requests, and rate limit abusive clients. Load balancers, reverse proxies, and API gateways all do parts of this job, and their responsibilities overlap significantly. Understanding where each one starts and stops — and how they compose — is essential for designing reliable architectures.
Mental Model
Like the front of a large hotel — the valet (load balancer) directs cars to parking spots, the doorman (reverse proxy) handles the entrance and security, and the concierge (API gateway) manages guest services, reservations, and special requests.
How It Works
These three concepts sit on a spectrum of network intelligence. At one end, a load balancer makes a simple decision: which backend instance gets this connection? In the middle, a reverse proxy understands the protocol and can terminate TLS, cache responses, and route by URL. At the far end, an API gateway understands the application — it validates auth tokens, enforces rate limits, transforms requests, and exposes developer-facing API management.
The confusion exists because modern tools blur these boundaries. NGINX began as a web server, became the default reverse proxy, and also load-balances. AWS ALB is called a load balancer but routes by HTTP path. Kong is an API gateway but also proxies and load-balances. Understanding the conceptual layers helps engineers make architectural decisions even when a single product handles multiple roles.
Load Balancer: Distributing Traffic
A load balancer's primary job is simple: given N backend instances, decide which one handles each request. This is the fundamental building block of horizontal scaling.
L4 (Transport Layer) Load Balancers operate at TCP/UDP. They see IP addresses and port numbers, nothing else. They cannot inspect HTTP headers, URL paths, or cookies. What they lack in intelligence, they make up in speed — an L4 LB can handle millions of connections per second because it never parses the payload.
L4 load balancer decision:

```
Input:   TCP SYN to 10.0.0.1:443
Output:  Forward to 10.0.1.{5,6,7}:8080 (round-robin)
Cannot:  Route by URL path, inspect headers, cache responses
```
L7 (Application Layer) Load Balancers understand HTTP. They parse the request, read headers, and make routing decisions based on content. AWS ALB, HAProxy in HTTP mode, and NGINX with an upstream block are L7 load balancers.
L7 load balancer decision:

```
Input:   GET /api/v2/users  Host: api.example.com
Output:  Route to users-service-v2 pool (based on path prefix)
Can:     Route by path, host, header, cookie, query parameter
```
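The L7 routing decision above can be sketched as a longest-prefix match over a routing table. This is a minimal illustration, not any product's actual algorithm; the pool names and path prefixes are hypothetical:

```python
# Sketch of an L7 routing decision: longest-prefix match on the URL path.
# Pool names and prefixes are hypothetical examples.
ROUTES = {
    "/api/v2/": "users-service-v2",
    "/api/": "api-pool",
    "/static/": "cdn-origin",
}

def route(path: str, default: str = "default-pool") -> str:
    """Return the backend pool for a request path (longest prefix wins)."""
    best = ""
    for prefix in ROUTES:
        if path.startswith(prefix) and len(prefix) > len(best):
            best = prefix
    return ROUTES.get(best, default)

print(route("/api/v2/users"))  # longest match wins: users-service-v2
print(route("/img/logo.png"))  # no match: default-pool
```

An L4 load balancer cannot run this function at all: the path only exists after the HTTP request is parsed.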
Load balancing algorithms determine distribution:
| Algorithm | How It Works | Best For |
|---|---|---|
| Round Robin | Rotate through backends sequentially | Equal-capacity backends |
| Least Connections | Send to the backend with fewest active connections | Variable request duration |
| Weighted | Assign proportional traffic by backend weight | Mixed-capacity backends, canary deploys |
| Consistent Hashing | Hash a key (IP, cookie) to a fixed backend | Session affinity, caching layers |
| Random with Two Choices | Pick two random backends, choose the less loaded one | Large pools where least-conn is expensive |
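The two less obvious algorithms in the table can be sketched in a few lines. This is a simplified illustration (real consistent hashing adds virtual nodes for smoother balance); the backend names and loads are hypothetical:

```python
import hashlib
import random

def two_choices(backends: dict) -> str:
    """Random-with-two-choices: sample two backends, pick the less loaded.
    `backends` maps backend name -> active connection count."""
    a, b = random.sample(list(backends), 2)
    return a if backends[a] <= backends[b] else b

def consistent_pick(key: str, backends: list) -> str:
    """Minimal consistent-hash pick: hash each backend onto a ring, hash
    the key, and take the first ring point at or after the key's hash."""
    ring = sorted((int(hashlib.md5(b.encode()).hexdigest(), 16), b)
                  for b in backends)
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    for point, backend in ring:
        if h <= point:
            return backend
    return ring[0][1]  # wrap around the ring

loads = {"app-1": 12, "app-2": 3, "app-3": 9}
print(two_choices(loads))                         # the less loaded of two samples
print(consistent_pick("client-42", list(loads)))  # stable for the same key
```

The consistent-hash property is that the same key always lands on the same backend, which is what makes it suitable for session affinity and cache layers.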
Reverse Proxy: Mediating Connections
A reverse proxy sits between clients and the backend servers. Clients talk to the proxy, and the proxy talks to backends. Unlike a forward proxy (which hides clients from servers), a reverse proxy hides servers from clients. The client never knows the backend topology.
Key reverse proxy capabilities:
TLS Termination: The proxy holds the SSL certificate and decrypts HTTPS traffic. Backends receive plaintext HTTP, simplifying their configuration and reducing CPU load. This is the single most common reason to deploy a reverse proxy.
```nginx
# NGINX TLS termination
server {
    listen 443 ssl;
    ssl_certificate     /etc/nginx/cert.pem;
    ssl_certificate_key /etc/nginx/key.pem;

    location / {
        proxy_pass http://backend:8080;  # plaintext to backend
    }
}
```
Response Caching: The proxy caches responses from backends, serving repeat requests without hitting the backend at all. For read-heavy APIs, this can reduce backend load by 80-90%.
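The caching behavior amounts to a TTL-keyed lookup in front of the origin. A minimal sketch, where `fetch_origin` is a hypothetical stand-in for the real upstream request:

```python
import time

CACHE = {}   # url -> (expiry_time, body)
TTL = 60.0   # seconds a cached response stays fresh

def fetch_origin(url: str) -> str:
    """Hypothetical backend call — replace with a real upstream request."""
    return f"response for {url}"

def get(url: str):
    """Return (body, cache_hit). Serve from cache while fresh."""
    entry = CACHE.get(url)
    if entry and entry[0] > time.monotonic():
        return entry[1], True            # cache hit: backend never touched
    body = fetch_origin(url)             # cache miss: go to the origin
    CACHE[url] = (time.monotonic() + TTL, body)
    return body, False

body, hit = get("/api/users")   # first request: miss, fills the cache
body, hit = get("/api/users")   # repeat request: served from cache
```

Production proxies add the parts this sketch omits: cache-key normalization, `Cache-Control` handling, and eviction under memory pressure.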
Compression: The proxy compresses responses (gzip, brotli) before sending to clients, reducing bandwidth without backend code changes.
Request Routing: Based on URL path, host header, or other request attributes, the proxy routes to different backend pools. /api/* goes to the API servers, /static/* goes to the CDN origin, /admin/* goes to the admin service.
Connection Pooling: Clients open many short-lived connections to the proxy. The proxy maintains a smaller pool of persistent connections to backends, reducing connection establishment overhead.
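The pooling idea can be sketched with a small reusable-connection pool. The `BackendConn` class here is a hypothetical stand-in for a persistent TCP connection:

```python
from collections import deque

class BackendConn:
    """Hypothetical stand-in for a persistent TCP connection to a backend."""
    opened = 0
    def __init__(self):
        BackendConn.opened += 1  # count expensive connection setups

class ConnectionPool:
    def __init__(self, size: int):
        self.idle = deque(BackendConn() for _ in range(size))

    def acquire(self) -> BackendConn:
        # Reuse an idle connection instead of opening a new one.
        return self.idle.popleft() if self.idle else BackendConn()

    def release(self, conn: BackendConn) -> None:
        self.idle.append(conn)

pool = ConnectionPool(size=2)
for _ in range(100):          # 100 client requests arrive...
    conn = pool.acquire()
    pool.release(conn)
# ...but only 2 backend connections were ever opened.
```

This is why backends behind a proxy see far fewer TCP and TLS handshakes than the raw client request count would suggest.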
API Gateway: Managing APIs
An API gateway is a reverse proxy with application-level intelligence. It understands not just HTTP, but the API contracts — who is allowed to call what, how fast, and with what transformations.
Core API gateway functions:
Authentication and Authorization: The gateway validates auth tokens (JWT, API keys, OAuth) before the request reaches any backend. If a token is expired or invalid, the backend never sees the request. This centralizes auth logic instead of reimplementing it in every service.
```yaml
# Kong API Gateway — JWT authentication plugin
plugins:
  - name: jwt
    config:
      claims_to_verify:
        - exp
      header_names:
        - Authorization
  - name: acl
    config:
      allow:
        - premium-users
```
Rate Limiting: The gateway enforces request quotas per API key, user, IP, or any custom attribute. This protects backends from abuse, implements tiered pricing (free tier: 100 req/min, paid: 10,000 req/min), and prevents a single client from consuming all capacity.
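Per-key rate limiting is commonly implemented as a token bucket. A minimal single-process sketch (the tier limits and key names are hypothetical; real gateways keep these counters in shared storage such as Redis):

```python
import time

class TokenBucket:
    """Allow roughly `rate` requests/second with bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens for the elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller should respond with HTTP 429

buckets = {}

def check(api_key: str) -> bool:
    # One bucket per API key: hypothetical free tier, ~100 req/min, burst 10.
    bucket = buckets.setdefault(api_key, TokenBucket(rate=100 / 60, capacity=10))
    return bucket.allow()
```

Keying the bucket on the API key rather than the client IP is exactly what distinguishes gateway-grade rate limiting from the per-IP limiting a plain reverse proxy offers.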
Request/Response Transformation: The gateway can modify requests before forwarding (add headers, rename fields, merge responses from multiple backends) and modify responses before returning to clients. This is how teams maintain backward compatibility — the gateway translates old API formats to new backend formats.
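The backward-compatibility translation can be sketched as a field-name mapping applied in both directions at the gateway. The field names here are hypothetical:

```python
# Hypothetical mapping: legacy v1 field names -> new v2 backend field names.
V1_TO_V2 = {"user_name": "username", "e_mail": "email"}

def upgrade_request(body: dict) -> dict:
    """Translate a legacy v1 request body into the v2 backend format."""
    return {V1_TO_V2.get(k, k): v for k, v in body.items()}

def downgrade_response(body: dict) -> dict:
    """Translate a v2 backend response back into the v1 client format."""
    reverse = {new: old for old, new in V1_TO_V2.items()}
    return {reverse.get(k, k): v for k, v in body.items()}

legacy = {"user_name": "ada", "e_mail": "ada@example.com", "plan": "pro"}
print(upgrade_request(legacy))
```

Old clients keep sending `user_name` while the backend only ever sees `username`; neither side needs to change when the other does.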
API Versioning: Route /api/v1/users to the legacy service and /api/v2/users to the new service, managed at the gateway without client changes.
Developer Portal and Analytics: API gateways often include developer-facing features: API documentation, API key issuance, usage dashboards, and deprecation notices.
The Comparison Table
This table shows where responsibilities actually live. Notice the overlap — these are not discrete categories.
| Feature | Load Balancer | Reverse Proxy | API Gateway |
|---|---|---|---|
| Traffic distribution | Yes (primary purpose) | Yes | Yes |
| Health checks | Yes | Yes | Yes |
| TLS termination | L7 only | Yes (primary purpose) | Yes |
| Response caching | No | Yes | Sometimes |
| Compression | No | Yes | Sometimes |
| Request routing (path/host) | L7 only | Yes | Yes |
| Authentication | No | Basic (client certs) | Yes (primary purpose) |
| Rate limiting | No | Basic (per-IP) | Yes (per-key, per-user) |
| Request transformation | No | Minimal (headers) | Yes (body, headers, path) |
| API versioning | No | No | Yes |
| Developer portal | No | No | Yes |
| L4 (TCP/UDP) support | Yes | Limited | No |
| Protocol translation | No | No | Yes (REST→gRPC, etc.) |
How They Compose in Real Architectures
In a production system, all three layers often appear. Here is a typical cloud-native setup:
```
Internet
    │
    ▼
┌──────────────────────────┐
│ Cloud Load Balancer (L4) │  ← Distributes across AZs
│  (AWS NLB, GCP TCP LB)   │    Handles millions of connections
└──────────┬───────────────┘
           │
  ┌────────▼────────┐
  │  Reverse Proxy  │  ← TLS termination, caching, compression
  │  (NGINX, Envoy) │    Static asset serving
  └────────┬────────┘
           │
  ┌────────▼────────┐
  │   API Gateway   │  ← Auth, rate limiting, API versioning
  │  (Kong, custom) │    Request transformation
  └────────┬────────┘
           │
    ┌──────┼──────┐
    ▼      ▼      ▼
  Svc A  Svc B  Svc C    ← Business logic
```
However, this three-layer stack is not always necessary. Here are common simplifications:
Small to Medium (< 50 services): A single product like NGINX Plus or Envoy handles all three roles. Configure load balancing, TLS termination, and basic rate limiting in one place.
Cloud-Native: AWS ALB + API Gateway. The ALB handles L7 load balancing, TLS, and path-based routing. API Gateway adds authentication, throttling, and Lambda integration. No self-managed reverse proxy needed.
Kubernetes: An Ingress Controller (NGINX Ingress, Traefik, Envoy-based) acts as reverse proxy and load balancer. Add Kong Ingress Controller or Ambassador when API gateway features are needed. The service mesh handles service-to-service load balancing.
Common Architecture Decisions
When to separate layers vs. combine them:
Separate when: the architecture needs independent scaling (the gateway CPU-bottlenecks on auth token validation while the LB is idle), independent teams manage different layers, or compliance requires a dedicated WAF/API gateway appliance.
Combine when: optimizing for simplicity and operational cost, traffic volume does not require independent scaling, or the team is early-stage and a single NGINX config covers everything.
When an API gateway is needed vs. just a reverse proxy:
If the requirements are TLS termination, caching, and basic load balancing, a reverse proxy (NGINX, Caddy) is sufficient. The moment the system needs per-user authentication, per-key rate limiting, request transformation, or API analytics, an API gateway is necessary.
```
# Quick test: does the reverse proxy handle what's needed?
# If any of these are required, add a gateway:
#   - Validate JWTs and reject expired tokens
#   - Rate limit by API key (not just by IP)
#   - Transform request bodies (rename fields, merge payloads)
#   - Provide API key self-service to developers
#   - Track usage per consumer for billing
#
# If only these are needed, a reverse proxy is enough:
#   - TLS termination
#   - Static file serving and caching
#   - Path-based routing to different backends
#   - Gzip compression
#   - Basic IP-based rate limiting
```
The microservices trap: Do not put a public-facing API gateway in the path of every internal service-to-service call. The gateway is for north-south traffic (external clients to internal services). For east-west traffic (service-to-service), use a service mesh or simple client-side load balancing. An API gateway in the east-west path adds latency and creates a bottleneck.
Key Points
- A load balancer distributes traffic, a reverse proxy mediates it, and an API gateway manages it — they overlap significantly but solve different primary problems.
- Most production architectures use all three, often in a single product: NGINX can be a reverse proxy and load balancer, Kong adds API gateway features on top.
- L4 load balancers (TCP level) are faster but blind to HTTP — they cannot route by URL path, add headers, or do content-based routing.
- API gateways add business logic to the network edge: auth token validation, API key management, request/response transformation, and usage analytics.
- The modern trend is convergence — Envoy, Kong, and cloud ALBs blur the boundaries by offering reverse proxy, load balancing, and gateway features in one product.
Key Components
| Component | Role |
|---|---|
| Load Balancer | Distributes incoming traffic across multiple backend instances using algorithms like round-robin, least connections, or consistent hashing |
| Reverse Proxy | Sits in front of backend servers to handle TLS termination, caching, compression, and request routing without clients knowing the backend topology |
| API Gateway | Application-aware entry point that handles authentication, rate limiting, request transformation, API versioning, and developer-facing concerns |
| Health Check System | Continuously probes backend instances and removes unhealthy ones from the pool — shared across all three components |
| Control Plane / Config | Management layer where routing rules, rate limits, certificates, and upstream definitions are configured and pushed to the data path |
When to Use
Use a load balancer to distribute traffic across multiple instances of the same service. Add a reverse proxy for TLS termination, caching, or compression. Add an API gateway for authentication, rate limiting, request transformation, or API versioning. In practice, start with a single product that covers the requirements (NGINX, Envoy, or cloud ALB) and add specialized layers only when requirements demand it.
Tool Comparison
| Tool | Type | Best For | Scale |
|---|---|---|---|
| Kong | Open Source | Full-featured API gateway built on NGINX with plugin ecosystem for auth, rate limiting, and transformations | Medium-Enterprise |
| AWS API Gateway | Managed | Serverless API management with Lambda integration, usage plans, and API keys — zero infrastructure to manage | Small-Enterprise |
| NGINX | Open Source | Industry-standard reverse proxy and load balancer with proven performance — the foundation most other tools build on | Small-Enterprise |
| Envoy | Open Source | Modern L4/L7 proxy with advanced load balancing, observability, and dynamic configuration via xDS — the cloud-native standard | Medium-Enterprise |
Debug Checklist
- Check which layer is returning errors: is it the LB (502 = no healthy backends), the proxy (504 = upstream timeout), or the gateway (401/403 = auth failure, 429 = rate limited)?
- Verify health checks: curl the health check endpoint from the LB's perspective. A healthy app that fails the LB's health check means the check path is wrong.
- Inspect request headers through the chain: X-Forwarded-For, X-Real-IP, X-Request-ID should propagate correctly through all layers.
- Check TLS certificate at each layer: openssl s_client -connect <host>:<port> at the LB, proxy, and gateway to verify certs are correct and not expired.
- Test rate limiting: send requests above the limit and verify the response is 429, not 200 — misconfigured rate limits are invisible until tested.
Common Mistakes
- Putting business logic in the API gateway. Rate limiting and auth belong there; order validation and pricing rules belong in the services.
- Using an L7 API gateway for TCP/UDP traffic that does not need HTTP-level features — the system pays the parsing overhead for no benefit.
- Not understanding the difference between L4 and L7 load balancing. L4 is faster but cannot route by path, host header, or cookie.
- Running multiple layers of TLS termination unnecessarily. If the ALB terminates TLS, the API gateway does not need to terminate it again (unless re-encryption is required).
- Treating the API gateway as a single point of failure. If it goes down, every API goes down. Always deploy gateways in HA pairs with health checks.
Real World Usage
- Netflix built Zuul, its custom edge API gateway, to handle billions of API requests daily with authentication, routing, and canary testing; the Spring ecosystem later replaced its Zuul integration with Spring Cloud Gateway.
- Amazon API Gateway processes trillions of API calls per year, integrated with Lambda for serverless backends and providing automatic scaling.
- Cloudflare acts as all three — load balancing across origins, reverse proxying with caching, and API gateway features like rate limiting and WAF — at their global edge.
- Stripe routes all API traffic through an internal API gateway that handles authentication, per-key rate limiting, and request logging before reaching any backend service.
- Shopify uses NGINX as a reverse proxy and load balancer across its Ruby on Rails fleet, handling flash sales with traffic spikes of 100x normal volume.