Envoy Proxy
The L7 proxy that actually solved the 'every service does networking differently' problem
Why It Exists
Anyone who has run microservices in production knows the pain. The Java services retry with exponential backoff. The Go services use a hand-rolled retry loop. The Python services do not retry at all. Everyone has different timeouts, different circuit breaker logic, different ways of collecting metrics. Now multiply that by 50 services. Good luck debugging a latency spike on a Friday afternoon.
Lyft hit exactly this wall in 2016 and built Envoy. The core idea is simple: pull all the networking concerns out of the application and into a proxy that sits alongside each service. Retries, load balancing, TLS, metrics, tracing. The proxy handles all of it. A Go service and a Java service now behave identically on the network because neither one is making those decisions anymore. Envoy is.
What set Envoy apart from Nginx and HAProxy was that it was built for dynamic configuration from day one. Those older proxies assume a config file is written and reloaded. Envoy assumes config is pushed to it via APIs, and it applies changes immediately. No reload, no restart, no dropped connections. That design choice is what made the whole service mesh wave possible.
How It Works
Proxy Architecture: Envoy is a multi-threaded, non-blocking, event-driven proxy written in C++. Connections flow through a pipeline. Listeners accept incoming connections. Filter Chains process requests through ordered filters (TLS, HTTP codec, routing, rate limiting). Clusters represent upstream service endpoints. Each worker thread runs its own event loop, which keeps lock contention low. This is not a request-per-thread model. It scales well on modern hardware.
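The pipeline above can be sketched as a minimal static configuration. This is an illustrative skeleton, not a production config: the names (`ingress`, `backend`), addresses, and ports are placeholders. A listener accepts connections on 8080, the HTTP connection manager filter routes every request to the `backend` cluster, and the cluster defines the upstream endpoints.

```yaml
static_resources:
  listeners:
  - name: ingress
    address:
      socket_address: { address: 0.0.0.0, port_value: 8080 }
    filter_chains:
    - filters:
      # Network-level filter that speaks HTTP and hosts the HTTP filter chain
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          route_config:
            virtual_hosts:
            - name: all
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: backend }
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: backend
    type: STRICT_DNS
    load_assignment:
      cluster_name: backend
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: backend.internal, port_value: 80 }
```

Every request traverses exactly this path: listener, filter chain, route match, cluster, endpoint.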
xDS APIs: This is the part that takes the most time to learn, and it is also the part that makes Envoy genuinely different. The xDS protocol defines gRPC and REST APIs for each configuration type: LDS (Listener Discovery), RDS (Route Discovery), CDS (Cluster Discovery), EDS (Endpoint Discovery), and SDS (Secret/TLS Discovery). A control plane (Istio, a custom one, or something like go-control-plane) implements these APIs and pushes configuration to Envoy. When routes change, the control plane pushes new RDS config and Envoy applies it immediately. Zero downtime.
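A bootstrap file that wires Envoy to a control plane might look like the sketch below. The control plane address, node id, and cluster names are placeholders; here all resource types are fetched over a single aggregated stream (ADS). Only the xDS cluster itself is static, because Envoy needs somewhere to connect before it has received any configuration.

```yaml
node:
  id: sidecar-1            # identifies this Envoy to the control plane
  cluster: my-service
dynamic_resources:
  ads_config:
    api_type: GRPC
    transport_api_version: V3
    grpc_services:
    - envoy_grpc: { cluster_name: xds_cluster }
  lds_config:              # listeners come from the ADS stream
    ads: {}
    resource_api_version: V3
  cds_config:              # clusters come from the ADS stream
    ads: {}
    resource_api_version: V3
static_resources:
  clusters:
  - name: xds_cluster      # the only statically defined cluster: the control plane
    type: STRICT_DNS
    typed_extension_protocol_options:
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        explicit_http_config:
          http2_protocol_options: {}   # xDS gRPC requires HTTP/2
    load_assignment:
      cluster_name: xds_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: control-plane.internal, port_value: 18000 }
```

From this point on, listeners, routes, clusters, and endpoints all arrive over the gRPC stream; the file above never needs to change.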
HTTP Processing: When an HTTP request hits Envoy, the HTTP connection manager decodes it (HTTP/1.1, HTTP/2, or HTTP/3), runs it through the HTTP filter chain (router, RBAC, rate limit, JWT auth, compression, etc.), matches it to a route, selects an upstream cluster, and forwards the request. Along the way, Envoy generates detailed metrics (request count, latency histograms, error rates per upstream), propagates tracing headers (B3, W3C TraceContext), and writes structured access logs. All of this comes without adding a single library to the application.
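The HTTP filter chain is an ordered list inside the connection manager. A minimal sketch with a local rate limiter ahead of the router (the token bucket numbers are arbitrary examples):

```yaml
http_filters:
# Filters run in list order; each can inspect, mutate, or short-circuit the request
- name: envoy.filters.http.local_ratelimit
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
    stat_prefix: http_local_rate_limiter
    token_bucket:
      max_tokens: 100        # burst capacity
      tokens_per_fill: 100
      fill_interval: 1s      # i.e. ~100 requests/second steady state
# The router must be the last filter: it is what actually forwards upstream
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
```

Ordering matters: a request rejected by the rate limiter never reaches the router, and a filter placed after the router never runs at all.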
Architecture Deep Dive
Service Mesh Pattern: In a service mesh like Istio, Envoy runs as a sidecar container in every pod. Kubernetes network rules (iptables or eBPF) redirect all pod traffic through the local Envoy. Every request, incoming and outgoing, flows through the proxy. The control plane (Istio's istiod) watches Kubernetes services and pushes endpoint updates to all Envoys via EDS. Mutual TLS certificates get automatically provisioned and rotated via SDS. No certificate is ever touched manually.
Load Balancing: Envoy ships with multiple algorithms. Round Robin is the default and good enough for most cases. Least Request routes to the host with the fewest active requests, which is the right choice when backends have variable latency. Ring Hash provides consistent hashing for cache affinity. Maglev is Google's consistent hashing algorithm with better distribution than ring hash. Zone-aware routing prefers local zone backends, which cuts cross-zone latency and saves on egress costs. Pick the algorithm that matches the actual traffic pattern, not the one that sounds most impressive.
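Load balancing is configured per cluster. A sketch of the least-request policy with zone-aware routing enabled (the threshold value is illustrative):

```yaml
clusters:
- name: backend
  lb_policy: LEAST_REQUEST       # route to the host with the fewest in-flight requests
  least_request_lb_config:
    choice_count: 2              # "power of two choices": sample 2 hosts, pick the less loaded
  common_lb_config:
    zone_aware_lb_config:
      min_cluster_size: 6        # only prefer local-zone hosts once enough healthy hosts exist
```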
Resilience Features: Retries are configurable per route. Set the conditions (5xx, gateway-error, reset, connect-failure), budget limits, and backoff. Get these right or they will amplify outages. Circuit Breaking caps concurrent connections, pending requests, and retries per upstream, which prevents resource exhaustion. Outlier Detection passively watches upstream errors and ejects unhealthy hosts. It reacts faster than active health checks for transient failures, but the defaults are aggressive. Tune them.
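These three features live in two places: retries on the route, circuit breakers and outlier detection on the cluster. A hedged sketch with illustrative numbers (tune every value for the actual service):

```yaml
# Route-level: retry policy
route:
  cluster: backend
  retry_policy:
    retry_on: "5xx,reset,connect-failure"
    num_retries: 2
    per_try_timeout: 1s
    retry_back_off:
      base_interval: 0.025s
      max_interval: 0.25s

# Cluster-level: circuit breakers and passive outlier detection
clusters:
- name: backend
  circuit_breakers:
    thresholds:
    - max_connections: 1024
      max_pending_requests: 256
      max_retries: 3             # caps concurrent retries cluster-wide
  outlier_detection:
    consecutive_5xx: 5           # eject after 5 consecutive 5xx responses
    interval: 10s
    base_ejection_time: 30s
    max_ejection_percent: 50     # never eject more than half the cluster
```

The `max_ejection_percent` cap is the safety valve: without it, a correlated failure could eject every host and blackhole the cluster.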
Wasm Extensibility: Envoy supports WebAssembly (Wasm) filters for custom logic. Filters can be written in C++, Rust, Go, or AssemblyScript, compiled to Wasm, and loaded at runtime. This is the escape hatch for custom authentication, header manipulation, or protocol handling. It avoids forking Envoy or waiting months for upstream changes. The developer experience is still rough compared to writing native code, but it is getting better.
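Loading a compiled Wasm module is just another entry in the HTTP filter chain. The filter name and file path below are hypothetical; the module itself would be built separately with a proxy-wasm SDK:

```yaml
http_filters:
- name: envoy.filters.http.wasm
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
    config:
      name: my_header_filter               # hypothetical filter name
      vm_config:
        runtime: envoy.wasm.runtime.v8     # the built-in V8 Wasm runtime
        code:
          local:
            filename: /etc/envoy/my_header_filter.wasm
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
```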
Google Cloud's Traffic Director uses Envoy as its data plane. Stripe runs all API traffic through Envoy for load balancing and observability. The project has over 25,000 GitHub stars and sits at the foundation of the service mesh ecosystem. It is not going anywhere.
Deployment Patterns
Sidecar proxy: one Envoy per service instance. Maximum isolation, but the highest resource overhead. This is the Istio default and the most common pattern.
Per-node proxy (ambient mesh): one Envoy per Kubernetes node, shared by all pods on that node. Lower overhead, less isolation. Istio's ambient mesh is pushing this model, and it is worth watching if sidecar memory costs are a concern.
Edge proxy: Envoy as the API gateway handling external traffic with TLS termination, authentication, and rate limiting. Many teams start here before adopting sidecars internally.
Most production deployments combine patterns. Edge Envoy for external traffic, sidecar Envoys (or ambient) for inter-service communication. Start with what solves the current problem, and expand when the need arises.
Pros
- • Understands L7 protocols (HTTP/2, gRPC, WebSocket, MongoDB, Redis), so it can make smart routing decisions
- • Dynamic configuration via xDS APIs. No restarts, no reloads. Config just shows up.
- • Ships with Prometheus metrics, distributed tracing, and structured access logs out of the box
- • Battle-tested at serious scale (Lyft, Google, Stripe, Airbnb)
- • CNCF graduated project with a stable API and an active community
Cons
- • Each sidecar eats 50-100MB of memory. That adds up fast.
- • The xDS API has a steep learning curve. Expect a few weeks before your team is comfortable.
- • Sidecar proxying adds 0.5-2ms of tail latency per hop
- • Debugging proxy issues means you need to understand L7 protocol internals
- • Filter chain ordering is easy to mess up, and misconfigurations cause subtle routing bugs
When to use
- • Microservices that need consistent load balancing and observability across languages
- • Service mesh deployments (Istio, Consul Connect, or your own custom setup)
- • You need canary deployments, traffic shifting, or fault injection
- • Zero-trust networking with mutual TLS between all services
When NOT to use
- • A monolith with no inter-service communication. Envoy has nothing to do here.
- • Environments where 50-100MB of memory overhead per pod is a dealbreaker
- • Teams without the bandwidth to learn and operate L7 proxy infrastructure
- • Pure L4 load balancing needs. Just use IPVS or a simpler L4 proxy.
Key Points
- • xDS (discovery service) APIs provide fully dynamic configuration. Envoy's listeners, routes, clusters, and endpoints all get pushed from a control plane. Unlike Nginx or HAProxy, Envoy never needs a reload or restart for config changes.
- • The sidecar pattern puts an Envoy proxy next to each service instance. All inbound and outbound traffic flows through the proxy, delivering uniform load balancing, mTLS, retries, and observability without changing a single line of application code.
- • Envoy's HTTP connection manager runs every L7 request through a filter chain. Filters handle routing, rate limiting, authentication, CORS, compression, and more. The chain is configurable and extensible with custom Wasm filters.
- • Outlier detection (a form of passive health checking, distinct from circuit breaking) ejects unhealthy upstream hosts based on consecutive errors (5xx, timeouts, connection failures). It stops traffic to failing instances before active health checks even notice, which prevents cascading failures.
- • Maglev consistent hashing provides connection-affinity load balancing that minimizes disruption during scaling events. Adding or removing one server remaps only about 1/N of keys instead of nearly all of them. This matters a lot for caching layers and stateful services.
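Consistent-hash policies only take effect when the route tells Envoy what to hash on. A sketch combining a Maglev cluster with a header-based hash policy (cluster name, header name, and table size are illustrative):

```yaml
clusters:
- name: cache_backend
  lb_policy: MAGLEV
  maglev_lb_config:
    table_size: 65537            # must be prime; larger tables distribute more evenly

routes:
- match: { prefix: "/" }
  route:
    cluster: cache_backend
    hash_policy:                 # the key that determines host affinity
    - header: { header_name: x-session-id }
```

Requests carrying the same `x-session-id` consistently land on the same upstream host, which is what keeps cache hit rates stable during scaling.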
Common Mistakes
- ✗ Leaving outlier detection thresholds at defaults. The defaults are aggressive: 5 consecutive 5xx errors eject a host for 30 seconds. During transient failures, that causes thundering herd on the remaining hosts. Tune these numbers for the service's actual error profile.
- ✗ Forgetting Envoy's resource consumption in capacity planning. Each sidecar uses 50-100MB RAM and 0.1-0.5 CPU cores. For a cluster with 5,000 pods, that is 250-500GB of RAM just for Envoy sidecars.
- ✗ Misconfiguring retry policies. Default retries on 5xx without idempotency checks amplify traffic during outages. Always set retry budgets and only retry on retriable status codes (502, 503, 504).
- ✗ Skipping header-based routing for canary deployments. Envoy's route matching supports header, path, and query parameter matching for fine-grained traffic splitting. Weight-based splitting alone is harder to control.
- ✗ Ignoring the admin interface during debugging. Envoy exposes /clusters, /config_dump, /stats, and /logging endpoints that show the complete runtime state. Most debugging starts with config_dump to verify what Envoy actually received from the control plane.
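The retry-amplification mistake above has a direct config-level fix: combine a retry budget on the cluster with an explicit list of retriable status codes on the route. A sketch with illustrative values:

```yaml
# Cluster-level: cap retries as a fraction of active traffic
circuit_breakers:
  thresholds:
  - retry_budget:
      budget_percent: { value: 20.0 }   # retries may be at most 20% of active requests
      min_retry_concurrency: 3          # but always allow a few retries at low traffic

# Route-level: only retry status codes that are safe to retry
route:
  retry_policy:
    retry_on: "retriable-status-codes,reset,connect-failure"
    retriable_status_codes: [502, 503, 504]
    num_retries: 2
```

With the budget in place, a widespread outage cannot turn every failed request into several more; retry traffic is bounded relative to real traffic.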