API Gateway
Three structural patterns in one system. Proxy guards the door, Adapter translates protocols, and Facade hides the mess of microservices behind a single clean endpoint.
Key Abstractions
- Gateway: Proxy that intercepts all client requests and applies middleware before forwarding
- ServiceAdapter: Adapter interface wrapping different backend protocols behind a uniform handle() method
- ServiceRegistry: Maps route paths to their target service adapters for request routing
- Middleware: Pluggable request processor for auth, rate limiting, logging, and caching
- ResponseAggregator: Facade that orchestrates calls to multiple services and returns a single combined response
Class Diagram
The Key Insight
Most engineers think of an API Gateway as a reverse proxy. That is one third of the picture. The real design has three structural patterns stacked on top of each other, each solving a different problem.
The Proxy Pattern handles the "invisible wall" between clients and services. Clients call the gateway using the exact same HTTP interface they would use to call the backend directly. They have no idea auth checks, rate limiting, and caching are happening. That transparency is the whole point of a proxy.
The Adapter Pattern solves protocol heterogeneity. Your microservices are not uniform. One team built their service with REST. Another uses gRPC. The legacy inventory system still speaks SOAP because nobody wants to rewrite it. Without adapters, your gateway needs protocol-specific routing logic for every backend. Adapters wrap each service behind a single handle(request) interface. The gateway routes uniformly.
The Facade Pattern handles response aggregation. A mobile client showing an order details screen needs data from three services. Making three separate API calls over a cellular connection is slow and fragile. The facade makes one call to the gateway, the gateway fans out to three services on a fast internal network, and returns one combined payload.
Requirements
Functional
- Route incoming requests to the correct backend service based on path
- Support multiple backend protocols (REST, RPC, SOAP/XML) through a uniform interface
- Apply a configurable middleware pipeline: authentication, rate limiting, logging, caching
- Aggregate responses from multiple services into a single response for the client
- Register and deregister services at runtime without code changes
Non-Functional
- Middleware execution order must be deterministic and configurable
- Rate limiting should be per-client, not global
- Cache should only apply to GET requests with 200 responses
- Partial failure in aggregation should return available data, not a blanket error
Design Decisions
Why Proxy for the gateway instead of a simple router?
A router forwards requests. That is all it does. A proxy forwards requests while adding behavior that is invisible to both sides. The client sends the same HTTP request it would send to the backend directly. The backend receives what looks like a normal incoming request. Neither side knows about auth checks, rate limiting, request logging, or response caching happening in between.
This matters because cross-cutting concerns multiply fast. Today it is auth and logging. Next quarter it is request tracing, header enrichment, and A/B routing. The proxy pattern lets you stack these transparently. A simple router would force you to either push this logic into every service (duplication) or make clients aware of it (coupling).
Why Adapter instead of making all services speak the same protocol?
You do not control legacy services. The inventory system was built in 2008 and speaks SOAP. Rewriting it is a six-month project nobody will fund. The payments team chose gRPC for performance and they are not switching.
The Adapter pattern wraps each backend behind the same handle(request) -> response interface. The gateway does not care if a service speaks REST, RPC, or carrier pigeon. It calls handle() and gets a Response back. When a new team builds a GraphQL service, you write one new adapter class. Nothing else changes.
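As a sketch of that claim, here is roughly what one new adapter class could look like. The GraphQLAdapter name and its path-to-query translation are illustrative assumptions, and minimal Request/Response/ServiceAdapter types are restated so the snippet runs on its own:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
import json


@dataclass
class Request:
    path: str
    method: str = "GET"
    headers: dict = field(default_factory=dict)


@dataclass
class Response:
    status_code: int
    body: str


class ServiceAdapter(ABC):
    @abstractmethod
    def handle(self, request: Request) -> Response: ...


class GraphQLAdapter(ServiceAdapter):
    """Hypothetical adapter: translates a routed path into a GraphQL query."""
    def __init__(self, service_name: str):
        self.service_name = service_name

    def handle(self, request: Request) -> Response:
        # e.g. GET /search -> query { search }
        # (a real adapter would also map params into query arguments)
        resource = request.path.strip("/").split("/")[0]
        query = f"query {{ {resource} }}"
        payload = {"source": self.service_name, "protocol": "GraphQL",
                   "query": query}
        return Response(200, json.dumps(payload))


adapter = GraphQLAdapter("SearchService")
resp = adapter.handle(Request(path="/search"))
print(resp.body)
```

The gateway would register this adapter under a path like any other; nothing in the routing or middleware code changes.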
Why Facade for aggregation instead of client-side orchestration?
If the mobile app calls OrderService, then PaymentService, then UserService to render one screen, that is three round trips over a cellular connection. Each one adds latency and another failure point.
The ResponseAggregator is a facade that takes a list of paths, calls each service on the fast internal network, and merges everything into one response. The client makes one call. Three round trips become one. And if PaymentService is down, the facade returns user and order data with a clear error for the payment portion instead of failing entirely.
Why a middleware chain instead of putting all checks in the Gateway class?
Imagine auth, rate limiting, logging, and caching all inside the Gateway's handle method. That is four responsibilities in one class. Adding request tracing means editing the Gateway. Removing caching means editing the Gateway. Every change risks breaking something unrelated.
A middleware chain separates each concern into its own class. Each middleware does exactly one thing and calls next_handler to pass control forward. You can add, remove, or reorder middlewares by changing a list. The Gateway class itself stays untouched. Need request tracing next sprint? Write a TracingMiddleware and insert it. Done.
Interview Follow-ups
- "How would you handle service discovery?" Replace the static ServiceRegistry with a dynamic one backed by Consul or etcd. Services register themselves on startup and deregister on shutdown. The gateway resolves routes at request time instead of boot time.
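A minimal sketch of that idea, with an in-memory dict and TTL standing in for a Consul/etcd client; DynamicServiceRegistry and its method names are assumptions for illustration:

```python
from __future__ import annotations
import time


class DynamicServiceRegistry:
    """Sketch of a registry where services register themselves with a TTL,
    standing in for a Consul/etcd-backed service catalog."""
    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl_seconds = ttl_seconds
        # path -> (service address, last heartbeat timestamp)
        self._entries: dict[str, tuple[str, float]] = {}

    def register(self, path: str, address: str) -> None:
        self._entries[path] = (address, time.time())

    def heartbeat(self, path: str) -> None:
        # Services refresh their entry periodically; stale entries expire
        if path in self._entries:
            address, _ = self._entries[path]
            self._entries[path] = (address, time.time())

    def deregister(self, path: str) -> None:
        self._entries.pop(path, None)

    def resolve(self, path: str) -> str | None:
        # Resolved at request time, not boot time: entries whose
        # heartbeats stopped are treated as gone
        entry = self._entries.get(path)
        if entry is None:
            return None
        address, last_seen = entry
        if time.time() - last_seen > self.ttl_seconds:
            return None
        return address


registry = DynamicServiceRegistry(ttl_seconds=30)
registry.register("/users", "http://user-service:8080")
print(registry.resolve("/users"))   # http://user-service:8080
registry.deregister("/users")
print(registry.resolve("/users"))   # None
```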
- "How would you add circuit breaking?" Wrap each ServiceAdapter in a CircuitBreakerAdapter that tracks failure rates. After a threshold, the circuit opens and returns a fallback response without calling the backend. This prevents cascading failures when a downstream service is struggling.
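A sketch of that wrapper, assuming the backend is any callable returning a (status, body) pair; the class name, threshold, and reset window are illustrative:

```python
import time


class CircuitBreakerAdapter:
    """Sketch of circuit breaking around a backend call."""
    def __init__(self, inner, failure_threshold: int = 3,
                 reset_seconds: float = 30.0):
        self.inner = inner                  # callable: request -> (status, body)
        self.failure_threshold = failure_threshold
        self.reset_seconds = reset_seconds
        self.failures = 0
        self.opened_at = None

    def handle(self, request):
        # Open circuit: fail fast with a fallback until the reset window elapses
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.reset_seconds:
                return (503, "Circuit open: fallback response")
            # Half-open: let one trial request through
            self.opened_at = None
            self.failures = 0
        try:
            status, body = self.inner(request)
        except Exception:
            status, body = 500, "backend error"
        if status >= 500:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.time()    # trip the breaker
        else:
            self.failures = 0                   # success resets the count
        return (status, body)


def flaky_backend(request):
    raise ConnectionError("service down")

breaker = CircuitBreakerAdapter(flaky_backend, failure_threshold=3)
for _ in range(3):
    breaker.handle("req")       # three straight failures trip the breaker
print(breaker.handle("req"))    # (503, 'Circuit open: fallback response')
```

Because the breaker exposes the same handle() shape as the adapters it wraps, the gateway does not need to know it is there.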
- "How would you handle WebSocket connections?" WebSockets need a persistent connection, not request-response routing. Add a WebSocketProxy alongside the HTTP Gateway that upgrades connections and maintains session affinity to the correct backend. The middleware pipeline still applies on the initial handshake.
- "How would you implement canary deployments?" Add a RoutingMiddleware that reads a percentage header or user segment and forwards to either the stable or canary adapter for the same service. The ServiceRegistry holds both adapters under different keys, and the middleware picks one based on the routing rules.
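A sketch of the routing rule, with plain callables standing in for the stable and canary adapters; the canary_router name, the percentage rule, and the X-Canary header are assumptions:

```python
import random


def canary_router(stable_handler, canary_handler, canary_percent: int):
    """Sketch of percentage-based canary routing. A real RoutingMiddleware
    would sit in the gateway chain; here the handlers are plain callables."""
    def route(request: dict):
        # An explicit override header forces the canary (useful for testing)
        if request.get("headers", {}).get("X-Canary") == "always":
            return canary_handler(request)
        # Roll the dice per request; sticky per-user routing would
        # hash a user id instead
        if random.randint(1, 100) <= canary_percent:
            return canary_handler(request)
        return stable_handler(request)
    return route


stable = lambda req: "v1 response"
canary = lambda req: "v2 response"
handler = canary_router(stable, canary, canary_percent=10)
print(handler({"headers": {"X-Canary": "always"}}))   # v2 response
```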
Code Implementation
from __future__ import annotations
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Callable
import json
import time


@dataclass
class Request:
    path: str
    method: str = "GET"
    headers: dict = field(default_factory=dict)
    body: str = ""
    params: dict = field(default_factory=dict)


@dataclass
class Response:
    status_code: int
    body: str
    headers: dict = field(default_factory=dict)


# --------------- Service Adapters (Adapter Pattern) ---------------

class ServiceAdapter(ABC):
    @abstractmethod
    def handle(self, request: Request) -> Response:
        ...


class RestAdapter(ServiceAdapter):
    """Adapter for REST-based backend services."""
    def __init__(self, service_name: str):
        self.service_name = service_name

    def handle(self, request: Request) -> Response:
        # Simulates translating to REST and calling the backend
        payload = {"source": self.service_name, "protocol": "REST",
                   "method": request.method, "path": request.path}
        return Response(200, json.dumps(payload))


class RpcAdapter(ServiceAdapter):
    """Adapter for gRPC-style RPC backend services."""
    def __init__(self, service_name: str):
        self.service_name = service_name

    def handle(self, request: Request) -> Response:
        # Translates the standard request into an RPC call format
        payload = {"source": self.service_name, "protocol": "RPC",
                   "procedure": f"{request.method}_{request.path.strip('/')}"}
        return Response(200, json.dumps(payload))


class LegacyAdapter(ServiceAdapter):
    """Adapter for SOAP/XML legacy backend services."""
    def __init__(self, service_name: str):
        self.service_name = service_name

    def handle(self, request: Request) -> Response:
        # Wraps the request into a SOAP envelope before forwarding
        soap_body = (f"<Envelope><Body><{self.service_name}Request>"
                     f"<path>{request.path}</path>"
                     f"</{self.service_name}Request></Body></Envelope>")
        payload = {"source": self.service_name, "protocol": "SOAP",
                   "translated_body": soap_body}
        return Response(200, json.dumps(payload))


# --------------- Service Registry ---------------

class ServiceRegistry:
    def __init__(self):
        self._routes: dict[str, ServiceAdapter] = {}

    def register(self, path: str, adapter: ServiceAdapter) -> None:
        self._routes[path] = adapter

    def get_adapter(self, path: str) -> ServiceAdapter | None:
        # Exact match first, then prefix match
        if path in self._routes:
            return self._routes[path]
        for route, adapter in self._routes.items():
            if path.startswith(route):
                return adapter
        return None


# --------------- Middleware (Chain of Responsibility) ---------------

# A handler is anything callable that takes a Request and returns a Response
Handler = Callable[[Request], Response]


class Middleware(ABC):
    @abstractmethod
    def process(self, request: Request, next_handler: Handler) -> Response:
        ...


class AuthMiddleware(Middleware):
    def process(self, request: Request, next_handler: Handler) -> Response:
        if "Authorization" not in request.headers:
            return Response(401, "Unauthorized: missing Authorization header")
        return next_handler(request)


class RateLimitMiddleware(Middleware):
    def __init__(self, max_requests: int = 5, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._counts: dict[str, list[float]] = {}

    def process(self, request: Request, next_handler: Handler) -> Response:
        client = request.headers.get("Authorization", "anonymous")
        now = time.time()
        timestamps = self._counts.setdefault(client, [])
        # Evict old timestamps outside the window
        timestamps[:] = [t for t in timestamps if now - t < self.window_seconds]
        if len(timestamps) >= self.max_requests:
            return Response(429, "Rate limit exceeded")
        timestamps.append(now)
        return next_handler(request)


class LoggingMiddleware(Middleware):
    def process(self, request: Request, next_handler: Handler) -> Response:
        print(f" [LOG] {request.method} {request.path}")
        response = next_handler(request)
        print(f" [LOG] Response status: {response.status_code}")
        return response


class CacheMiddleware(Middleware):
    def __init__(self):
        self._cache: dict[str, Response] = {}

    def process(self, request: Request, next_handler: Handler) -> Response:
        if request.method == "GET":
            cache_key = request.path
            if cache_key in self._cache:
                print(f" [CACHE] Hit for {cache_key}")
                return self._cache[cache_key]
            response = next_handler(request)
            if response.status_code == 200:
                self._cache[cache_key] = response
            return response
        return next_handler(request)


# --------------- Response Aggregator (Facade Pattern) ---------------

class ResponseAggregator:
    def aggregate(self, paths: list[str], gateway: "Gateway",
                  base_request: Request) -> Response:
        results = {}
        for path in paths:
            req = Request(path=path, method="GET",
                          headers=dict(base_request.headers))
            resp = gateway.handle(req)
            if resp.status_code == 200:
                try:
                    results[path] = json.loads(resp.body)
                except json.JSONDecodeError:
                    results[path] = resp.body
            else:
                # Partial failure: keep the error alongside the successful data
                results[path] = {"error": resp.body, "status": resp.status_code}
        return Response(200, json.dumps(results, indent=2))


# --------------- Gateway (Proxy Pattern) ---------------

class Gateway:
    def __init__(self):
        self.registry = ServiceRegistry()
        self._middlewares: list[Middleware] = []

    def add_middleware(self, middleware: Middleware) -> None:
        self._middlewares.append(middleware)

    def handle(self, request: Request) -> Response:
        # Build the middleware chain from the inside out.
        # The innermost handler forwards to the actual service adapter.
        def forward(req: Request) -> Response:
            adapter = self.registry.get_adapter(req.path)
            if adapter is None:
                return Response(404, f"No service registered for {req.path}")
            return adapter.handle(req)

        handler = forward
        for mw in reversed(self._middlewares):
            # Capture mw in a closure properly
            handler = self._wrap(mw, handler)
        return handler(request)

    @staticmethod
    def _wrap(mw: Middleware, next_handler: Handler) -> Handler:
        def wrapped(req: Request) -> Response:
            return mw.process(req, next_handler)
        return wrapped


if __name__ == "__main__":
    # 1. Create gateway and register services with different adapters
    gw = Gateway()
    gw.registry.register("/users", RestAdapter("UserService"))
    gw.registry.register("/payments", RpcAdapter("PaymentService"))
    gw.registry.register("/inventory", LegacyAdapter("InventoryService"))

    # 2. Add middleware chain: logging -> auth -> rate_limit -> cache
    # Execution order: logging first, then auth, then rate limit, then cache
    gw.add_middleware(LoggingMiddleware())
    gw.add_middleware(AuthMiddleware())
    gw.add_middleware(RateLimitMiddleware(max_requests=4, window_seconds=60))
    gw.add_middleware(CacheMiddleware())

    print("=== Test 1: Authenticated GET /users (REST adapter) ===")
    resp = gw.handle(Request(path="/users", method="GET",
                             headers={"Authorization": "Bearer token123"}))
    print(f" Status: {resp.status_code}")
    print(f" Body: {resp.body}\n")

    print("=== Test 2: Unauthenticated request (rejected by auth) ===")
    resp = gw.handle(Request(path="/users", method="GET"))
    print(f" Status: {resp.status_code}")
    print(f" Body: {resp.body}\n")

    print("=== Test 3: Authenticated GET /payments (RPC adapter) ===")
    resp = gw.handle(Request(path="/payments", method="GET",
                             headers={"Authorization": "Bearer token123"}))
    print(f" Status: {resp.status_code}")
    print(f" Body: {resp.body}\n")

    print("=== Test 4: Authenticated GET /inventory (Legacy SOAP adapter) ===")
    resp = gw.handle(Request(path="/inventory", method="GET",
                             headers={"Authorization": "Bearer token123"}))
    print(f" Status: {resp.status_code}")
    print(f" Body: {resp.body}\n")

    print("=== Test 5: Cache hit on repeated GET /users ===")
    resp = gw.handle(Request(path="/users", method="GET",
                             headers={"Authorization": "Bearer token123"}))
    print(f" Status: {resp.status_code}")
    print(f" Body: {resp.body}\n")

    print("=== Test 6: Rate limit exceeded (5th authenticated request, limit is 4) ===")
    resp = gw.handle(Request(path="/payments", method="GET",
                             headers={"Authorization": "Bearer token123"}))
    print(f" Status: {resp.status_code}")
    print(f" Body: {resp.body}\n")

    print("=== Test 7: Facade aggregation - /users + /payments in one call ===")
    aggregator = ResponseAggregator()
    # Use a fresh gateway without rate limiting for a clean aggregation demo
    gw2 = Gateway()
    gw2.registry.register("/users", RestAdapter("UserService"))
    gw2.registry.register("/payments", RpcAdapter("PaymentService"))
    gw2.add_middleware(LoggingMiddleware())
    gw2.add_middleware(AuthMiddleware())
    gw2.add_middleware(CacheMiddleware())
    base = Request(path="/", headers={"Authorization": "Bearer token123"})
    combined = aggregator.aggregate(["/users", "/payments"], gw2, base)
    print(f" Status: {combined.status_code}")
    print(f" Combined body:\n{combined.body}")
Common Mistakes
- ✗ Putting business logic in the gateway. The gateway routes and guards; business logic belongs in the services.
- ✗ Hardcoding service URLs. Use a registry so services can be added, removed, or moved without code changes.
- ✗ Running middleware in the wrong order. Rate limiting before auth means unauthenticated users consume your rate limit budget.
- ✗ Not handling partial failures in aggregation. If one of three services fails, the facade should return partial data, not crash.
Key Points
- ✓ The Gateway is a proxy: same interface as the backend services, but with cross-cutting concerns layered in.
- ✓ Adapters let you integrate REST, RPC, and legacy SOAP services without the gateway knowing the difference.
- ✓ The middleware chain is ordered: in the demo, logging runs first, then auth, then rate limiting, then the cache check, then the forward to the adapter.
- ✓ Facade aggregation means clients make one call instead of orchestrating three service calls themselves.